Article Information

  • Title: An Investigative Design Based Statistical Approach for Determining Bangla Sentence Validity
  • Authors: Md. Riazur Rahman; Md. Tarek Habib; Md. Sadekur Rahman
  • Journal: International Journal of Computer Science and Network Security
  • Print ISSN: 1738-7906
  • Year of publication: 2016
  • Volume: 16
  • Issue: 11
  • Pages: 30-37
  • Publisher: International Journal of Computer Science and Network Security
  • Abstract: Automatic grammatical verification of sentences is an essential task in natural language processing, yet resources for this task are scarce in Bangla. To address this, the paper presents a new n-gram-based statistical approach for checking the syntactic and semantic correctness of Bangla sentences. The proposed method uses a probabilistic language model built from n-gram frequency counts, combining standard n-gram statistics with smoothing and a backoff language model to judge the validity of any Bangla sentence. A new Bangla corpus of 10 million words is used to train the model. The system was tested on both valid and invalid sentences collected separately from the training corpus. In detecting correct and incorrect sentences, the proposed system achieved 82% precision and 81% recall, outperforming existing systems.
  • Keywords: sentence validity detection; natural language processing; n-gram; smoothing; backoff strategy; language model
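
The abstract describes scoring a sentence with an n-gram count model that falls back to shorter contexts when a longer n-gram is unseen. The following is a minimal sketch of that general idea, not the authors' method: it swaps in "stupid backoff" with an add-one smoothed unigram floor, since the record does not say which smoothing or backoff scheme the paper uses. The function names, the discount `alpha=0.4`, and the toy English corpus are all assumptions made for illustration.

```python
from collections import Counter
import math


def train(sentences, n=3):
    """Count every 1..n-gram over tokenized sentences padded with boundary markers."""
    counts = Counter()
    for tokens in sentences:
        padded = ["<s>"] * (n - 1) + tokens + ["</s>"]
        for i in range(len(padded)):
            for k in range(1, n + 1):
                if i + k <= len(padded):
                    counts[tuple(padded[i:i + k])] += 1
    return counts


def score(word, context, counts, total, vocab_size, alpha=0.4):
    """Stupid-backoff score of `word` given `context` (a tuple of preceding tokens)."""
    if not context:
        # Add-one smoothed unigram estimate, so unseen words never score zero.
        return (counts[(word,)] + 1) / (total + vocab_size + 1)
    ngram = context + (word,)
    if counts[ngram] > 0:
        # Seen n-gram: relative frequency within its context.
        return counts[ngram] / counts[context]
    # Unseen n-gram: drop the oldest context token and apply a fixed discount.
    return alpha * score(word, context[1:], counts, total, vocab_size, alpha)


def sentence_log_score(tokens, counts, n=3, alpha=0.4):
    """Length-normalised log score of a sentence; higher means more corpus-like."""
    total = sum(c for gram, c in counts.items() if len(gram) == 1)
    vocab_size = sum(1 for gram in counts if len(gram) == 1)
    padded = ["<s>"] * (n - 1) + tokens + ["</s>"]
    log_sum = 0.0
    for i in range(n - 1, len(padded)):
        context = tuple(padded[i - n + 1:i])
        log_sum += math.log(score(padded[i], context, counts, total, vocab_size, alpha))
    return log_sum / (len(tokens) + 1)


# Toy corpus (hypothetical; the paper trains on a 10-million-word Bangla corpus).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "the cat ate the fish".split(),
]
counts = train(corpus)

valid_score = sentence_log_score("the cat sat on the rug".split(), counts)
scrambled_score = sentence_log_score("rug the on sat cat the".split(), counts)
print(valid_score, scrambled_score)
```

In a sketch like this, a sentence would be flagged as invalid when its normalised score falls below a cutoff tuned on held-out valid and invalid sentences; that cutoff is likewise an assumption, as the record does not describe the paper's decision rule.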