期刊名称:International Journal of Electrical and Computer Engineering
电子版ISSN:2088-8708
出版年度:2020
卷号:10
期号:3
页码:3244-3252
DOI:10.11591/ijece.v10i3.pp3244-3252
出版社:Institute of Advanced Engineering and Science (IAES)
摘要:N-gram based language models are very popular and extensively used statistical methods for solving various natural language processing problems including grammar checking. Smoothing is one of the most effective techniques used in building a language model to deal with data sparsity problem. Kneser-Ney is one of the most prominently used and successful smoothing technique for language modelling. In our previous work, we presented a Witten-Bell smoothing based language modelling technique for checking grammatical correctness of Bangla sentences which showed promising results outperforming previous methods. In this work, we proposed an improved method using Kneser-Ney smoothing based n-gram language model for grammar checking and performed a comparative performance analysis between Kneser-Ney and Witten-Bell smoothing techniques for the same purpose. We also provided an improved technique for calculating the optimum threshold which further enhanced the the results. Our experimental results show that, Kneser-Ney outperforms Witten-Bell as a smoothing technique when used with n-gram LMs for checking grammatical correctness of Bangla sentences.
关键词:Grammar checking;Language models;Natural language processing;N-grams;Smoothing