Journal: Journal of King Saud University - Computer and Information Sciences
Print ISSN: 1319-1578
Publication year: 2017
Volume: 29
Issue: 2
Pages: 147-155
DOI: 10.1016/j.jksuci.2017.01.004
Publisher: Elsevier
Abstract: In this paper, we propose a method to disambiguate the output of a morphological analyzer for the Tunisian dialect. We test three machine-learning techniques that classify each morphological analysis of a word token into one of two classes, true or false. The class label is assigned to each analysis according to the context of the corresponding word in its sentence. In failure cases, we combine the results of the proposed techniques with a bigram classifier to choose a single analysis for a given word. We apply the method to the output of Al-Khalil-TUN, a morphological analyzer for the Tunisian dialect (Zribi et al., 2013b), and use the Spoken Tunisian Arabic Corpus, STAC (Zribi et al., 2015), to train and test it. The evaluation shows that the proposed method achieves an accuracy of 87.32%.
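
The selection step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not define a "failure case", so the sketch assumes it means that the true/false classifier accepts zero or several analyses for a word. The function `choose_analysis`, the toy classifier, the toy bigram table, and the tag names are all hypothetical.

```python
from typing import Callable, Sequence


def choose_analysis(
    analyses: Sequence[str],
    prev_analysis: str,
    is_correct: Callable[[str, str], bool],
    bigram_score: Callable[[str, str], float],
) -> str:
    """Pick one analysis for a word, given the analysis chosen for the previous word."""
    if not analyses:
        raise ValueError("no candidate analyses")
    # Binary classification: label each candidate analysis true or false in context.
    accepted = [a for a in analyses if is_correct(prev_analysis, a)]
    if len(accepted) == 1:
        return accepted[0]
    # Assumed failure case: zero or several analyses were labelled true.
    # Fall back to a bigram score over the remaining candidates.
    candidates = accepted or list(analyses)
    return max(candidates, key=lambda a: bigram_score(prev_analysis, a))


# Hypothetical stand-ins for a trained classifier and a bigram model.
ACCEPTED_PAIRS = {("DET", "NOUN"), ("NOUN", "VERB"), ("NOUN", "ADJ")}
BIGRAMS = {("NOUN", "VERB"): 0.6, ("NOUN", "ADJ"): 0.3, ("VERB", "NOUN"): 0.5}


def toy_classifier(prev: str, analysis: str) -> bool:
    return (prev, analysis) in ACCEPTED_PAIRS


def toy_bigram(prev: str, analysis: str) -> float:
    return BIGRAMS.get((prev, analysis), 0.0)


if __name__ == "__main__":
    # Exactly one analysis accepted: no fallback needed.
    print(choose_analysis(["NOUN", "VERB"], "DET", toy_classifier, toy_bigram))
    # Two analyses accepted: the bigram score breaks the tie.
    print(choose_analysis(["VERB", "ADJ"], "NOUN", toy_classifier, toy_bigram))
    # No analysis accepted: the bigram score ranks all candidates.
    print(choose_analysis(["NOUN", "ADV"], "VERB", toy_classifier, toy_bigram))
```

In this sketch the bigram score is consulted only when the classifier alone does not single out one analysis, which mirrors the combination described in the abstract; how the paper actually weights or combines the two models is not stated in this record.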