Journal: The International Arab Journal of Information Technology
Print ISSN: 1683-3198
Year: 2011
Volume: 8
Issue: 4
Publisher: Zarqa Private University
Abstract: Text tagging is an important tool for many applications in natural language processing, including the morphological and syntactic analysis of texts, indexing and information retrieval, "vocalization" of Arabic texts, and probabilistic language models (n-class models). However, such systems are based on a limited set of lexemes and are consequently unable to handle unknown words. To overcome this problem, this paper develops a new system based on the patterns of unknown words and the hidden Markov model. The experiments use a set of labeled texts, a set of 3800 patterns, and 52 morpho-syntactic tags to estimate the parameters of the new HMM.
Keywords: Hidden Markov model; morpho-syntactic tagging; Arabic text; pattern