Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Year: 2012
Volume: 44
Issue: 1
Pages: 001-006
Publisher: Journal of Theoretical and Applied
Abstract: Named entity recognition (NER) systems aim to automatically identify and classify proper nouns in text. NER systems play a significant role in many areas of Natural Language Processing (NLP), such as question answering, text summarization and information retrieval. Unlike previous Arabic NER systems, which were built to extract named entities from general Arabic text, our task involves extracting named entities from crime documents. Extracting named entities from crime text provides basic information for crime analysis. This paper presents a rule-based Arabic NER system for the crime domain. Several syntactic rules and patterns for Arabic NER are induced and then formalized from three sources: morphological information, predefined crime and general indicator lists, and an annotated Arabic named entity corpus from the crime domain. These rules and patterns are then applied to identify and classify named entities in Arabic crime text. The system achieves an accuracy of 90%, which indicates that the method is effective and that the system's performance is satisfactory.
Keywords: Natural Language Processing; Named Entity Recognition; Arabic Crime Documents
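
The indicator-list idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's rule set: the indicator words, stop tokens, span limit and the function name `extract_entities` are all hypothetical, chosen only to show how a trigger word can assign a class to the tokens that follow it. The morphological analysis the abstract mentions is omitted entirely.

```python
# Hypothetical indicator list: trigger word -> entity class.
# The paper's actual crime and general indicator lists are not given in the abstract.
INDICATORS = {
    "المتهم": "PERSON",    # "the accused"
    "مدينة": "LOCATION",   # "city"
}

# Hypothetical tokens that end an entity span.
STOP_TOKENS = {"في", "و", "."}   # "in", "and", full stop

MAX_ENTITY_TOKENS = 2  # assumed upper bound on entity length


def extract_entities(tokens):
    """Tag the tokens that follow an indicator word with that indicator's class."""
    entities = []
    i = 0
    while i < len(tokens):
        label = INDICATORS.get(tokens[i])
        if label is None:
            i += 1
            continue
        # Collect following tokens until a stop token, another indicator,
        # the span limit, or the end of the sentence.
        span = []
        j = i + 1
        while (
            j < len(tokens)
            and len(span) < MAX_ENTITY_TOKENS
            and tokens[j] not in STOP_TOKENS
            and tokens[j] not in INDICATORS
        ):
            span.append(tokens[j])
            j += 1
        if span:
            entities.append((" ".join(span), label))
        i = j  # resume scanning after the collected span
    return entities


if __name__ == "__main__":
    # "The police arrested the accused Ahmed Ali in the city of Cairo"
    sentence = "اعتقلت الشرطة المتهم أحمد علي في مدينة القاهرة"
    for text, label in extract_entities(sentence.split()):
        print(label, text)
```

Whitespace tokenization and a fixed span limit are simplifications. A fuller system along the lines of the abstract would presumably combine such triggers with morphological cues and patterns induced from the annotated corpus.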