Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2016
Volume: 84
Issue: 3
Publisher: Journal of Theoretical and Applied
Abstract: There is a need to retrieve and extract important information in order to fully understand the ever-increasing volume of English-translated Islamic documents available on the web. Research on Named Entity Recognition (NER) for Islamic translations is limited, even though NER has received widespread attention in other languages. Translated named entities have their own characteristics, and the available annotated English corpora do not cover all transliterated Arabic names, which makes NER on translated texts difficult in the Islamic domain. This research addressed the use of NER in English translations of Hadith texts. Its objective was to design and develop a model able to extract named entities from English translations of Hadith texts. The research used supervised machine learning approaches, namely Support Vector Machine (SVM), Maximum Entropy Classifier (ME) and Naive Bayes (NB), which were then combined via a majority voting algorithm to identify named entities in Hadith texts. In the results, the voting combination approaches outperformed the single classifiers, with an overall F-measure of 95.3% in identifying named entities. The results indicated that combined models paired with suitable features were better suited to recognizing named entities in translated Hadith texts than the baseline models.
Keywords: Named Entity Recognition; supervised machine learning; Hadith text
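
The majority-voting combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the tag set, the features, or how ties are resolved, so the following are assumptions for illustration only: the function name `majority_vote`, the tag names `PERSON`, `LOCATION` and `O`, the sample predictions, and the rule that ties go to the classifier listed first. This is not the authors' implementation.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-token labels from several classifiers by majority vote.

    predictions[i][j] is the label classifier i assigned to token j.
    Ties go to the classifier listed first (an assumption; the abstract
    does not say how ties are handled).
    """
    if not predictions:
        return []
    n_tokens = len(predictions[0])
    if any(len(p) != n_tokens for p in predictions):
        raise ValueError("all classifiers must label the same number of tokens")

    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # The first label, in classifier order, that reaches the top count wins.
        winner = next(label for label in labels if counts[label] == top)
        combined.append(winner)
    return combined


# Hypothetical per-token outputs from three classifiers (illustrative only).
svm_tags = ["O", "PERSON", "O", "LOCATION"]
me_tags = ["O", "PERSON", "O", "O"]
nb_tags = ["O", "O", "O", "LOCATION"]

voted = majority_vote([svm_tags, me_tags, nb_tags])
print(voted)
```

With three voters and more than two possible tags, a three-way disagreement is possible, which is why the sketch makes its tie-breaking rule explicit rather than leaving it to dictionary ordering.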