Basic Article Information

Title: A Biomedical Named Entity Recognition Using Machine Learning Classifiers and Rich Feature Set
Authors: Ahmed Sultan Al-Hegami; Ameen Mohammed Farea Othman; Fuad Tarbosh Bagash; et al.
Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Year of publication: 2017
Volume: 17
Issue: 1
Pages: 170-176
Publisher: International Journal of Computer Science and Network Security
Abstract: As the wealth of biomedical knowledge in the form of literature increases, there is a rising need for effective natural language processing tools to assist in organizing, curating, and retrieving this information. To that end, named entity recognition (the task of identifying words and phrases in free text that belong to certain classes of interest) is an important first step for many of these larger information management goals. The task becomes more difficult in a specialized domain, since the entities are specific to that domain. In recent years, much attention has been focused on the problem of recognizing mentions of genes, proteins, and other biomedical entities in biomedical abstracts. This study therefore aims to design and develop a biomedical named entity recognition model. A machine learning classification framework is proposed based on Naïve Bayes, K-Nearest Neighbour, and decision tree classifiers. We performed several experiments to empirically compare different subsets of features and the three classification approaches (Naïve Bayes, K-Nearest Neighbour, and decision tree) for biomedical named entity recognition. The aim is to integrate different feature sets and classification algorithms efficiently to obtain a more accurate classification procedure. The results show that the K-Nearest Neighbour classifier, trained with suitable features, is better suited to recognizing named entities in biomedical texts than the other models.
Keywords: Named entity recognition (NER); learning; classification; framework; decision tree; recognizing gene; Naïve Bayes; K-Nearest Neighbour.
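
The abstract describes classifying tokens from feature sets with, among others, a K-Nearest Neighbour classifier. A minimal sketch of that general idea follows. The five orthographic features, the Hamming distance, the toy tokens, the `GENE`/`O` labels, and all function names are illustrative assumptions; this record does not specify the paper's actual features, data, or parameters.

```python
from collections import Counter


def token_features(token):
    """Binary orthographic features for one token (illustrative, not the paper's set)."""
    return (
        int(token[:1].isupper()),                    # starts with a capital letter
        int(any(ch.isdigit() for ch in token)),      # contains a digit
        int("-" in token),                           # contains a hyphen
        int(token.isupper() and len(token) > 1),     # every cased letter is upper case
        int(any(ch.isupper() for ch in token[1:])),  # capital after the first character
    )


def hamming(a, b):
    """Number of positions at which two feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, token, k=3):
    """Label a token by majority vote among its k nearest training examples."""
    feats = token_features(token)
    nearest = sorted(train, key=lambda ex: hamming(ex[0], feats))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy training data: GENE marks a gene/protein mention, O anything else.
TOY_TOKENS = [
    ("IL-2", "GENE"), ("p53", "GENE"), ("BRCA1", "GENE"), ("NF-kB", "GENE"),
    ("the", "O"), ("binds", "O"), ("expression", "O"), ("cells", "O"),
]
TRAIN = [(token_features(tok), label) for tok, label in TOY_TOKENS]

if __name__ == "__main__":
    for tok in ["TNF-alpha", "protein", "CD4"]:
        print(tok, knn_predict(TRAIN, tok))
```

A realistic system would use richer features such as context words, affixes, and part-of-speech tags, and would be evaluated on an annotated corpus; the sketch only shows how a feature vector and a nearest-neighbour vote fit together.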