Journal: International Journal on Computer Science and Engineering
Print ISSN: 2229-5631
Online ISSN: 0975-3397
Year: 2011
Volume: 3
Issue: 1
Pages: 410-422
Publisher: Engg Journals Publications
Abstract: Entity extraction is considered a fundamental step in many text mining applications, such as machine translation, text summarization, and text categorization. However, the major challenge in extracting an entity from a sentence is ambiguity, namely lexical ambiguity. While a human can easily resolve the intended meaning from his or her own knowledge, it is very difficult for a machine to do so. This paper proposes a new technique for resolving the ambiguity problem through a fuzzy approach and context knowledge. The technique integrates subject and lexical knowledge, possibility theory, and fuzzy sets into natural language processing. Lexical knowledge was obtained from WordNet, while subject and lexical knowledge have been deployed as context knowledge. Possibility theory and fuzzy sets were applied to select the most possible meaning of an ambiguous entity based on the context. The work was restricted to the noun part of speech. The technique was implemented and tested on 1110 sentences, with precision and recall used as the evaluation metrics. The obtained precision is 85.7% and the recall is 80.3%. The results indicate that the proposed technique is successful.
Keywords: natural language processing; ambiguity; context knowledge; fuzzy approach; information extraction