Journal title: International Journal of Computer Science Issues
Print ISSN: 1694-0784
Electronic ISSN: 1694-0814
Publication year: 2014
Volume: 11
Issue: 1
Publisher: IJCSI Press
Abstract: Named Entity Recognition (NER) and classification are becoming increasingly important in many natural language processing applications. They help a machine recognize named entities in text and assign them to the appropriate categories. NER for Telugu is a challenging task because Telugu is morphologically very rich. Recent systems rely on machine learning approaches, but their performance depends heavily on the size and quality of the training data. In this paper we propose a rule-based Named Entity Recognition and Classification system for the Telugu language. We describe the identification and classification of named entities using word-level features, word lookup features, and contextual features. Further classification of the identified named entities and ambiguity resolution are done through contextual rules and syntactic information. The system is tested on different data sets drawn from newspaper text and the Teluguwiki corpus.
Keywords: Heuristics; Named Entity; Gazetteers; Morphology.
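
Illustrative sketch (not taken from the paper): the abstract names three feature types, word-level features, lookup (gazetteer) features, and contextual features, with contextual rules used to resolve ambiguity. The Python below shows one way such a pipeline could be combined. The gazetteer entries, suffix list, cue words, and the function name `tag` are all hypothetical, and romanized tokens stand in for Telugu script; the paper's actual rules and resources are not given in the abstract.

```python
# Toy resources, assumed for illustration only (not from the paper).
GAZETTEER = {                      # lookup feature: word -> candidate category
    "hyderabad": "LOCATION",
    "godavari": "LOCATION",
    "ramarao": "PERSON",
}
LOCATION_SUFFIXES = ("bad", "puram", "nagar")  # word-level feature (assumed list)
PERSON_CUES = {"sri", "dr"}        # contextual feature: cue word before a name


def tag(tokens):
    """Assign a category to each token; 'O' means not a named entity."""
    tagged = []
    for i, token in enumerate(tokens):
        word = token.lower()
        previous = tokens[i - 1].lower() if i > 0 else ""

        # 1. Lookup feature: is the word in the gazetteer?
        label = GAZETTEER.get(word)

        # 2. Word-level feature: fall back to a suffix heuristic.
        if label is None and word.endswith(LOCATION_SUFFIXES):
            label = "LOCATION"

        # 3. Contextual rule: a preceding honorific overrides the
        #    candidate, resolving a location/person ambiguity.
        if previous in PERSON_CUES:
            label = "PERSON"

        tagged.append((token, label or "O"))
    return tagged


if __name__ == "__main__":
    print(tag(["Sri", "Godavari", "visited", "Hyderabad"]))
```

In this sketch the contextual rule is applied last so that it can override a gazetteer hit, which is one simple reading of "ambiguity resolution through contextual rules"; a real system for a morphologically rich language would also need to strip inflectional suffixes before lookup.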