Article Information

  • Title: Named Entity Recognition using Word Embedding as a Feature
  • Authors: Miran Seok; Hye-Jeong Song; Chan-Young Park
  • Journal: International Journal of Software Engineering and Its Applications
  • Print ISSN: 1738-9984
  • Year of publication: 2016
  • Volume: 10
  • Issue: 2
  • Pages: 93-104
  • DOI: 10.14257/ijseia.2016.10.2.08
  • Publisher: SERSC
  • Abstract: This study used word embeddings as features for training a named entity recognition (NER) model, with conditional random fields (CRF) as the learning algorithm. Named entities are phrases that contain the names of persons, organizations, and locations, and recognizing them in text is one of the important tasks of information extraction. Word embedding, which maps each word in a sentence to a real-valued vector in a low-dimensional space, is helpful in many NLP learning algorithms. We used GloVe, Word2Vec, and CCA as the embedding methods. The Reuters Corpus Volume 1 was used to train the word embeddings, and the English corpus of the CoNLL 2003 shared task was used for training and testing. Comparing the NER performance of the embedding methods, CCA (85.96%) performed best on Test A and Word2Vec (80.72%) performed best on Test B. Using word embeddings as NER features yields better results than a baseline that does not use them. To check that the word embeddings performed well, we also ran an additional experiment that calculated the similarity between words.
  • Keywords: Natural Language Processing; Named Entity Recognition; Word Embedding
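
The abstract describes two ideas that a short sketch may make concrete: feeding word embeddings to a CRF as token features, and checking embeddings by measuring similarity between words. The Python below is only an illustration under stated assumptions, not the authors' implementation. The abstract does not say how the vectors were encoded as features, so one real-valued feature per dimension is assumed here. The toy vectors, the baseline features, the use of cosine similarity, and the names `TOY_EMBEDDINGS`, `UNKNOWN`, `token_features`, and `cosine_similarity` are all hypothetical.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only; the study
# trained its embeddings on the Reuters Corpus Volume 1.
TOY_EMBEDDINGS = {
    "london": [0.9, 0.1, 0.0],
    "paris": [0.8, 0.2, 0.1],
    "said": [0.0, 0.1, 0.9],
}
UNKNOWN = [0.0, 0.0, 0.0]  # assumed fallback for out-of-vocabulary words


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def token_features(tokens, i, embeddings=TOY_EMBEDDINGS):
    """Feature dict for token i: a few baseline features plus one feature per embedding dimension."""
    word = tokens[i]
    features = {
        "lower": word.lower(),
        "is_title": word.istitle(),
        "is_first": i == 0,
    }
    # Assumption: each embedding dimension becomes its own real-valued feature.
    vector = embeddings.get(word.lower(), UNKNOWN)
    for d, value in enumerate(vector):
        features[f"emb_{d}"] = value
    return features
```

A list of such dictionaries, one per token, is the input shape that common CRF toolkits accept. In this toy data, `cosine_similarity` scores "london" closer to "paris" than to "said", which is the kind of sanity check the abstract's similarity experiment suggests.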