Article information

  • Title: OGER++: hybrid multi-type entity recognition
  • Authors: Lenz Furrer ; Anna Jancso
  • Journal: Journal of Cheminformatics
  • Print ISSN: 1758-2946
  • Electronic ISSN: 1758-2946
  • Publication year: 2019
  • Volume: 11
  • Issue: 1
  • Pages: 7
  • DOI: 10.1186/s13321-018-0326-3
  • Language: English
  • Publisher: BioMed Central
  • Abstract: We present a text-mining tool for recognizing biomedical entities in scientific literature. OGER++ is a hybrid system for named entity recognition and concept recognition (linking), which combines a dictionary-based annotator with a corpus-based disambiguation component. The annotator uses an efficient look-up strategy combined with a normalization method for matching spelling variants. The disambiguation classifier is implemented as a feed-forward neural network which acts as a postfilter to the previous step. We evaluated the system in terms of processing speed and annotation quality. In the speed benchmarks, the OGER++ web service processes 9.7 abstracts or 0.9 full-text documents per second. On the CRAFT corpus, we achieved 71.4% and 56.7% F1 for named entity recognition and concept recognition, respectively. Combining knowledge-based and data-driven components allows creating a system with competitive performance in biomedical text mining.
  • Keywords: Named entity recognition ; Concept recognition ; Natural language processing ; Machine learning
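
The abstract describes a dictionary-based annotator that matches spelling variants through normalization. The following is a minimal sketch of that general idea, not OGER++'s actual implementation: the normalization rule (lowercasing and splitting into letter and digit runs), the greedy longest-match lookup, the function names, and the `DEMO:` identifiers are all assumptions made for illustration. The neural postfilter mentioned in the abstract is not sketched here.

```python
import re

# Assumed normalization: split into runs of letters or digits, lowercased,
# so that "IL-2", "IL 2" and "il2" all map to the key ("il", "2").
TOKEN = re.compile(r"[A-Za-z]+|[0-9]+")


def normalize(term):
    """Return a tuple of lowercased letter/digit tokens for a term."""
    return tuple(m.group().lower() for m in TOKEN.finditer(term))


def build_index(dictionary):
    """Map each normalized term to the set of concept IDs it may denote."""
    index = {}
    for term, concept_id in dictionary.items():
        index.setdefault(normalize(term), set()).add(concept_id)
    return index


def annotate(text, index):
    """Greedy longest-match lookup; returns (surface, start, end, ids) tuples."""
    tokens = [(m.group().lower(), m.start(), m.end())
              for m in TOKEN.finditer(text)]
    max_len = max((len(key) for key in index), default=0)
    hits = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate window first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tok for tok, _, _ in tokens[i:i + n])
            if key in index:
                start, end = tokens[i][1], tokens[i + n - 1][2]
                hits.append((text[start:end], start, end, sorted(index[key])))
                i += n
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return hits


# Hypothetical toy dictionary; the identifiers are made up for this example.
demo_index = build_index({"interleukin-2": "DEMO:1", "IL 2": "DEMO:1"})
hits = annotate("Interleukin 2 and IL-2 act on cells.", demo_index)
```

In this sketch a term that maps to several concept IDs keeps all of them, which is the kind of ambiguity a downstream disambiguation step would then have to resolve.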