
Article Information

  • Title: Eigennamenerkennung zwischen morphologischer Analyse und Part-of-Speech Tagging: ein automatentheoriebasierter Ansatz
  • Authors: Jörg Didakowski ; Alexander Geyken ; Thomas Hanneforth
  • Journal: Zeitschrift für Sprachwissenschaft
  • Print ISSN: 0721-9067
  • Electronic ISSN: 1613-3706
  • Publication year: 2007
  • Volume: 26
  • Issue: 2
  • Pages: 157-186
  • DOI: 10.1515/ZFS.2007.016
  • Publisher: Walter de Gruyter GmbH
  • Abstract: Previous rule-based approaches to Named Entity Recognition (NER) in German perform NER on Part-of-Speech tagged texts. We present a new approach in which NER is situated between morphological analysis and Part-of-Speech tagging, and in which the NER grammar is modeled entirely with weighted finite-state transducers (WFSTs). We show that NER strategies such as the resolution of proper noun/common noun or company-name/family-name ambiguities can be formulated as a best-path function of a WFST. The frequently used second-pass resolution of coreferential Named Entities can be formulated as a re-assignment of appropriate weights. A prototypical NE recognition system built on WFSTs and large lexical resources was tested on a manually annotated corpus of 65,000 tokens. The results show that our system is comparable in recall and precision to existing rule-based approaches.
  • Keywords: Named Entity Recognition ; weighted finite state transducers ; large lexical resources
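
The abstract's two central ideas, ambiguity resolution as a best-path computation and second-pass coreference resolution as weight re-assignment, can be illustrated with a minimal sketch. This is not the authors' system: the toy lattice, the tag names (`TITLE`, `NN`, `NE`, `VV`), all cost values, and the helper names `best_path` and `reweight` are assumptions made for illustration. Costs are combined as in the tropical semiring (added along a path, minimized across paths).

```python
from typing import Dict, List, Tuple

# Hypothetical toy lattice: one dict per token, mapping each candidate
# reading to a cost (lower is better).
Lattice = List[Dict[str, float]]


def best_path(lattice: Lattice,
              trans: Dict[Tuple[str, str], float],
              default: float = 1.0) -> Tuple[float, List[str]]:
    """Return the cheapest tag sequence; path cost is the sum of
    reading costs and tag-to-tag transition costs."""
    # best[tag] = (cost, path) of the cheapest path ending in `tag`
    best = {tag: (cost, [tag]) for tag, cost in lattice[0].items()}
    for readings in lattice[1:]:
        nxt = {}
        for tag, cost in readings.items():
            candidates = [
                (pc + trans.get((ptag, tag), default) + cost, ppath + [tag])
                for ptag, (pc, ppath) in best.items()
            ]
            nxt[tag] = min(candidates, key=lambda c: c[0])
        best = nxt
    return min(best.values(), key=lambda c: c[0])


def reweight(lattice: Lattice, tokens: List[str],
             known_names: set, bonus: float = 1.0) -> Lattice:
    """Second pass (assumed scheme): make the NE reading cheaper for
    tokens already recognized as names elsewhere in the text."""
    out = []
    for token, readings in zip(tokens, lattice):
        readings = dict(readings)  # copy, leave the input untouched
        if token in known_names and "NE" in readings:
            readings["NE"] -= bonus
        out.append(readings)
    return out


# First pass: "Herr Müller". The title makes the proper-noun reading cheap.
trans = {("TITLE", "NE"): 0.0}
first = best_path([{"TITLE": 0.0}, {"NN": 0.5, "NE": 1.0}], trans)

# Later sentence: "Müller lacht". Without context the common noun wins.
tokens = ["Müller", "lacht"]
later = [{"NN": 0.5, "NE": 1.0}, {"VV": 0.0}]
before = best_path(later, trans)

# After re-assigning weights for the known name, the NE reading wins.
after = best_path(reweight(later, tokens, {"Müller"}), trans)
```

In this toy setup, the first pass selects the proper-noun reading for "Müller" after a title; the isolated later occurrence defaults to the common noun until the weights are re-assigned. A real WFST toolkit would express the same computation as composition followed by a shortest-path operation.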