Basic article information

  • Title: NLP Web Services for Slovene and English: Morphosyntactic Tagging, Lemmatisation and Definition Extraction
  • Authors: Senja Pollak ; Nejc Trdin ; Anže Vavpetič ; Tomaž Erjavec
  • Journal: Informatica
  • Print ISSN: 1514-8327
  • Electronic ISSN: 1854-3871
  • Publication year: 2012
  • Volume: 36
  • Issue: 4
  • Publisher: The Slovene Society Informatika, Ljubljana
  • Abstract: This paper presents a web service for automatic linguistic annotation of Slovene and English texts. The web service accepts text uploads in a number of different input formats, then converts, tokenises, tags and lemmatises the text, and returns the annotated text. The paper presents the ToTrTaLe annotation tool and the implementation of the annotation workflow in two workflow construction environments, Orange4WS and ClowdFlows. It also proposes several improvements to the annotation tool, based on the identification of various types of errors made by the existing ToTrTaLe tool, and implements these improvements as a post-processing step in the workflow. The workflows enable users to incorporate the annotation service as an elementary constituent of other natural language processing workflows, as demonstrated by the definition extraction use case.
  • Keywords: web services; workflows; morphosyntactic tagging; lemmatisation; definition extraction