Basic article information

  • Title: Levels of Annotation in the Slovene Training Corpus ssj500k 2.2
  • Authors: Mija Bon ; Polona Gantar
  • Journal: Journal of Linguistics/Jazykovedný časopis
  • Print ISSN: 0021-5597
  • Year of publication: 2019
  • Volume: 70
  • Issue: 2
  • Pages: 390-399
  • DOI: 10.2478/jazcas-2019-0068
  • Publisher: Walter de Gruyter GmbH
  • Abstract: This paper presents the Slovene Training Corpus ssj500k 2.2, which has been annotated on the levels of tokenization, sentence segmentation, part-of-speech tagging, lemmatization, syntactic dependencies, named entities, verbal multi-word expressions, and semantic role labeling. It describes the individual layers of annotation and shows the scope of using the training corpus in the production of various lexicons, such as the lexicon of multi-word units and the valency lexicon of modern Slovene. It concludes by presenting our future work, i.e. the annotation of multi-word expressions based on the Slovene Lexical Database.
  • Keywords: corpus linguistics ; training corpus ; corpus annotation ; Slovene language