
Article information

  • Title: Augmenting Linguistic Semi-Structured Data for Machine Learning - A Case Study using FrameNet
  • Authors: Breno W. S. R. Carvalho; Aline Paes; Bernardo Gonçalves
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Publication year: 2020
  • Volume: 10
  • Issue: 12
  • Pages: 1-13
  • DOI: 10.5121/csit.2020.101201
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: Semantic Role Labelling (SRL) is the process of automatically finding the semantic roles of terms in a sentence. It is an essential step towards creating a machine-meaningful representation of textual information. One public linguistic resource commonly used for this task is the FrameNet Project. FrameNet is a human- and machine-readable lexical database containing a considerable number of annotated sentences; those annotations link sentence fragments to semantic frames. However, while the annotations across all the documents covered in the dataset link to most of the frames, a large group of frames has no annotations in the documents pointing to them. In this paper, we present a data augmentation method for FrameNet documents that increases the total number of annotations by over 13%. Our approach relies on lexical, syntactic, and semantic aspects of the sentences to provide additional annotations. We evaluate the proposed augmentation method by comparing the performance of a state-of-the-art semantic-role-labelling system trained on a dataset with and without augmentation.
  • Keywords: FrameNet; Frame Semantic Parsing; Semantic Role Labelling; Data Augmentation