Article Information

  • Title: Matrix-Based Method for Inferring Elements in Data Attributes Using a Vector Space Model
  • Authors: Teruaki Hayashi ; Yukio Ohsawa
  • Journal: Information
  • Electronic ISSN: 2078-2489
  • Publication year: 2019
  • Volume: 10
  • Issue: 3
  • Page: 107
  • DOI: 10.3390/info10030107
  • Publisher: MDPI Publishing
  • Abstract: This article addresses the task of inferring elements in the attributes of data. Extracting data related to our interests is a challenging task. Although data on the web can be accessed through free text queries, it is difficult to obtain results that accurately correspond to user intentions because users might not express their objects of interest using exact terms (variables, outlines of data, etc.) found in the data. In other words, users do not always have sufficient knowledge of the data to formulate an effective query. Hence, we propose a method that enables the type, format, and variable elements to be inferred as attributes of data when a natural language summary of the data is provided as a free text query. To evaluate the proposed method, we used the Data Jacket’s datasets whose metadata is written in natural language. The experimental results indicate that our method outperforms those obtained from string matching and word embedding. Applications based on this study can support users who wish to retrieve or acquire new data.
  • Keywords: data jacket ; variable label ; metadata ; natural language processing ; market of data ; vector space model