Article Information

  • Title: Automated Extraction of Statistical Expressions from Text for Information Compilation
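  • Authors: Tatsunori MORI; Atsushi FUJIOKA; Ichiro MURATA
  • Journal: 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
  • Print ISSN: 1346-0714
  • Online ISSN: 1346-8030
  • Year of publication: 2008
  • Volume: 23
  • Issue: 5
  • Pages: 310-318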
  • DOI:10.1527/tjsai.23.310
  • Publisher: The Japanese Society for Artificial Intelligence
  • Abstract: To summarize and visualize trend information in documents, we need a method for automatically extracting statistical information from them. In this paper, we investigate the automated extraction of statistical information, in particular expressions that name statistical information. First, we classify these expressions into three categories: the action type, the attribute type, and the definition type. Second, we examine their internal structures. Based on these structures, we define an XML tag set for annotating each part of the name of a statistic. As a feasibility study of automated extraction, we conducted an experiment in which the parts of names of statistics were extracted using a standard chunking algorithm. The experimental results show that the parts defined by the tag set can be extracted with good accuracy when a training corpus from a domain similar to that of the target documents is available. When no such training corpus can be prepared, however, extraction accuracy degrades.
  • Keywords: MuST (Multimodal Summarization for Trend Information); statistical expressions; information extraction