Article Information

  • Title: Keyword Extraction from a Document using Word Co-occurrence Statistical Information
  • Authors: Yutaka Matsuo; Mitsuru Ishizuka
  • Journal: 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
  • Print ISSN: 1346-0714
  • Online ISSN: 1346-8030
  • Year: 2002
  • Volume: 17
  • Issue: 3
  • Pages: 217-223
  • DOI: 10.1527/tjsai.17.217
  • Publisher: The Japanese Society for Artificial Intelligence
  • Abstract: We present a new keyword extraction algorithm that applies to a single document without using a large corpus. Frequent terms are extracted first; then a set of co-occurrences between each term and the frequent terms, i.e., occurrences in the same sentences, is generated. The distribution of co-occurrence indicates the importance of a term in the document as follows: if the probability distribution of co-occurrence between term a and the frequent terms is biased toward a particular subset of the frequent terms, then term a is likely to be a keyword. The degree of bias of the distribution is measured by the χ² measure. We show that our algorithm performs well for indexing technical papers.
  • Keywords: keyword extraction; word co-occurrence; χ² test
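
The procedure outlined in the abstract (extract frequent terms, count sentence-level co-occurrence, score the bias with χ²) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `chi_square_keywords`, the parameters `num_frequent` and `top_k`, the regex-based sentence and word splitting, and the way the expected probability of each frequent term is estimated (its share of all co-occurrence counts) are assumptions made here for illustration; the abstract does not specify them, and the sketch omits any preprocessing such as stop-word removal.

```python
import re
from collections import Counter, defaultdict


def chi_square_keywords(text, num_frequent=3, top_k=5):
    """Rank terms by how biased their co-occurrence with frequent terms is."""
    # Assumption: sentences end at '.', '!' or '?'; terms are lowercase
    # alphabetic tokens. The abstract does not define either.
    sentences = [re.findall(r"[a-z]+", s.lower())
                 for s in re.split(r"[.!?]+", text)]
    sentences = [s for s in sentences if s]

    # Step 1: extract the most frequent terms.
    term_freq = Counter(t for s in sentences for t in s)
    frequent = [t for t, _ in term_freq.most_common(num_frequent)]

    # Step 2: for every term w, count the sentences it shares with each
    # frequent term g.
    cooc = defaultdict(Counter)
    for s in sentences:
        present = set(s)
        for w in present:
            for g in frequent:
                if g in present and g != w:
                    cooc[w][g] += 1

    # Assumption: the expected probability of g is its share of all
    # co-occurrence counts. This is a simplification chosen for the sketch.
    col_total = Counter()
    for row in cooc.values():
        col_total.update(row)
    grand_total = sum(col_total.values())
    if grand_total == 0:
        return []

    # Step 3: chi-square statistic of observed vs. expected co-occurrence.
    scores = {}
    for w, row in cooc.items():
        n_w = sum(row.values())
        score = 0.0
        for g in frequent:
            if g == w:
                continue  # a term is not compared with itself
            expected = n_w * col_total[g] / grand_total
            if expected > 0:
                score += (row[g] - expected) ** 2 / expected
        scores[w] = score

    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
```

Calling `chi_square_keywords(document_text)` returns a list of `(term, score)` pairs; a higher score means the term's co-occurrence is concentrated on a narrower subset of the frequent terms, which the abstract treats as a sign that the term is a keyword.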