Article Information

Title: Keyword Extraction from a Document using Word Co-occurrence Statistical Information
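Authors: Yutaka Matsuo; Mitsuru Ishizuka
Journal: 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
Print ISSN: 1346-0714
Online ISSN: 1346-8030
Publication year: 2002
Volume: 17
Issue: 3
Pages: 217-223
DOI: 10.1527/tjsai.17.217
Publisher: The Japanese Society for Artificial Intelligence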
Abstract: We present a new keyword extraction algorithm that applies to a single document without using a large corpus. Frequent terms are extracted first; then, for each term, the set of its co-occurrences with the frequent terms (i.e., occurrences in the same sentence) is generated. The co-occurrence distribution indicates the importance of a term in the document: if the probability distribution of co-occurrence between a term a and the frequent terms is biased toward a particular subset of the frequent terms, then term a is likely to be a keyword. The degree of bias is measured by the χ² measure. We show that our algorithm performs well for indexing technical papers.
Keywords: keyword extraction; word co-occurrence; χ² test
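
The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made here for illustration only: the regex-based sentence splitting and tokenization, the absence of stop-word removal and stemming, the function name `chi_square_keywords`, the parameters `num_frequent` and `top_k`, and the estimate of the expected co-occurrence probability of each frequent term g as its share of all observed co-occurrences. For each term w, the score is the sum over frequent terms g of (observed - expected)² / expected, where expected = n_w · p_g and n_w is the total number of co-occurrences of w with the frequent terms.

```python
import re
from collections import Counter, defaultdict


def chi_square_keywords(text, num_frequent=10, top_k=5):
    """Rank terms by how biased their sentence-level co-occurrence with
    the document's frequent terms is (chi-square statistic).

    Illustrative sketch only: tokenization, the choice of frequent terms,
    and the expected-probability estimate are simplifying assumptions.
    """
    # Naive sentence split on ., !, ? and lowercase alphabetic tokens.
    sentences = [re.findall(r"[a-z]+", s.lower())
                 for s in re.split(r"[.!?]+", text)]
    sentences = [s for s in sentences if s]

    # Step 1: pick the most frequent terms of this single document.
    term_freq = Counter(w for s in sentences for w in s)
    frequent = [w for w, _ in term_freq.most_common(num_frequent)]
    frequent_set = set(frequent)

    # Step 2: cooc[w][g] = number of sentences containing both w and g.
    cooc = defaultdict(Counter)
    for s in sentences:
        unique = set(s)
        for w in unique:
            for g in unique & frequent_set:
                if g != w:
                    cooc[w][g] += 1

    # Expected probability p_g: share of all co-occurrences involving g
    # (an assumption of this sketch).
    col_totals = Counter()
    for row in cooc.values():
        col_totals.update(row)
    grand_total = sum(col_totals.values())

    # Step 3: chi-square score of each term's co-occurrence distribution.
    scores = {}
    for w, row in cooc.items():
        n_w = sum(row.values())  # total co-occurrences of w
        score = 0.0
        for g in frequent:
            if g == w:
                continue
            expected = n_w * col_totals[g] / grand_total
            if expected > 0:
                score += (row[g] - expected) ** 2 / expected
        scores[w] = score

    # Highest score = most biased distribution = likeliest keyword.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```

Calling `chi_square_keywords(document_text, num_frequent=10, top_k=5)` returns up to five `(term, score)` pairs in descending score order. On real text, common function words would dominate the frequent-term set unless stop words are filtered beforehand, which this sketch leaves out for brevity.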