Article information

  • Title: Topic-sensitive multi-document summarization algorithm
  • Authors: Na, Liu; Di, Tang; Ying, Lu
  • Journal: Computer Science and Information Systems
  • Print ISSN: 1820-0214
  • Electronic ISSN: 2406-1018
  • Publication year: 2015
  • Volume: 12
  • Issue: 4
  • Pages: 1375-1389
  • DOI: 10.2298/CSIS140815060N
  • Publisher: ComSIS Consortium
  • Abstract: Latent Dirichlet Allocation (LDA) has recently been used to generate topics from text corpora. However, not all estimated topics are equally important or correspond to genuine themes of the domain; some are collections of irrelevant words or represent insignificant themes. This paper proposes a topic-sensitive algorithm for multi-document summarization. The algorithm uses an LDA model and a weighted linear combination strategy to identify significant topics, which are then used in calculating sentence weights. Each topic is measured by three different LDA criteria, and topic significance is evaluated by combining these criteria through a weighted linear combination. In addition to topic features, the proposed approach also considers statistical features such as term frequency, sentence position, and sentence length. It thus retains the advantages of statistical features while working together with the topic model. Experiments show that the proposed algorithm outperforms other state-of-the-art algorithms on the DUC2002 corpus.
  • Keywords: multi-document summarization; LDA; topic model; weighted linear combination
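
The weighted linear combination described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the three LDA criteria, the weights, or the way topic and statistical features are merged, so the function names (`min_max`, `topic_significance`, `sentence_score`), the min-max normalization, and the mixing parameter `alpha` are all assumptions made for illustration.

```python
from typing import List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1] so criteria on different scales are comparable.

    Normalization is an assumption; the abstract does not say how the
    criteria are scaled before they are combined.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def topic_significance(
    criteria: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine several per-topic criteria into one significance score per topic.

    criteria[i][k] is the (hypothetical) score of topic k under criterion i;
    weights[i] is the weight given to criterion i.
    """
    normalized = [min_max(c) for c in criteria]
    n_topics = len(criteria[0])
    return [
        sum(w * col[k] for w, col in zip(weights, normalized))
        for k in range(n_topics)
    ]


def sentence_score(
    topic_mix: Sequence[float],
    significance: Sequence[float],
    stat_features: Sequence[float],
    stat_weights: Sequence[float],
    alpha: float = 0.5,
) -> float:
    """Mix a topic-based score with statistical features for one sentence.

    topic_mix is the sentence's distribution over topics; stat_features could
    hold normalized term frequency, sentence position, and sentence length.
    alpha is a hypothetical trade-off parameter between the two parts.
    """
    topic_part = sum(p * s for p, s in zip(topic_mix, significance))
    stat_part = sum(w * f for w, f in zip(stat_weights, stat_features))
    return alpha * topic_part + (1 - alpha) * stat_part


# Three made-up criteria scored over three topics.
criteria = [[0.2, 0.8, 0.5], [1.0, 3.0, 2.0], [10.0, 30.0, 20.0]]
significance = topic_significance(criteria, weights=[0.5, 0.3, 0.2])

# One sentence: its topic mixture plus three statistical features.
score = sentence_score(
    topic_mix=[0.1, 0.7, 0.2],
    significance=significance,
    stat_features=[0.6, 1.0, 0.4],
    stat_weights=[0.5, 0.3, 0.2],
)
```

In this sketch, sentences dominated by low-significance topics receive a small topic-based score, which is one way a summarizer could discount topics that are collections of irrelevant words. A summary would then be built from the highest-scoring sentences.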