Article Information

  • Title: Measuring Topic Coherence through Optimal Word Buckets
  • Authors: Nitin Ramrakhiyani; Sachin Pawar; Swapnil Hingmire
  • Venue: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
  • Year: 2017
  • Volume: 2017
  • Pages: 437-442
  • Language: English
  • Publisher: ACL Anthology
  • Abstract: Measuring topic quality is essential for scoring the learned topics and their subsequent use in information retrieval and text classification. To measure the quality of Latent Dirichlet Allocation (LDA) based topics learned from text, we propose a novel approach based on grouping topic words into buckets (TBuckets). A single large bucket signifies a single coherent theme, in turn indicating high topic coherence. TBuckets uses word embeddings of topic words and employs singular value decomposition (SVD) and Integer Linear Programming based optimization to create coherent word buckets. TBuckets outperforms the state-of-the-art techniques when evaluated on three publicly available datasets and on a further dataset proposed in this paper.
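
The bucketing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `bucket_coherence`, the rule of assigning each word to the singular vector it aligns with most strongly, and the use of "fraction of words in the largest bucket" as the score are all assumptions made for illustration, and the Integer Linear Programming variant is omitted. The toy vectors stand in for real word embeddings.

```python
import numpy as np

def bucket_coherence(embeddings: np.ndarray) -> tuple[float, np.ndarray]:
    """Toy SVD-based bucketing (illustrative assumption, not the paper's code).

    embeddings: array of shape (n_words, dim), one row per topic word.
    Returns (score, bucket index per word).
    """
    # Normalise rows so projections compare directions, not magnitudes.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Rows of vt are orthonormal right singular vectors; each acts as a "theme".
    _, _, vt = np.linalg.svd(unit, full_matrices=False)
    # Assign every word to the theme with the largest absolute projection.
    buckets = np.abs(unit @ vt.T).argmax(axis=1)
    # Assumed coherence proxy: share of words that land in the largest bucket.
    score = float(np.bincount(buckets).max() / len(buckets))
    return score, buckets

# Hypothetical 3-dimensional "embeddings" for two toy topics.
coherent_topic = np.array([
    [1.0, 0.1, 0.0],
    [1.0, -0.1, 0.0],
    [1.0, 0.0, 0.1],
    [1.0, 0.0, -0.1],
    [1.0, 0.0, 0.0],
])
mixed_topic = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

coherent_score, _ = bucket_coherence(coherent_topic)
mixed_score, mixed_buckets = bucket_coherence(mixed_topic)
print(coherent_score, mixed_score)
```

In this sketch a topic whose words all point in roughly the same direction fills a single bucket, while a topic that mixes several directions splits across buckets and scores lower, which mirrors the abstract's claim that one large bucket indicates a coherent theme.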