
Article Information

  • Title: A Self-Aggregated Hierarchical Topic Model for Short Texts
  • Authors: Yue Niu; Hongjie Zhang
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Year of publication: 2021
  • Volume: 11
  • Issue: 12
  • Language: English
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: With the growth of the internet, short texts such as tweets from Twitter, news titles from RSS feeds, or comments from Amazon have become very prevalent. Many tasks need to retrieve information hidden in the content of short texts, so ontology learning methods have been proposed for retrieving structured information. A topic hierarchy is a typical ontology that consists of concepts and taxonomy relations between concepts. Current hierarchical topic models are not specifically designed for short texts. These methods use word co-occurrence to construct concepts and general-special word relations to construct taxonomy topics. But in short texts, word co-occurrence is sparse and general-special word relations are lacking. To overcome these two problems and provide an interpretable result, we designed a hierarchical topic model that aggregates short texts into long documents and then constructs topics and relations. Because long documents add additional semantic information, our model can avoid the sparsity of word co-occurrence. In experiments, we measured the quality of concepts with a topic coherence metric on four real-world short-text corpora. The results showed that our topic hierarchy is more interpretable than those produced by other methods.
  • Keywords: Hierarchical Topic Model; Texts Analysis; Short Texts; Data Mining