Basic article information

  • Title: A Topological Method for Comparing Document Semantics
  • Authors: Yuqi Kong; Fanchao Meng; Ben Carterette
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Publication year: 2020
  • Volume: 10
  • Issue: 14
  • Pages: 143-151
  • DOI: 10.5121/csit.2020.101411
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: Comparing document semantics is one of the toughest tasks in both Natural Language Processing and Information Retrieval. To date, on one hand, the tools for this task are still rare. On the other hand, most relevant methods are devised from the statistic or the vector space model perspectives but nearly none from a topological perspective. In this paper, we hope to make a different sound. A novel algorithm based on topological persistence for comparing semantics similarity between two documents is proposed. Our experiments are conducted on a document dataset with human judges’ results. A collection of state-of-the-art methods are selected for comparison. The experimental results show that our algorithm can produce highly human-consistent results, and also beats most state-of-the-art methods though ties with NLTK.
  • Keywords: Topological Graph; Document Semantics Comparison; Natural Language Processing; Information Retrieval; Topological Persistence