Article information

  • Title: Clustering and Diversifying Web Search Results with Graph-Based Word Sense Induction
  • Authors: Antonio Di Marco ; Roberto Navigli
  • Journal: Computational Linguistics
  • Print ISSN: 0891-2017
  • Electronic ISSN: 1530-9312
  • Publication year: 2013
  • Volume: 39
  • Issue: 3
  • Pages: 709-754
  • DOI: 10.1162/COLI_a_00148
  • Language: English
  • Publisher: MIT Press
  • Abstract: Web search result clustering aims to facilitate information search on the Web. Rather than the results of a query being presented as a flat list, they are grouped on the basis of their similarity and subsequently shown to the user as a list of clusters. Each cluster is intended to represent a different meaning of the input query, thus taking into account the lexical ambiguity (i.e., polysemy) issue. Existing Web clustering methods typically rely on some shallow notion of textual similarity between search result snippets, however. As a result, text snippets with no word in common tend to be clustered separately even if they share the same meaning, whereas snippets with words in common may be grouped together even if they refer to different meanings of the input query. In this article we present a novel approach to Web search result clustering based on the automatic discovery of word senses from raw text, a task referred to as Word Sense Induction. Key to our approach is to first acquire the various senses (i.e., meanings) of an ambiguous query and then cluster the search results based on their semantic similarity to the word senses induced. Our experiments, conducted on data sets of ambiguous queries, show that our approach outperforms both Web clustering and search engines.