Basic article information

  • Title: Efficient Hybrid Semantic Text Similarity using Wordnet and a Corpus
  • Authors: Issa Atoum; Ahmed Otoom
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Electronic ISSN: 2156-5570
  • Publication year: 2016
  • Volume: 7
  • Issue: 9
  • DOI: 10.14569/IJACSA.2016.070917
  • Publisher: Science and Information Society (SAI)
  • Abstract: Text similarity plays an important role in natural language processing tasks such as answering questions and summarizing text. At present, state-of-the-art text similarity algorithms rely on inefficient word pairings and/or knowledge derived from large corpora such as Wikipedia. This article evaluates previous word similarity measures on benchmark datasets and then uses a hybrid word similarity measure in a novel text similarity measure (TSM). The proposed TSM is based on information content and WordNet semantic relations. TSM includes exact word match, the length of both sentences in a pair, and the maximum similarity between one word and the compared text. Compared with other well-known measures, the results of TSM surpass or are comparable to those of the best algorithms in the literature.
  • Keywords: thesai; IJACSA; thesai.org; journal; IJACSA papers; text similarity; distributional similarity; information content; knowledge-based similarity; corpus-based similarity; WordNet