
Article information

  • Title: A New Term-ranking Approach that Supports Improved Searching in Literature Digital Libraries
  • Authors: Sulieman Bani-Ahmad; Ghadeer Al-Dweik
  • Journal: Research Journal of Information Technology
  • Print ISSN: 1815-7432
  • Electronic ISSN: 2151-7959
  • Year: 2011
  • Volume: 3
  • Issue: 1
  • Pages: 44-52
  • DOI: 10.3923/rjit.2011.44.52
  • Publisher: Academic Journals Inc., USA
  • Abstract: In example-based searching, users look for the set of publications most similar to a given one. This requires estimating similarities between publications. A tf.idf formula, e.g., the Okapi BM25 formula, can be used to compute publication-to-publication text-based similarity. Studies show that augmenting the importance of search terms in the BM25 formula improves similarity scores. To this end, we introduce a term-ranking technique and use it to improve publication similarity scores. The proposed term-ranking algorithm is a slight modification of the TextRank algorithm, which uses the well-known PageRank algorithm to identify the important terms/phrases within texts. The proposed approach uses the length of sentences, rather than a fixed window size, to identify links between terms. We found experimentally that the proposed approach works well when paired with Okapi BM25.
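
The term-ranking idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that two terms are linked whenever they occur in the same sentence (the sentence acting as the window), that sentences are split naively on ., ! and ?, and that an unweighted PageRank-style update with a damping factor of 0.85 is run for a fixed number of iterations. The names split_sentences, build_graph and rank_terms are hypothetical, and the abstract does not specify how the resulting ranks enter the BM25 formula, so that step is left out.

```python
import re
from collections import defaultdict
from itertools import combinations


def split_sentences(text):
    # Naive sentence splitter (assumption): break on '.', '!' and '?'.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphabetic tokens only (assumption).
    return re.findall(r"[a-z]+", sentence.lower())


def build_graph(text, stopwords=frozenset()):
    # Undirected co-occurrence graph: terms sharing a sentence are linked,
    # so the window is the sentence itself rather than a fixed token count.
    graph = defaultdict(set)
    for sentence in split_sentences(text):
        terms = {t for t in tokenize(sentence) if t not in stopwords}
        for term in terms:
            graph[term]  # make sure isolated terms still appear as nodes
        for a, b in combinations(sorted(terms), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def rank_terms(text, damping=0.85, iterations=50, stopwords=frozenset()):
    # PageRank-style iteration over the term graph, as in TextRank.
    graph = build_graph(text, stopwords)
    scores = {term: 1.0 for term in graph}
    for _ in range(iterations):
        updated = {}
        for term, neighbours in graph.items():
            incoming = sum(scores[n] / len(graph[n]) for n in neighbours)
            updated[term] = (1.0 - damping) + damping * incoming
        scores = updated
    return scores


if __name__ == "__main__":
    sample = "Alpha beta. Beta gamma. Beta delta."
    ranked = sorted(rank_terms(sample).items(), key=lambda kv: -kv[1])
    for term, score in ranked:
        print(f"{term}\t{score:.3f}")
```

In the sample text, "beta" shares a sentence with every other term, so it receives the highest rank. One plausible use, again an assumption, is to scale each query term's BM25 contribution by its rank.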