Article Information

  • Title: Comparative Analysis of IDF Methods to Determine Word Relevance in Web Document
  • Author: Jitendra Nath Singh
  • Journal: International Journal of Computer Science Issues
  • Print ISSN: 1694-0784
  • Electronic ISSN: 1694-0814
  • Publication year: 2014
  • Volume: 11
  • Issue: 1
  • Publisher: IJCSI Press
  • Abstract: Inverse document frequency (IDF) is one of the most useful and widely used concepts in information retrieval. Combined with term frequency (TF), it yields a very effective term weighting scheme (TF-IDF) that is applied in information retrieval to determine the weight of terms. A term with a high TF-IDF value has a strong relationship with the document it appears in; if that term appears in a query, the document is likely to be of interest to the user. Term frequency is computed as the number of occurrences of a term in a document, whereas IDF can be measured in various ways; one of the best-known derivations follows from the Robertson-Spärck Jones relevance weight. Other methods of computing inverse document frequency also exist, and the choice affects the relevance assigned to a document. In this paper, we discuss and compare different derivations of inverse document frequency for measuring the weight of terms.
  • Keywords: Information Retrieval; Term Frequency; IDF; Vector Space Model
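
The abstract describes TF-IDF weighting and notes that IDF can be derived in more than one way. The following is a minimal Python sketch of that idea, not the paper's own method: it assumes raw-count term frequency, the classic logarithmic IDF log(N/n), and the form the Robertson-Spärck Jones weight takes when no relevance information is available, log((N - n + 0.5)/(n + 0.5)). The function names and the toy corpus are hypothetical and chosen only for illustration; the record does not list which IDF variants the paper actually compares.

```python
import math
from collections import Counter


def term_frequency(term, doc_tokens):
    """Raw count of `term` in a tokenized document."""
    return Counter(doc_tokens)[term]


def document_frequency(term, corpus):
    """Number of documents in `corpus` that contain `term`."""
    return sum(1 for doc in corpus if term in doc)


def idf_classic(term, corpus):
    """Classic IDF: log(N / n). Returns 0.0 for unseen terms (an assumption)."""
    n = document_frequency(term, corpus)
    if n == 0:
        return 0.0
    return math.log(len(corpus) / n)


def idf_rsj(term, corpus):
    """RSJ-derived IDF without relevance information:
    log((N - n + 0.5) / (n + 0.5)). Can be zero or negative for common terms."""
    total = len(corpus)
    n = document_frequency(term, corpus)
    return math.log((total - n + 0.5) / (n + 0.5))


def tf_idf(term, doc_tokens, corpus, idf=idf_classic):
    """TF-IDF weight of `term` in one document, using the chosen IDF variant."""
    return term_frequency(term, doc_tokens) * idf(term, corpus)


# Hypothetical toy corpus: four tokenized documents.
corpus = [
    "web document retrieval".split(),
    "web search engine".split(),
    "term weighting in retrieval".split(),
    "inverse document frequency".split(),
]

for term in ("web", "engine"):
    print(term, idf_classic(term, corpus), idf_rsj(term, corpus))
```

In this toy corpus "web" occurs in half of the documents, so the RSJ-derived variant assigns it a weight of zero while the classic variant still gives it a positive weight. This is one way the choice of IDF derivation can change which documents rank as relevant.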