Journal: International Journal of Computer Science Issues
Print ISSN: 1694-0784
Online ISSN: 1694-0814
Publication year: 2014
Volume: 11
Issue: 1
Publisher: IJCSI Press
Abstract: Inverse document frequency (IDF) is one of the most useful and widely used concepts in information retrieval. Combined with term frequency (TF), it yields a very effective term weighting scheme (TF-IDF) that is applied in information retrieval to determine the weight of terms. A term with a high TF-IDF value has a strong relationship with the document it appears in; if that term appears in a query, the document is likely to be of interest to the user. Term frequency is computed as the number of occurrences of a term in a document, whereas IDF can be measured in various ways; one of the best-known derivations follows from the Robertson-Spärck Jones relevance weight. Besides this best-known method, there are various other methods for computing inverse document frequency, and the choice of method affects the relevance assigned to a document. In this paper, we discuss and compare different derivations of inverse document frequency for measuring the weight of terms.
Keywords: Information Retrieval; Term Frequency; IDF; Vector space model.
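
The abstract describes TF as a raw occurrence count and mentions that IDF has several derivations, but this record does not reproduce the paper's formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses the common textbook form log(N / n) and the form log((N - n + 0.5) / (n + 0.5)) that the Robertson-Spärck Jones weight reduces to when no relevance information is available. All function names and the toy corpus are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def term_frequency(term, doc_tokens):
    """Raw TF: number of occurrences of `term` in one tokenized document."""
    return Counter(doc_tokens)[term]


def document_frequency(term, corpus):
    """Number of documents in `corpus` that contain `term` at least once."""
    return sum(1 for doc in corpus if term in doc)


def idf_classic(term, corpus):
    """Assumed textbook variant: log(N / n). Returns 0.0 for unseen terms."""
    n = document_frequency(term, corpus)
    if n == 0:
        return 0.0
    return math.log(len(corpus) / n)


def idf_rsj(term, corpus):
    """Assumed RSJ-derived variant without relevance information:
    log((N - n + 0.5) / (n + 0.5)). Negative when a term occurs in
    more than half of the documents."""
    total = len(corpus)
    n = document_frequency(term, corpus)
    return math.log((total - n + 0.5) / (n + 0.5))


def tf_idf(term, doc_tokens, corpus, idf=idf_classic):
    """TF-IDF weight of `term` in one document, with a pluggable IDF."""
    return term_frequency(term, doc_tokens) * idf(term, corpus)


# Hypothetical toy corpus of three tokenized documents.
corpus = [
    ["inverse", "document", "frequency"],
    ["term", "frequency", "term", "weight"],
    ["vector", "space", "model"],
]

for name, idf in (("classic", idf_classic), ("rsj", idf_rsj)):
    weight = tf_idf("term", corpus[1], corpus, idf=idf)
    print(f"{name}: tf-idf('term', doc 1) = {weight:.4f}")
```

Passing the IDF function as a parameter keeps the TF part fixed while the IDF derivation is swapped, which mirrors the kind of comparison the abstract describes.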