Basic Article Information

Title: Utilization Of Cross-Terms To Enhance The Language Model For Information Retrieval
Authors: Huda Mohammed Barakat; Maizatul Akmar Ismail; Sri Devi Ravana et al.
Journal: Malaysian Journal of Computer Science
Print ISSN: 0127-9084
Year of publication: 2013
Volume: 26
Issue: 3
Publisher: University of Malaya, Faculty of Computer Science and Information Technology
Abstract: Traditional retrieval models were effective in the early stage of the Web; however, with the huge amount of information available on the Web today, further optimization is required to enhance the performance of these models in extracting the most relevant information. Utilization of term proximity is one of the techniques that many researchers have introduced for this purpose. It assumes that the words in the user query are correlated and thus that the proximity between them should be considered in the matching process. Density-based proximity is an effective type of term proximity measure that is still not fully considered in retrieval models. In this paper we investigate the application of a recent density-based measure called Cross-Terms, which has achieved significant scores when applied to the effective BM25 retrieval model. We applied cross-terms to another effective retrieval model, the Language Modeling Approach. The performance of the enhanced language model was measured and evaluated through several experiments and metrics. Experimental results show that the cross-terms measure improved the performance of the basic language model on all the applied evaluation metrics. The improvement reached +4% on the MAP metric and +8% on the P@5 and P@20 metrics.
Keywords: information retrieval; cross-terms; kernel; proximity; language model