Article Information

Title: Relevance Feedback using Surface and Latent Information in Texts
Authors: Jun Harashima; Sadao Kurohashi
Journal: Information and Media Technologies
Electronic ISSN: 1881-0896
Publication year: 2014
Volume: 9
Issue: 4
Pages: 814-833
DOI: 10.11185/imt.9.814
Publisher: Information and Media Technologies Editorial Board
Abstract: Most relevance feedback methods re-rank search results using only the information of surface words in texts. We present a method that uses not only the information of surface words but also that of latent words that are inferred from texts. We infer the latent word distribution of each document in the search results using latent Dirichlet allocation (LDA). When feedback is given, we also infer the latent word distribution of the feedback using LDA. We calculate the similarity between the user feedback and each document in the search results using both the surface and latent word distributions, and re-rank the search results on the basis of these similarities. Evaluation results show that when user feedback consisting of two documents (3,589 words) is given, the proposed method improves the initial search results by 27.6% in precision at 10 (P@10). The results also show that the proposed method performs well even when only a small amount of user feedback is available. For example, an improvement of 5.3% in P@10 was achieved when the user feedback consisted of only 57 words.
Keywords: Information Retrieval; Relevance Feedback; Latent Dirichlet Allocation
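
The re-ranking idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the similarity measure or how the two distributions are combined, so the linear interpolation weight `lam` and the cosine similarity below are assumptions, as are all function names. LDA inference is not performed here; the per-document topic mixtures (`topic_mix`) and the topic-word matrix (`topic_word`) are taken as given inputs that would, in practice, come from a trained LDA model.

```python
import math
from collections import Counter


def surface_dist(tokens, vocab):
    """Relative frequency of each vocabulary word among the observed tokens."""
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocab)
    return [counts[w] / total if total else 0.0 for w in vocab]


def latent_dist(topic_mix, topic_word):
    """Latent word distribution: p(w) = sum over topics z of p(z|d) * p(w|z)."""
    n_words = len(topic_word[0])
    return [
        sum(topic_mix[z] * topic_word[z][w] for z in range(len(topic_mix)))
        for w in range(n_words)
    ]


def mix(surface, latent, lam):
    """Assumed combination: linear interpolation of the two distributions."""
    return [(1 - lam) * s + lam * l for s, l in zip(surface, latent)]


def cosine(p, q):
    """Assumed similarity measure (the abstract does not name one)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def rerank(feedback, docs, vocab, topic_word, lam=0.5):
    """Return document ids ordered by similarity to the feedback, best first."""
    fb = mix(surface_dist(feedback["tokens"], vocab),
             latent_dist(feedback["topic_mix"], topic_word), lam)
    scored = []
    for doc_id, doc in docs.items():
        d = mix(surface_dist(doc["tokens"], vocab),
                latent_dist(doc["topic_mix"], topic_word), lam)
        scored.append((cosine(fb, d), doc_id))
    # Stable sort: ties keep the initial search-result order.
    scored.sort(key=lambda pair: -pair[0])
    return [doc_id for _, doc_id in scored]


# Toy data (hypothetical): two topics over a four-word vocabulary.
vocab = ["cat", "dog", "stock", "market"]
topic_word = [[0.5, 0.5, 0.0, 0.0],   # topic 0: animals
              [0.0, 0.0, 0.5, 0.5]]   # topic 1: finance
feedback = {"tokens": ["cat"], "topic_mix": [0.9, 0.1]}
docs = {
    "b": {"tokens": ["stock"], "topic_mix": [0.1, 0.9]},
    "a": {"tokens": ["dog"], "topic_mix": [0.9, 0.1]},
}

surface_only = rerank(feedback, docs, vocab, topic_word, lam=0.0)
with_latent = rerank(feedback, docs, vocab, topic_word, lam=0.5)
```

In this toy setup the feedback shares no surface word with either document, so with `lam=0.0` both similarities are zero and the initial order is kept. With `lam=0.5`, document "a" moves ahead of "b" because its latent word distribution overlaps with that of the feedback, which is the kind of effect the abstract attributes to latent words when feedback is short.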