Publisher: The Japanese Society for Artificial Intelligence
Abstract: Learning to rank is a supervised learning problem whose goal is to construct a ranking model. In recent years, online learning to rank algorithms have begun to attract attention because large-scale datasets have become available. We propose a selective pairwise approach to online learning to rank that offers both fast learning and high performance. The basic strategy of our method is, for each query drawn from the training data, to select the document pair that is most effective for minimizing the objective function, and then to update the current weight vector using only the selected pair instead of all document pairs in the query. The main characteristics of our method are adaptive margin rescaling based on the approximated NDCG, which reflects the IR evaluation measure; the max-loss update procedure; and ramp loss, which reduces overfitting. Finally, we implement our proposal, PARank-NDCG, in the framework of the Passive-Aggressive algorithm. We conduct experiments on the MSLR-WEB datasets, which contain 10,000 and 30,000 queries, respectively. Our experiments show that PARank-NDCG outperforms conventional algorithms in NDCG, including online learning to rank algorithms such as Stochastic Pairwise Descent and Committee Perceptron and batch algorithms such as RankingSVM. In addition, our method takes only 7 seconds to learn a model on the MSLR-WEB10K dataset. PARank-NDCG offers approximately 63 times faster training than RankingSVM on average.
Keywords: information retrieval; learning to rank; online learning
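
The selective pairwise strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `pa_selective_update`, the aggressiveness parameter `C`, the swap-based NDCG difference used as the margin, and the PA-I style step size are all assumptions made for illustration, and the ramp loss mentioned in the abstract is omitted. For one query, the sketch picks the single document pair with the largest margin-rescaled hinge loss (max-loss) and updates the weight vector with that pair only.

```python
import numpy as np


def dcg_gain(rel, rank):
    """DCG contribution of a document with label `rel` at 0-based position `rank`."""
    return (2.0 ** rel - 1.0) / np.log2(rank + 2.0)


def pa_selective_update(w, X, y, C=1.0):
    """One hypothetical selective pairwise update for a single query.

    w: weight vector (d,), X: document features (n, d), y: relevance labels (n,).
    Assumption: the margin of a pair is |delta NDCG| obtained by swapping the
    two documents in the ranking induced by the current weights.
    """
    scores = X @ w
    order = np.argsort(-scores, kind="stable")
    rank = np.empty(len(y), dtype=int)
    rank[order] = np.arange(len(y))
    ideal = sum(dcg_gain(r, k) for k, r in enumerate(sorted(y, reverse=True)))

    best_loss, best_pair = 0.0, None
    for i in range(len(y)):
        for j in range(len(y)):
            if y[i] <= y[j]:
                continue  # only pairs where document i should outrank document j
            swapped = dcg_gain(y[i], rank[j]) + dcg_gain(y[j], rank[i])
            current = dcg_gain(y[i], rank[i]) + dcg_gain(y[j], rank[j])
            margin = abs(current - swapped) / ideal
            loss = max(0.0, margin - (scores[i] - scores[j]))
            if loss > best_loss:  # max-loss selection: keep the worst violator
                best_loss, best_pair = loss, (i, j)

    if best_pair is None:
        return w  # passive step: no pair violates its margin
    i, j = best_pair
    diff = X[i] - X[j]
    norm_sq = float(diff @ diff)
    if norm_sq == 0.0:
        return w  # identical feature vectors cannot be separated
    tau = min(C, best_loss / norm_sq)  # PA-I style step size (assumed variant)
    return w + tau * diff
```

Calling `pa_selective_update` once per sampled query gives an online training loop. The selection step in this sketch still scans all pairs of the query, but only one pair contributes to the weight update.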