Abstract: Collaborative filtering (CF) has proved effective for recommendation and is widely used in recommender systems for online stores. The method finds similarities among users based on their rating scores and recommends items based on the choices of similar users. User similarity is computed with distance metrics and vector similarity measures. However, the effectiveness of CF methods is limited by several problems, such as the new-item problem and the difficulty of recommending items in the long tail. Data sparsity, meaning that the user rating matrix contains few scores, makes it difficult to find relationships among users for recommendation. It is therefore particularly important to design new similarity metrics that are based on the inherent relationships between items rather than on user rating scores. In this paper, we introduce an approach that uses ontology-based similarity to estimate missing values in the user rating matrix. To accommodate different item features, we investigate several methods for estimating the similarity of item ontologies, such as Tversky’s similarity, Spearman’s rank correlation coefficient, and Latent Dirichlet Allocation. Missing rating scores are filled in by a mechanism based on the similarity of the item ontologies. With the new rating matrix, the original CF method achieves better recall. Experiments on the Hetrec’11 dataset were carried out to evaluate the proposed methods using Top-N recall metrics. The results show the effectiveness of the proposed method compared with state-of-the-art approaches in new-item cold-start and long-tail situations.
Keywords: ontology similarity; recommender system; matrix factorization; data sparsity