Abstract: In recent years, near-duplicate image detection has become one of the most important problems in image retrieval, and it is widely used in applications such as copyright-violation detection and forged-image detection. In this paper, we propose a novel approach to automatically detect near-duplicate images based on the visual word model. SIFT descriptors, an effective means of detecting local image features in computer vision research, are used to represent the visual content of an image. We then cluster the SIFT features of a given image into several clusters with the K-means algorithm. The centroid of each cluster is regarded as a visual word, and all the centroids together form the visual word vocabulary. To reduce the time cost of near-duplicate image detection, locality sensitive hashing is used to map high-dimensional visual features into a low-dimensional hash bucket space, and the visual features of the image are then converted to a histogram. Next, for a pair of images, we present a local-feature-based image similarity estimation method that computes the distance between their histograms, from which near-duplicate images can be detected. Finally, a series of experiments is conducted to evaluate performance, and an analysis of the experimental results is also given.
Keywords: Near Duplicate Image; Visual Word Model; SIFT; Hash Function; Locality Sensitive Hashing
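
The hashing and histogram-comparison steps summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the hash family, the histogram distance, or the decision threshold, so the sketch assumes random-hyperplane hashing, a normalized L1 (total variation) distance, and an arbitrary threshold of 0.3. All function names are hypothetical, the K-means vocabulary step is omitted, and synthetic 128-dimensional vectors stand in for real SIFT descriptors.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes (assumed hash family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_bucket(descriptor, hyperplanes):
    """Map one descriptor to a bucket index: one sign bit per hyperplane."""
    bucket = 0
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, descriptor))
        bucket = (bucket << 1) | (1 if dot >= 0 else 0)
    return bucket


def bucket_histogram(descriptors, hyperplanes):
    """Count descriptors per bucket and normalize so the bins sum to 1."""
    hist = [0.0] * (1 << len(hyperplanes))
    for d in descriptors:
        hist[lsh_bucket(d, hyperplanes)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def histogram_distance(h1, h2):
    """Half the L1 distance (assumed metric); lies in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def is_near_duplicate(desc_a, desc_b, hyperplanes, threshold=0.3):
    """Flag a pair as near-duplicate when the distance is below an
    illustrative threshold (0.3 is an assumption, not from the paper)."""
    h_a = bucket_histogram(desc_a, hyperplanes)
    h_b = bucket_histogram(desc_b, hyperplanes)
    return histogram_distance(h_a, h_b) < threshold


def synthetic_descriptors(count, dim, seed):
    """Stand-in for SIFT output: random vectors, not real image features."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


def perturb(descriptors, sigma, seed):
    """Simulate a lightly edited copy by adding small Gaussian noise."""
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, sigma) for x in d] for d in descriptors]


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=8, dim=128, seed=1)
    original = synthetic_descriptors(200, 128, seed=2)
    edited = perturb(original, sigma=0.01, seed=3)
    unrelated = synthetic_descriptors(200, 128, seed=4)
    print("edited copy:", is_near_duplicate(original, edited, planes))
    print("unrelated:  ", is_near_duplicate(original, unrelated, planes))
```

In practice the descriptors would come from a SIFT extractor and the threshold would be tuned on labeled image pairs. The number of hash bits trades histogram resolution against sparsity: more bits give finer buckets but fewer descriptors per bucket.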