Journal: Indian Journal of Education and Information Management
Print ISSN: 2277-5367
Electronic ISSN: 2277-5374
Publication year: 2016
Volume: 5
Issue: 4
Pages: 1-7
Language: English
Publisher: Indian Society for Education and Environment
Other abstract: Objectives: To reduce the length of hash codes in Locality-Sensitive Hashing (LSH). Methods: A heterogeneous information network is a network in which computers and other devices running different operating systems are connected together. Heterogeneous information networks are receiving growing attention, but data mining on them is more difficult. Similarity join, which measures the relationship between any two objects, strings, or nodes, is important for many applications such as online advertising and friend recommendation. In this paper, we consider the semantic meaning behind paths to return the top-k similar pairs through the Path-based Similarity join (PS-join) method. Expensive computations are then removed by using bucket-based data-dependent hashing, because LSH is more expensive and involves tedious processes such as holding lengthier hash codes and solving the approximate near neighbor problem. Findings: The proposed data-dependent hashing reduces the computation, memory, and storage costs of hash codes and also overcomes the approximate near neighbor problem. The experimental results show that the proposed technique works more efficiently than the existing technique in terms of recall, running time, and error ratio. Application/Improvements: PS-join with data-dependent hashing is proposed to increase recall and to reduce computation time and error ratio.
Keywords: Heterogeneous Information Network; Top K Similar Pairs; Similarity Join; Path Based Similarity Join; Local Sensitive Hashing; Data Dependent Hashing