Journal title: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2017
Volume: 95
Issue: 8
Publisher: Journal of Theoretical and Applied
Abstract: Record de-duplication is an important task when merging records from different databases. After de-duplication, tuned results can be provided to users. Existing approaches fail at tuning web databases and removing duplicate records, and they do not provide efficient and effective results [1] [2] [3] [4]. In this paper we design a new prototype for effective and enhanced de-duplication. The prototype design starts with fuzzy clustering and a genetic algorithm. It can control a larger number of duplicate records than other approaches, and it saves more storage and time than other approaches [12] [13]. In distributed databases, the complexity of finding the similarity factor is very high. The existing techniques are not accurate enough to minimize duplication within the same database. In the present work, a new technique is proposed to improve the accuracy level [24]. The proposed work implements a multi-level technical process, such as tuning. The tuning technique finds all types of duplicated documents in the database. All duplicate files are searched across all attributes in sequential order, in a tree fashion. The results are further improved to an optimized and acceptable range with a new duplicate-data detection method based on a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), which further removes unwanted residual files from the database. In view of the problems of previous ranking systems, a new manifold ranking is proposed in the current research work. In the proposed system, the ranking is evaluated with a new multimodality manifold ranking with sink points.
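
The abstract mentions comparing records attribute by attribute and computing a similarity factor. The sketch below only illustrates that general idea and is not the paper's method: it involves no fuzzy clustering, GA, or PSO. The function names, the 0.85 threshold, and the use of the `difflib` ratio from the Python standard library as the similarity factor are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def attribute_similarity(a: str, b: str) -> float:
    """Similarity of two attribute values in [0, 1], ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def record_similarity(r1: dict, r2: dict, attributes: list) -> float:
    """Mean attribute similarity, visiting the attributes in the given order."""
    scores = [
        attribute_similarity(str(r1.get(key, "")), str(r2.get(key, "")))
        for key in attributes
    ]
    return sum(scores) / len(scores) if scores else 0.0


def deduplicate(records: list, attributes: list, threshold: float = 0.85) -> list:
    """Keep the first record of every group of near-duplicates.

    The threshold is an assumed value; a real system would have to tune it.
    """
    kept = []
    for record in records:
        # A record is kept only if it is not too similar to any record kept so far.
        if all(record_similarity(record, k, attributes) < threshold for k in kept):
            kept.append(record)
    return kept


records = [
    {"name": "John Smith", "city": "London"},
    {"name": "Jon Smith", "city": "London"},
    {"name": "Alice Brown", "city": "Paris"},
]
unique_records = deduplicate(records, ["name", "city"])
```

With this input, the second record is treated as a near-duplicate of the first and dropped. The pairwise comparison is quadratic in the number of records, which is one reason approaches such as clustering are used to limit the number of comparisons on larger databases.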