Journal: Bonfring International Journal of Software Engineering and Soft Computing
Print ISSN: 2250-1045
Electronic ISSN: 2277-5099
Publication year: 2018
Volume: 8
Issue: 1
Pages: 23-25
DOI: 10.9756/BIJSESC.8384
Language: English
Publisher: Bonfring
Abstract: At present, duplicate detection methods need to process ever larger datasets in ever shorter time, which makes the datasets difficult to maintain. This project presents a progressive duplicate detection algorithm that gradually increases the efficiency of finding duplicates when execution time is limited: it maximizes the gain of the overall process within the available time by reporting most results early. Experiments show that progressive algorithms can double the efficiency over time of traditional duplicate detection and improve on prior work. Progressive duplicate detection identifies most duplicate pairs early in the detection process. Instead of reducing the overall time needed to finish the entire process, this approach tries to reduce the average time after which a duplicate is found.