Journal title: Indian Journal of Computer Science and Engineering
Print ISSN: 2231-3850
Electronic ISSN: 0976-5166
Publication year: 2021
Volume: 12
Issue: 4
Pages: 1102-1111
DOI: 10.21817/indjcse/2021/v12i4/211204165
Language: English
Publisher: Engg Journals Publications
Abstract: Developing efficient data deduplication methods is both interesting and challenging in data-intensive computing, especially in cloud computing. With the advent of machine learning algorithms, the challenges of data deduplication have been reduced to a great extent, but achieving higher deduplication accuracy remains an open research problem. This paper presents a novel approach that applies active feed-forward learning models to data deduplication in the context of digital gazette records. The proposed framework extracts several similarity features, such as semantic similarity vectors and timestamp vectors, to improve the efficiency of the supervised active feed-forward learning models. Comprehensive experiments were carried out using different machine learning algorithms, and performance metrics such as deduplication accuracy, precision, and recall, along with time complexity, were calculated and analyzed. Simulation results show that the proposed active learning models outperform the other learning models and are more efficient for data deduplication.