Journal: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Electronic ISSN: 2320-9801
Year of publication: 2017
Volume: 5
Issue: 7
Page: 13431
DOI: 10.15680/IJIRCCE.2017.0507088
Publisher: S&S Publications
Abstract: Data and content are increasingly stored in the cloud using cloud storage services. The huge volume of data from every client can strain cloud storage, and redundant content in particular makes the storage burden worse. De-duplication is commonly used to reduce the storage cost and resource requirements of cloud data services by eliminating redundant data and storing only a single copy. De-duplication is most effective when multiple users outsource the same data to a cloud storage service, but it raises several issues relating to search and security. Data mining is an effective way to address such problems in cloud services. This paper surveys various techniques and methods used to detect duplicate records in cloud storage services.
Keywords: Cloud storage; Duplicate document; near duplicate pages; near duplicate detection; Detection; approaches; data mining.
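
The single-copy idea described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the class name `DedupStore` and its methods are hypothetical, and it assumes exact-duplicate detection by SHA-256 content hash, which does not cover the near-duplicate detection the survey also discusses.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store (hypothetical): identical byte strings are kept once."""

    def __init__(self):
        self.blobs = {}  # content digest -> stored bytes (one copy per distinct content)
        self.files = {}  # (user, filename) -> content digest

    def put(self, user, name, data):
        """Record a file for a user; return True only if new bytes had to be stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # Every upload gets a reference, even when the content already exists.
        self.files[(user, name)] = digest
        return is_new

    def get(self, user, name):
        """Return the bytes referenced by a user's file name."""
        return self.blobs[self.files[(user, name)]]


store = DedupStore()
first = store.put("alice", "report.txt", b"quarterly numbers")
second = store.put("bob", "copy.txt", b"quarterly numbers")  # same content, second user
```

Here the second upload adds only a reference, so the store holds one copy of the shared content while both users can still read it. Because every reference to a given content points at the same stored copy regardless of who uploaded it, a design like this also hints at the security concerns the abstract mentions.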