Journal: International Journal of Innovative Research in Science, Engineering and Technology
Print ISSN: 2347-6710
Online ISSN: 2319-8753
Year: 2017
Volume: 6
Issue: 7
Page: 15422
DOI: 10.15680/IJIRSET.2017.0607383
Publisher: S&S Publications
Abstract: Data deduplication is a specialized data compression technique. It has been widely adopted to save backup time as well as storage space, especially in backup storage systems. The rapid growth of data and its associated costs has created a need to optimize the storage and transfer of data. Deduplication has proven to be a highly effective technology for removing redundancy in backup data storage. With the explosive growth in data volume, the I/O bottleneck has become an increasingly daunting challenge for big data analytics in the Cloud. Previous studies have shown that slight to high data redundancy does exist in primary storage systems in the Cloud. Furthermore, directly applying data deduplication to primary storage systems in the Cloud is likely to cause space contention in physical memory and data fragmentation on disks. Based on these observations, we propose a performance-oriented I/O deduplication scheme, called POD, rather than a capacity-oriented I/O deduplication scheme, exemplified by iDedup, to improve the I/O performance of primary storage systems in the Cloud without sacrificing the capacity savings of the latter. POD takes a two-pronged approach to improving the performance of primary storage systems and minimizing the performance overhead of deduplication: a request-based selective deduplication technique, called select-Dedupe, to alleviate data fragmentation, and an adaptive memory management scheme, called iCache, to ease the memory contention between bursty read traffic and bursty write traffic. Our evaluation results also show that POD achieves comparable or better capacity savings than iDedup.
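
For readers unfamiliar with the underlying idea, the following is a minimal sketch of generic fingerprint-based deduplication. It is not POD, select-Dedupe, iCache, or iDedup, none of which the abstract specifies in detail. The fixed 4 KiB chunk size, the use of SHA-256 as the fingerprint, and the `DedupStore` class with its method names are all assumptions made purely for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size in bytes (illustrative only)


class DedupStore:
    """Hypothetical in-memory store that keeps one copy of each unique chunk."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes (unique chunks only)
        self.files = {}   # file name -> ordered list of fingerprints

    def write(self, name, data):
        """Split data into fixed-size chunks and store only unseen chunks."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # A chunk whose fingerprint is already present is not stored again.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        self.files[name] = recipe

    def read(self, name):
        """Rebuild the file by concatenating its chunks in recipe order."""
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def stored_bytes(self):
        """Total bytes physically held after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupStore()
payload = b"A" * (3 * CHUNK_SIZE) + b"B" * CHUNK_SIZE  # three identical chunks plus one distinct
store.write("example.bin", payload)
```

In this sketch the 16 KiB payload occupies only two unique chunks. Because a file is rebuilt from chunks that may be shared with other files, its chunks are not necessarily stored contiguously, which hints at the kind of fragmentation concern the abstract raises for on-disk primary storage.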