Journal: International Journal of Innovative Research in Science, Engineering and Technology
Print ISSN: 2347-6710
Online ISSN: 2319-8753
Publication year: 2014
Issue: ICETS
Page: 343
Publisher: S&S Publications
Abstract: A large number of cloud services require users to share private data, such as electronic health records, for data analysis or mining, which raises privacy concerns. Anonymizing data sets via generalization to satisfy privacy requirements such as k-anonymity is a widely used category of privacy-preserving techniques. At present, the scale of data in many cloud applications is increasing tremendously in accordance with the Big Data trend, making it a challenge for commonly used software tools to capture, manage, and process such large-scale data within a tolerable elapsed time. As a result, it is a challenge for existing anonymization approaches to achieve privacy preservation on privacy-sensitive large-scale data sets, owing to their insufficient scalability. This paper introduces a scalable two-phase top-down specialization approach to anonymize large-scale data sets using the MapReduce framework on the cloud. In both phases of the approach, a group of innovative MapReduce jobs is deliberately designed to concretely accomplish the specialization computation in a highly scalable way. Experimental evaluation results demonstrate that, with this approach, the scalability and efficiency of top-down specialization can be improved significantly over existing approaches. The paper also introduces a scheduling mechanism called Optimized Balanced Scheduling (OBS) for applying the anonymization. Under OBS, each individual data set has its own sensitive field, and the sensitive field of every data set is given a priority. Anonymization is then applied only to that sensitive field, depending on the scheduling.
Keywords: Data Anonymization; Top-Down Specialization; Map-Reduce; Cloud; Privacy Preservation; OBS; Data Partition; Data Merging