期刊名称:International Journal of Innovative Research in Science, Engineering and Technology
印刷版ISSN:2347-6710
电子版ISSN:2319-8753
出版年度:2016
卷号:5
期号:9
页码:16086
DOI:10.15680/IJIRSET.2016.0509025
出版社:S&S Publications
摘要:Recently there has been a growing interest in designing differentially private data mining algorithms. Avariety of algorithms have been proposed for mining frequent itemsets. Frequent itemset mining (FIM) is one of themost fundamental problems in data mining. It has practical importance in a wide range of application areas such asdecision support, web usage mining, bioinformatics, etc. In this paper, It explore the possibility of designing adifferentially private FIM algorithm which can not only achieve high data utility and a high degree of privacy, but alsooffer high time efficiency. To this end, a differentially private FIM algorithm based on the FP-growth algorithm, whichis referred to as PFP-growth. The PFP-growth consist of a preprocessing phase and a mining phase. In thepreprocessing phase, to improve the utility and privacy tradeoff, a novel smart splitting method is proposed totransform the database. For a given database, the preprocessing phase needs to be performed only once. In the miningphase, to offset the information loss caused by transaction splitting,we devise a run-time estimation method to estimatethe actual support of itemsets in the original database. In addition, by leveraging the downward closure property, weput forward a dynamic reduction method to dynamically reduce the amount of noise added to guarantee privacy duringthe mining process. Through formal privacy analysis, PFP-growth algorithm is differentially private. Extensiveexperiments on real datasets illustrate that in PFP-growth algorithm substantially outperforms the state-of-the-arttechnique.
关键词:Private FIM algorithm; Smart splitting; Run time Estimation