Article Information

  • Title: Measuring Rule Retention in Anonymized Data - When One Measure Is Not Enough
  • Authors: Sam Fletcher ; Md Zahidul Islam
  • Journal: Transactions on Data Privacy
  • Print ISSN: 1888-5063
  • Electronic ISSN: 2013-1631
  • Publication year: 2017
  • Volume: 10
  • Issue: 3
  • Pages: 175-201
  • Language: English
  • Publisher: IIIA-CSIC
  • Abstract: In this paper, we explore how anonymizing data to preserve privacy affects the utility of the classification rules discoverable in the data. In order for an analysis of anonymized data to provide useful results, the data should have as much of the information contained in the original data as possible. Therein lies a problem - how does one make sure that anonymized data still contains the information it had before anonymization? This question is not the same as asking if an accurate classifier can be built from the anonymized data. Often in the literature, the prediction accuracy of a classifier made from anonymized data is used as evidence that the data are similar to the original. We demonstrate that this is not the case, and we propose a new methodology for measuring the retention of the rules that existed in the original data. We then use our methodology to design three measures that can be easily implemented, each measuring aspects of the data that no pre-existing techniques can measure. These measures do not negate the usefulness of prediction accuracy or other measures - they are complementary to them, and support our argument that one measure is almost never enough.
  • Keywords: Machine Learning; Data Mining; Privacy; Patterns; Rules; Utility Measures