Article Information

  • Title: The Impact of Synthetic Data Generation on Data Utility with Application to the 1991 UK Samples of Anonymised Records
  • Authors: Jennifer Taub; Mark Elliot; Joseph W. Sakshaug
  • Journal: Transactions on Data Privacy
  • Print ISSN: 1888-5063
  • Electronic ISSN: 2013-1631
  • Year of publication: 2020
  • Volume: 13
  • Issue: 1
  • Pages: 1-23
  • Publisher: IIIA-CSIC
  • Abstract: Synthetic data generation has been proposed as a flexible alternative to more traditional statistical disclosure control (SDC) methods for minimising disclosure risk. However, a barrier to the use of synthetic data is the uncertainty about the reliability and validity of the results that are derived from these data. Surprisingly, there has been a relative dearth of research on how to measure the utility of synthetic data. Utility measures developed to date have been either information theoretic abstractions or somewhat arbitrary collations of statistics, and replication of previously published results has been rare. In this paper, we adopt a methodology previously used by Purdam and Elliot (2007), in which they replicated published analyses using disclosure-controlled versions of the same microdata used in said analyses and then evaluated the impact of disclosure control on the analytic outcomes. We utilise the same studies as Purdam and Elliot, based on the 1991 UK Samples of Anonymised Records, to facilitate comparisons of synthetic data utility between different utility metrics.
  • Keywords: Synthetic data; CART; multiple imputation; utility metrics