Abstract: Synthetic data generation has been proposed as a flexible alternative to more traditional statistical disclosure control (SDC) methods for minimising disclosure risk. However, a barrier to the use of synthetic data is uncertainty about the reliability and validity of results derived from these data. Surprisingly, there has been a relative dearth of research on how to measure the utility of synthetic data. Utility measures developed to date have been either information-theoretic abstractions or somewhat arbitrary collations of statistics, and replication of previously published results has been rare. In this paper, we adopt a methodology previously used by Purdam and Elliot (2007), in which they replicated published analyses using disclosure-controlled versions of the same microdata used in those analyses and then evaluated the impact of disclosure control on the analytic outcomes. We utilise the same studies as Purdam and Elliot, based on the 1991 UK Samples of Anonymised Records, to facilitate comparisons of synthetic data utility between different utility metrics.