摘要:Set-valued database publication has been increasing its importance recently due to its benefit for various applications such as marketing analysis and advertising. However, publishing a raw set-valued database may cause individual privacy breach such as the leakage of sensitive information like personal tendencies when data recipients perform data analysis. Even though imposing data anonymization methods such as suppression-based methods and random data swapping methods to such a database can successfully hide personal tendency, it induces item loss from records and causes significant distortion in record structure that degrades database utility. To avoid the problems, we proposed a method based on swapping technique where an individual’s items in a record are swapped to items of the other record. Our swapping technique is distinct from existing one called random data swapping which yields much structure distortion. Even though the technique results in inaccuracy at a record level, it can preserve every single item in a database from loss. Thus, data recipients may obtain all the item information in an anonymized database. In addition, by carefully selecting a pair of records for item swapping, we can avoid excessive record structure distortion that leads to alter database content immensely. More importantly, such a strategy allows one to successfully hide personal tendency without sacrificing a lot of database utility.
关键词:set-valued database; data anonymization; hiding personal tendency; swapping technique set-valued database ; data anonymization ; hiding personal tendency ; swapping technique