摘要:Personal information is increasingly gathered and used for providing services tailored to user preferences, but the datasets used to provide such functionality can represent serious privacy threats if not appropriately protected. Work in privacy-preserving data publishing targeted privacy guarantees that protect against record re-identification, by making records indistinguishable, or sensitive attribute value disclosure, by introducing diversity or noise in the sensitive values. However, most approaches fail in the high-dimensional case, and the ones that don't introduce a utility cost incompatible with tailored recommendation scenarios. This paper aims at a sensible trade-off between privacy and the benefits of tailored recommendations, in the context of privacy-preserving data publishing. We empirically demonstrate that significant privacy improvements can be achieved at a utility cost compatible with tailored recommendation scenarios, using a simple partition-based sanitization method
关键词:Recommender systems; Data anonymization and sanitization; Privacy-preserving data ; publishing; Rating prediction; Tailored recommendations; High-dimensional datasets