首页    期刊浏览 2024年11月28日 星期四
登录注册

文章基本信息

  • 标题:String Sanitization Under Edit Distance
  • 本地全文:下载
  • 作者:Giulia Bernardini ; Huiping Chen ; Grigorios Loukides
  • 期刊名称:LIPIcs : Leibniz International Proceedings in Informatics
  • 电子版ISSN:1868-8969
  • 出版年度:2020
  • 卷号:161
  • 页码:7:1-7:14
  • DOI:10.4230/LIPIcs.CPM.2020.7
  • 出版社:Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik
  • 摘要:Let W be a string of length n over an alphabet Σ, k be a positive integer, and ð'® be a set of length-k substrings of W. The ETFS problem asks us to construct a string X_{ED} such that: (i) no string of ð'® occurs in X_{ED}; (ii) the order of all other length-k substrings over Σ is the same in W and in X_{ED}; and (iii) X_{ED} has minimal edit distance to W. When W represents an individual’s data and ð'® represents a set of confidential substrings, algorithms solving ETFS can be applied for utility-preserving string sanitization [Bernardini et al., ECML PKDD 2019]. Our first result here is an algorithm to solve ETFS in ð'ª(kn²) time, which improves on the state of the art [Bernardini et al., arXiv 2019] by a factor of Σ . Our algorithm is based on a non-trivial modification of the classic dynamic programming algorithm for computing the edit distance between two strings. Notably, we also show that ETFS cannot be solved in ð'ª(n^{2-δ}) time, for any δ>0, unless the strong exponential time hypothesis is false. To achieve this, we reduce the edit distance problem, which is known to admit the same conditional lower bound [Bringmann and Künnemann, FOCS 2015], to ETFS.
  • 关键词:String algorithms; data sanitization; edit distance; dynamic programming; conditional lower bound
国家哲学社会科学文献中心版权所有