首页    期刊浏览 2024年11月24日 星期日
登录注册

文章基本信息

  • 标题:Mind the Gap: Essentially Optimal Algorithms for Online Dictionary Matching with One Gap
  • 本地全文:下载
  • 作者:Amihood Amir ; Tsvi Kopelowitz ; Avivit Levy
  • 期刊名称:LIPIcs : Leibniz International Proceedings in Informatics
  • 电子版ISSN:1868-8969
  • 出版年度:2016
  • 卷号:64
  • 页码:12:1-12:12
  • DOI:10.4230/LIPIcs.ISAAC.2016.12
  • 出版社:Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik
  • 摘要:We examine the complexity of the online Dictionary Matching with One Gap Problem (DMOG) which is the following. Preprocess a dictionary D of d patterns, where each pattern contains a special gap symbol that can match any string, so that given a text that arrives online, a character at a time, we can report all of the patterns from D that are suffixes of the text that has arrived so far, before the next character arrives. In more general versions the gap symbols are associated with bounds determining the possible lengths of matching strings. Online DMOG captures the difficulty in a bottleneck procedure for cyber-security, as many digital signatures of viruses manifest themselves as patterns with a single gap. In this paper, we demonstrate that the difficulty in obtaining efficient solutions for the DMOG problem, even in the offline setting, can be traced back to the infamous 3SUM conjecture. We show a conditional lower bound of Omega(delta(G_D)+op) time per text character, where G_D is a bipartite graph that captures the structure of D, delta(G_D) is the degeneracy of this graph, and op is the output size. Moreover, we show a conditional lower bound in terms of the magnitude of gaps for the bounded case, thereby showing that some known offline upper bounds are essentially optimal. We also provide matching upper-bounds (up to sub-polynomial factors), in terms of the degeneracy, for the online DMOG problem. In particular, we introduce algorithms whose time cost depends linearly on delta(G_D). Our algorithms make use of graph orientations, together with some additional techniques. These algorithms are of practical interest since although delta(G_D) can be as large as sqrt(d), and even larger if G_D is a multi-graph, it is typically a very small constant in practice. Finally, when delta(G_D) is large we are able to obtain even more efficient solutions.
  • 关键词:Pattern matching; Dictionary matching; 3SUM; Triangle reporting
国家哲学社会科学文献中心版权所有