Article information

  • Title: Range-Efficient Consistent Sampling and Locality-Sensitive Hashing for Polygons
  • Authors: Joachim Gudmundsson; Rasmus Pagh
  • Journal: LIPIcs : Leibniz International Proceedings in Informatics
  • Electronic ISSN: 1868-8969
  • Year of publication: 2017
  • Volume: 92
  • Pages: 42:1-42:13
  • DOI: 10.4230/LIPIcs.ISAAC.2017.42
  • Publisher: Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik
  • Abstract: Locality-sensitive hashing (LSH) is a fundamental technique for similarity search and similarity estimation in high-dimensional spaces. The basic idea is that similar objects should produce hash collisions with probability significantly larger than objects with low similarity. We consider LSH for objects that can be represented as point sets in either one or two dimensions. To make the point sets finite, we consider the subset of points on a grid. Directly applying LSH (e.g. min-wise hashing) to these point sets would require time proportional to the number of points. We seek to achieve time that is much lower than direct approaches. Technically, we introduce new primitives for range-efficient consistent sampling (of independent interest), and show how to turn such samples into LSH values. Another application of our technique is a data structure for quickly estimating the size of the intersection or union of a set of preprocessed polygons. Curiously, our consistent sampling method uses a transformation to a geometric problem.
  • Keywords: Locality-sensitive hashing; probability distribution; polygon; min-wise hashing; consistent sampling
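
The abstract mentions the direct approach of applying min-wise hashing to the grid points of an object, which takes time proportional to the number of points. The following is a minimal sketch of that baseline only, not of the range-efficient method the paper introduces. The helper names (`point_hash`, `minhash_signature`, `estimate_jaccard`, `grid_rectangle`), the use of BLAKE2b as the hash function, and the axis-aligned rectangles standing in for polygons are all assumptions made for illustration.

```python
import hashlib


def point_hash(point, seed):
    """Hash a grid point (x, y) to a 64-bit integer; `seed` selects the hash function."""
    data = f"{seed}:{point[0]},{point[1]}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(points, num_hashes):
    """Direct min-wise hashing: every hash function scans every point,
    so the cost is proportional to num_hashes * len(points)."""
    return [min(point_hash(p, seed) for p in points) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def grid_rectangle(x0, y0, x1, y1):
    """Grid points of the half-open rectangle [x0, x1) x [y0, y1)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


# Two overlapping 20 x 20 rectangles, shifted by 5 grid cells along x.
a = grid_rectangle(0, 0, 20, 20)
b = grid_rectangle(5, 0, 25, 20)

true_jaccard = len(a & b) / len(a | b)  # exact value, computed by enumeration
sig_a = minhash_signature(a, 200)
sig_b = minhash_signature(b, 200)
estimate = estimate_jaccard(sig_a, sig_b)

print(f"true Jaccard: {true_jaccard:.3f}, min-hash estimate: {estimate:.3f}")
```

Each signature entry here requires a full pass over the point set, which is the cost the abstract describes as proportional to the number of points and which the paper aims to beat.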