Article Information

  • Title: A Bloom Filter for High Dimensional Vectors
  • Authors: Chunyan Shuai; Hengcheng Yang; Xin Ouyang
  • Journal: Information
  • Electronic ISSN: 2078-2489
  • Year of publication: 2018
  • Volume: 9
  • Issue: 7
  • Page: 159
  • DOI: 10.3390/info9070159
  • Language: English
  • Publisher: MDPI Publishing
  • Abstract: Regardless of the type of data, traditional Bloom filters treat each element of a set as a string and, by iterating over every character of the string, discretize all data randomly and uniformly. However, as data size and dimensionality increase, these variants become inefficient. To better discretize high-dimensional numerical vectors, this paper converts the string hashes into integer hashes. Based on the integer hashes and a counter array, we propose a new variant, the high-dimensional Bloom filter (HDBF), which extends the Bloom filter into high-dimensional spaces and can represent and query the numerical vectors of a large set with a low false positive probability. This paper theoretically analyzes the feasibility of using integer hashes to discretize data and discusses the relationships among the parameters of the HDBF. The experiments illustrate that, in high-dimensional numerical spaces, the HDBF shows better randomness in distribution and entropy than the counting Bloom filter. Compared with the parallel Bloom filters, for a fixed false positive probability, the HDBF displays time-space overheads, and is more suitable for handling high-dimensional numerical vectors.
  • Keywords: Bloom filter; high-dimensional numerical vector; high-dimensional Bloom filter; integer hash functions
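
The abstract describes combining integer hashes with a counter array so that whole numerical vectors can be inserted and queried. The sketch below is only an illustration of that general idea and is not the HDBF construction from the paper, whose hash functions and parameters are not given in this record. It assumes integer-valued vector components, swaps in the SplitMix64 finalizer as the integer mixing step, and derives the k counter indexes by double hashing. The names `mix64` and `CountingVectorFilter` are hypothetical.

```python
from typing import List, Sequence

MASK64 = (1 << 64) - 1


def mix64(x: int) -> int:
    """SplitMix64 finalizer, used here as a stand-in integer hash."""
    x &= MASK64
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK64
    return x ^ (x >> 31)


class CountingVectorFilter:
    """Counting Bloom filter keyed on integer vectors (illustrative only)."""

    def __init__(self, num_counters: int, num_hashes: int) -> None:
        self.m = num_counters
        self.k = num_hashes
        self.counters = [0] * num_counters

    def _indexes(self, vector: Sequence[int]) -> List[int]:
        # Fold the components into two 64-bit states using integer
        # arithmetic only; no component is ever converted to a string.
        h1, h2 = 0, 0x9E3779B97F4A7C15
        for component in vector:
            c = component & MASK64
            h1 = mix64(h1 ^ c)
            h2 = mix64((h2 + c) & MASK64)
        h2 |= 1  # keep the double-hashing stride non-zero
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, vector: Sequence[int]) -> None:
        for idx in self._indexes(vector):
            self.counters[idx] += 1

    def contains(self, vector: Sequence[int]) -> bool:
        # True may be a false positive; False is always correct.
        return all(self.counters[idx] > 0 for idx in self._indexes(vector))

    def remove(self, vector: Sequence[int]) -> None:
        # Counters (rather than bits) are what make deletion possible.
        if self.contains(vector):
            for idx in self._indexes(vector):
                self.counters[idx] -= 1


bloom = CountingVectorFilter(num_counters=1024, num_hashes=4)
sample = [3, 1, 4, 1, 5, 9, 2, 6]
bloom.add(sample)
print(bloom.contains(sample))
```

Real-valued vectors would first need to be quantized to integers; how the paper handles that step is not stated in this record.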