Abstract: In information retrieval, the minwise hashing algorithm is often used to estimate similarities among documents. b-bit minwise hashing gains substantial advantages in computational efficiency and storage space by storing only the lowest b bits of each (minwise) hashed value (e.g., b = 1 or 2). In this paper, we propose a fractional bit hashing method, which extends the existing b-bit minwise hashing. We show theoretically that fractional bit hashing offers a wider range of choices for trading off accuracy against storage space requirements. Theoretical analysis and experimental results demonstrate the effectiveness of this method.
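
To make the idea of storing only the lowest b bits concrete, the following is a minimal Python sketch of the existing b-bit minwise hashing scheme that the proposed method extends. It is an illustration under stated assumptions, not the implementation used in the paper: the function names, the linear hash family modulo a Mersenne prime, and the representation of documents as non-empty sets of integer feature IDs are all hypothetical choices. The estimator assumes the sparse-data regime, in which two unrelated b-bit values collide with probability of roughly 2^-b.

```python
import random

# Assumption: a Mersenne prime serves as the modulus of a simple linear hash family.
PRIME = (1 << 61) - 1


def make_hash_params(k, seed=0):
    """Draw k (a, c) pairs defining the hash functions h(x) = (a*x + c) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]


def b_bit_signature(items, params, b):
    """Keep only the lowest b bits of each minwise hashed value.

    `items` is assumed to be a non-empty collection of integer feature IDs.
    """
    mask = (1 << b) - 1
    return [min((a * x + c) % PRIME for x in items) & mask for a, c in params]


def estimate_resemblance(sig1, sig2, b):
    """Estimate resemblance from two b-bit signatures.

    Assumption: sparse data, so the match probability is approximately
    2**-b + (1 - 2**-b) * R; solving for R gives the estimate below.
    """
    matches = sum(u == v for u, v in zip(sig1, sig2))
    p_hat = matches / len(sig1)
    c = 2.0 ** -b
    return (p_hat - c) / (1.0 - c)


if __name__ == "__main__":
    params = make_hash_params(k=500, seed=42)
    doc1 = set(range(0, 1000))
    doc2 = set(range(500, 1500))  # true resemblance is 500 / 1500
    s1 = b_bit_signature(doc1, params, b=2)
    s2 = b_bit_signature(doc2, params, b=2)
    print(estimate_resemblance(s1, s2, b=2))
```

Each signature entry occupies b bits rather than a full hash value, which is where the storage saving comes from. The price is a noisier estimate for a fixed number of hash functions k, so a smaller b generally calls for a larger k to reach the same accuracy.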