LSH Banding for Large-Scale Retrieval with Memory and Recall Constraints
Venue
International Conference on Acoustics, Speech, and Signal Processing, IEEE (2009)
Publication Year
2009
Authors
Michele Covell, Shumeet Baluja
BibTeX
Abstract
Locality Sensitive Hashing (LSH) is widely used for efficient retrieval of
candidate matches in very large audio, video, and image systems. However, extremely
large reference databases necessitate a guaranteed limit on the memory used by the
table lookup itself, no matter how the entries crowd different parts of the
signature space, a guarantee that LSH does not give. In this paper, we provide such
guaranteed limits, primarily through the design of the LSH bands. When combined
with data-adaptive bin splitting (needed on only 0.04% of the occupied bins), this
approach provides the required guarantee on memory usage. At the same time, it
avoids the reduced recall that more extensive use of bin splitting would give.
