Poster
Multiscale Quantization for Fast Similarity Search
Xiang Wu · Ruiqi Guo · Ananda Theertha Suresh · Sanjiv Kumar · Daniel Holtmann-Rice · David Simcha · Felix Yu

Tue Dec 5th 06:30 -- 10:30 PM @ Pacific Ballroom #83

We propose a multiscale quantization approach for fast similarity search on large, high-dimensional datasets. The key insight of the approach is that quantization methods, in particular product quantization, perform poorly when there is large variance in the norms of the data points. This is a common scenario for real-world datasets, especially when doing product quantization of residuals obtained from coarse vector quantization. To address this issue, we propose a multiscale formulation where we learn a separate scalar quantizer of the residual norm scales. All parameters are learned jointly in a stochastic gradient descent framework to minimize the overall quantization error. We provide theoretical motivation for the proposed technique and conduct comprehensive experiments on two large-scale public datasets, demonstrating substantial improvements in recall over existing state-of-the-art methods.
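
The pipeline described in the abstract (coarse vector quantization, then product quantization of the residuals, with a separate scalar quantizer for the residual norm scales) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract says all parameters are learned jointly with stochastic gradient descent, whereas this sketch fits each stage independently with plain k-means, which is a deliberate simplification. The function names `kmeans` and `multiscale_reconstruct`, the codebook sizes, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every centroid.
        d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return centroids, d.argmin(1)

def multiscale_reconstruct(x, n_coarse=4, n_sub=2, n_codes=8, n_scales=4):
    """Quantize x and return its reconstruction (illustrative only)."""
    # 1. Coarse vector quantization, then take residuals.
    coarse, c_idx = kmeans(x, n_coarse)
    residual = x - coarse[c_idx]

    # 2. Split each residual into a norm (scale) and a unit direction.
    norms = np.linalg.norm(residual, axis=1, keepdims=True)
    unit = residual / np.maximum(norms, 1e-12)

    # 3. Product quantization of the unit-norm residuals:
    #    one independent codebook per contiguous subspace.
    sub_dim = x.shape[1] // n_sub
    unit_hat = np.empty_like(unit)
    for s in range(n_sub):
        cols = slice(s * sub_dim, (s + 1) * sub_dim)
        codebook, idx = kmeans(unit[:, cols], n_codes, seed=s + 1)
        unit_hat[:, cols] = codebook[idx]

    # 4. Separate scalar quantizer for the residual norms
    #    (assumption: 1-D k-means stands in for the learned quantizer).
    scale_codebook, s_idx = kmeans(norms, n_scales, seed=99)

    # Reconstruction: coarse center + quantized scale * quantized direction.
    return coarse[c_idx] + scale_codebook[s_idx] * unit_hat

# Synthetic data whose norms vary widely, the regime the abstract targets.
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 8)) * rng.lognormal(0.0, 1.0, size=(200, 1))
recon = multiscale_reconstruct(x)
mse = float(((x - recon) ** 2).sum(1).mean())
print("mean squared reconstruction error:", mse)
```

Separating the scale from the direction means the product quantizer only has to model unit-norm vectors, so its codebooks are not stretched to cover residuals of very different magnitudes. How much this helps on real data depends on the joint training described in the abstract, which this sketch does not reproduce.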

Author Information

Xiang Wu (Google)
Ruiqi Guo (Google)
Ananda Theertha Suresh (Google)
Sanjiv Kumar (Google Research)
Daniel Holtmann-Rice (Google Inc)
David Simcha (Google)
Felix Yu (Google Research)
