`

Timezone: »

 
Data-dependent methods for similarity search in high dimensions
Piotr Indyk

I will give an overview of recent research on designing provable data-dependent approximate similarity search methods for high-dimensional data. Unlike many approaches proposed in the algorithms literature, the new methods lead to data structures that adapt to the data while retaining correctness and running time guarantees. The high-level principle behind those methods is that  every data set (even a completely random one) has some structure that can be exploited algorithmically.  This interpolates between basic "data-oblivious" approaches (e.g., those based on  simple random projections) that are analyzable but do not adapt easily to the structure of the data, and basic "data-dependent" methods that exploit specific structure of the data, but might not perform well if this structure is not present.

Concretely, I will cover two recent lines of research: - Data-dependent Locality-Sensitive Hashing: approximate nearest neighbor search methods that provably improve over the best possible data-oblivious LSH algorithms (covers work from SODA'14, STOC'15 and NIPS'15). - Data-dependent metric compression: algorithms that represent sets of points using a small number of bits while approximately preserving the distances up to higher accuracy than previously known (covers work from SODA'17 and NIPS'17).

Author Information

Piotr Indyk (MIT)

More from the Same Authors

  • 2021 Poster: Few-Shot Data-Driven Algorithms for Low Rank Approximation »
    Piotr Indyk · Tal Wagner · David Woodruff
  • 2019 : Poster Session »
    Lili Yu · Aleksei Kroshnin · Alex Delalande · Andrew Carr · Anthony Tompkins · Aram-Alexandre Pooladian · Arnaud Robert · Ashok Vardhan Makkuva · Aude Genevay · Bangjie Liu · Bo Zeng · Charlie Frogner · Elsa Cazelles · Esteban G Tabak · Fabio Ramos · François-Pierre PATY · Georgios Balikas · Giulio Trigila · Hao Wang · Hinrich Mahler · Jared Nielsen · Karim Lounici · Kyle Swanson · Mukul Bhutani · Pierre Bréchet · Piotr Indyk · samuel cohen · Stefanie Jegelka · Tao Wu · Thibault Sejourne · Tudor Manole · Wenjun Zhao · Wenlin Wang · Wenqi Wang · Yonatan Dukler · Zihao Wang · Chaosheng Dong
  • 2019 : Poster Session »
    Jonathan Scarlett · Piotr Indyk · Ali Vakilian · Adrian Weller · Partha P Mitra · Benjamin Aubin · Bruno Loureiro · Florent Krzakala · Lenka Zdeborová · Kristina Monakhova · Joshua Yurtsever · Laura Waller · Hendrik Sommerhoff · Michael Moeller · Rushil Anirudh · Shuang Qiu · Xiaohan Wei · Zhuoran Yang · Jayaraman Thiagarajan · Salman Asif · Michael Gillhofer · Johannes Brandstetter · Sepp Hochreiter · Felix Petersen · Dhruv Patel · Assad Oberai · Akshay Kamath · Sushrut Karmalkar · Eric Price · Ali Ahmed · Zahra Kadkhodaie · Sreyas Mohan · Eero Simoncelli · Carlos Fernandez-Granda · Oscar Leong · Wesam Sakla · Rebecca Willett · Stephan Hoyer · Jascha Sohl-Dickstein · Samuel Greydanus · Gauri Jagatap · Chinmay Hegde · Michael Kellman · Jonathan Tamir · Nouamane Laanait · Ousmane Dia · Mirco Ravanelli · Jonathan Binas · Negar Rostamzadeh · Shirin Jalali · Tiantian Fang · Alex Schwing · Sébastien Lachapelle · Philippe Brouillard · Tristan Deleu · Simon Lacoste-Julien · Stella Yu · Arya Mazumdar · Ankit Singh Rawat · Yue Zhao · Jianshu Chen · Xiaoyang Li · Hubert Ramsauer · Gabrio Rizzuti · Nikolaos Mitsakos · Dingzhou Cao · Thomas Strohmer · Yang Li · Pei Peng · Gregory Ongie
  • 2019 : Learning-Based Low-Rank Approximations »
    Piotr Indyk
  • 2019 Poster: Estimating Entropy of Distributions in Constant Space »
    Jayadev Acharya · Sourbh Bhadane · Piotr Indyk · Ziteng Sun
  • 2019 Poster: Learning-Based Low-Rank Approximations »
    Piotr Indyk · Ali Vakilian · Yang Yuan
  • 2019 Poster: Space and Time Efficient Kernel Density Estimation in High Dimensions »
    Arturs Backurs · Piotr Indyk · Tal Wagner
  • 2017 Poster: Practical Data-Dependent Metric Compression with Provable Guarantees »
    Piotr Indyk · Ilya Razenshteyn · Tal Wagner
  • 2017 Poster: On the Fine-Grained Complexity of Empirical Risk Minimization: Kernel Methods and Neural Networks »
    Arturs Backurs · Piotr Indyk · Ludwig Schmidt
  • 2016 Poster: Fast recovery from a union of subspaces »
    Chinmay Hegde · Piotr Indyk · Ludwig Schmidt
  • 2015 Poster: Practical and Optimal LSH for Angular Distance »
    Alexandr Andoni · Piotr Indyk · Thijs Laarhoven · Ilya Razenshteyn · Ludwig Schmidt
  • 2014 Workshop: Optimal Transport and Machine Learning »
    Marco Cuturi · Gabriel Peyré · Justin Solomon · Alexander Barvinok · Piotr Indyk · Robert McCann · Adam Oberman