The notion of similarity (or distance) is central to many problems in machine learning: information retrieval, nearest-neighbor-based prediction, visualization of high-dimensional data, etc. Historically, similarity was estimated via a fixed distance function (typically Euclidean), sometimes engineered by hand using domain knowledge. Learning similarity functions with statistical learning methods instead is appealing, and over the last decade this problem has attracted much attention in the community, with several publications in NIPS, ICML, AISTATS, CVPR, etc.
Much of this work, however, has focused on a specific, restricted approach: learning a Mahalanobis distance, under a variety of objectives and constraints. This effectively limits the setup to learning a linear embedding of the data.
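To make the link between a Mahalanobis distance and a linear embedding concrete, here is a minimal sketch in plain Python. It assumes the Mahalanobis matrix is factored as M = LᵀL, which is always possible for a positive semidefinite M; the matrix `L` and the points `x` and `y` are hypothetical values chosen only for illustration. Under that assumption, the Mahalanobis distance between two points equals the Euclidean distance between their images under the linear map x ↦ Lx.

```python
import math

def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def gram(L):
    """Return M = L^T L, which is positive semidefinite by construction."""
    k, d = len(L), len(L[0])
    return [[sum(L[r][i] * L[r][j] for r in range(k)) for j in range(d)]
            for i in range(d)]

def mahalanobis(x, y, M):
    """Compute sqrt((x - y)^T M (x - y))."""
    diff = [a - b for a, b in zip(x, y)]
    M_diff = matvec(M, diff)
    return math.sqrt(sum(d_i * m_i for d_i, m_i in zip(diff, M_diff)))

def euclidean(u, v):
    """Ordinary Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical 2x3 linear map (illustrative values, not from any paper).
L = [[1.0, 0.5, 0.0],
     [0.0, 2.0, -1.0]]
M = gram(L)

x = [1.0, 2.0, 3.0]
y = [0.0, -1.0, 4.0]

d_mahalanobis = mahalanobis(x, y, M)                 # distance under M
d_embedded = euclidean(matvec(L, x), matvec(L, y))   # distance after x -> Lx

print(d_mahalanobis, d_embedded)  # the two values agree up to rounding
```

Because the two quantities coincide, any method that only learns M can express nothing beyond a linear transformation of the input followed by Euclidean distance.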
In this workshop, we will look beyond this setup and consider methods that learn nonlinear embeddings of the data, either explicitly via nonlinear mappings or implicitly via kernels. We will especially encourage discussion of methods that are suitable for the large-scale problems increasingly facing practitioners of learning methods: large numbers of examples, high dimensionality of the original space, and/or massively multi-class problems (e.g., classification with 10,000+ categories and the 10,000,000 images of the ImageNet dataset).
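The explicit and implicit routes to a nonlinear embedding can be compared in a small sketch. It assumes a homogeneous degree-2 polynomial kernel on 2-D inputs, picked here only because its feature map is easy to write down; the function names and sample points are hypothetical. The distance in the kernel-induced feature space can be computed from kernel evaluations alone, via d²(x, y) = k(x, x) − 2k(x, y) + k(y, y), and it matches the Euclidean distance between the explicitly mapped points.

```python
import math

def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel on 2-D inputs: (x . y)^2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def poly2_features(x):
    """Explicit feature map phi with phi(x) . phi(y) == poly2_kernel(x, y)."""
    return [x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2]

def kernel_distance(x, y, k):
    """Distance in the feature space of kernel k, using only kernel values."""
    squared = k(x, x) - 2.0 * k(x, y) + k(y, y)
    return math.sqrt(max(squared, 0.0))  # clamp tiny negative rounding errors

def euclidean(u, v):
    """Ordinary Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical sample points.
x = [1.0, 2.0]
y = [3.0, -1.0]

d_implicit = kernel_distance(x, y, poly2_kernel)                 # via kernel
d_explicit = euclidean(poly2_features(x), poly2_features(y))     # via phi

print(d_implicit, d_explicit)  # the two values agree up to rounding
```

The implicit form never builds the feature vectors, which matters when the feature space is very high-dimensional or infinite-dimensional. It does require kernel evaluations between pairs of points, which is one reason scalability is a concern for kernel-based similarity learning.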
Our goals are to
1. Create a comprehensive understanding of the state of the art in similarity learning, via presentation of recent work,
2. Initiate an in-depth discussion on major open questions brought up by research in this area. Among these questions:
* Are there gains to be made from introducing nonlinearity into similarity models?
* When the underlying task is prediction (classification or regression), are similarity functions worth learning, instead of attacking the prediction task directly? A closely related question: when is it beneficial to use nearest-neighbor-based methods with learned similarity?
* What is the right loss (or objective) function to minimize in similarity learning?
* It is often claimed that inherent structure in real data (e.g., low-dimensional manifolds) makes learning easier. How, if at all, does this affect similarity learning?
* What are the similarities and distinctions between learning similarity functions and learning hashing?
* What is the relationship between unsupervised similarity learning (often framed as dimensionality reduction) and supervised similarity learning?
* Are there models of learning nonlinear similarities for which bounds (e.g., generalization error, regret bounds) can be proven?
* What algorithmic techniques must be employed or developed to scale nonlinear similarity learning to extremely large data sets?
We will encourage the invited speakers to address these questions in their talks, and will steer the panel discussion towards some of these.
The target audience of this workshop consists of two (overlapping) groups:
* practitioners of machine learning who deal with large-scale problems where the ability to more accurately predict similarity values is important, and
* core machine learning researchers working on learning similarity/distance/metric and on similarity-based prediction methods.
Author Information
Greg Shakhnarovich (TTI-Chicago)
Dhruv Batra (Georgia Tech / Facebook AI Research (FAIR))
Brian Kulis (UC Berkeley)
Kilian Q Weinberger (Washington University in St. Louis)
More from the Same Authors

2020 Poster: Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data »
Michael Cogswell · Jiasen Lu · Rishabh Jain · Stefan Lee · Devi Parikh · Dhruv Batra 
2019 Poster: ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks »
Jiasen Lu · Dhruv Batra · Devi Parikh · Stefan Lee 
2019 Poster: Chasing Ghosts: Instruction Following as Bayesian State Tracking »
Peter Anderson · Ayush Shrivastava · Devi Parikh · Dhruv Batra · Stefan Lee 
2018 Workshop: Visually grounded interaction and language »
Florian Strub · Harm de Vries · Erik Wijmans · Samyak Datta · Ethan Perez · Mateusz Malinowski · Stefan Lee · Peter Anderson · Aaron Courville · Jeremie Mary · Dhruv Batra · Devi Parikh · Olivier Pietquin · Chiori Hori · Tim Marks · Anoop Cherian
2017 Workshop: Visually grounded interaction and language »
Florian Strub · Harm de Vries · Abhishek Das · Satwik Kottur · Stefan Lee · Mateusz Malinowski · Olivier Pietquin · Devi Parikh · Dhruv Batra · Aaron Courville · Jeremie Mary 
2017 Poster: Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model »
Jiasen Lu · Anitha Kannan · Jianwei Yang · Devi Parikh · Dhruv Batra 
2016 Poster: Hierarchical Question-Image Co-Attention for Visual Question Answering »
Jiasen Lu · Jianwei Yang · Dhruv Batra · Devi Parikh 
2016 Poster: Depth from a Single Image by Harmonizing Overcomplete Local Network Predictions »
Ayan Chakrabarti · Jingyu Shao · Greg Shakhnarovich 
2016 Poster: Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles »
Stefan Lee · Senthil Purushwalkam · Michael Cogswell · Viresh Ranjan · David Crandall · Dhruv Batra 
2015 Poster: Fast Distributed k-Center Clustering with Outliers on Massive Data »
Gustavo Malkomes · Matt J Kusner · Wenlin Chen · Kilian Q Weinberger · Benjamin Moseley 
2015 Poster: SubmodBoxes: Near-Optimal Search for a Set of Diverse Object Proposals »
Qing Sun · Dhruv Batra 
2014 Workshop: Representation and Learning Methods for Complex Outputs »
Richard Zemel · Dale Schuurmans · Kilian Q Weinberger · Yuhong Guo · Jia Deng · Francesco Dinuzzo · Hal Daumé III · Honglak Lee · Noah A Smith · Richard Sutton · Jiaqian Yu · Vitaly Kuznetsov · Luke Vilnis · Hanchen Xiong · Calvin Murdock · Thomas Unterthiner · Jean-Francis Roy · Martin Renqiang Min · Hichem Sahbi · Fabio Massimo Zanzotto
2014 Workshop: Discrete Optimization in Machine Learning »
Jeff Bilmes · Andreas Krause · Stefanie Jegelka · S Thomas McCormick · Sebastian Nowozin · Yaron Singer · Dhruv Batra · Volkan Cevher 
2014 Poster: Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets »
Adarsh Prasad · Stefanie Jegelka · Dhruv Batra 
2014 Poster: Discriminative Metric Learning by Neighborhood Gerrymandering »
Shubhendu Trivedi · David McAllester · Greg Shakhnarovich
2014 Spotlight: Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets »
Adarsh Prasad · Stefanie Jegelka · Dhruv Batra 
2013 Workshop: Output Representation Learning »
Yuhong Guo · Dale Schuurmans · Richard Zemel · Samy Bengio · Yoshua Bengio · Li Deng · Dan Roth · Kilian Q Weinberger · Jason Weston · Kihyuk Sohn · Florent Perronnin · Gabriel Synnaeve · Pablo R Strasser · Julien Audiffren · Carlo Ciliberto · Dan Goldwasser
2012 Poster: Multiple Choice Learning: Learning to Produce Multiple Structured Outputs »
Abner Guzmán-Rivera · Dhruv Batra · Pushmeet Kohli
2012 Poster: Nonlinear Metric Learning »
Dor Kedem · Stephen Tyree · Kilian Q Weinberger · Fei Sha · Gert Lanckriet 
2011 Poster: Co-Training for Domain Adaptation »
Minmin Chen · Kilian Q Weinberger · John Blitzer 
2010 Session: Oral Session 16 »
Kilian Q Weinberger 
2010 Poster: Sparse Coding for Learning Interpretable Spatio-Temporal Primitives »
Taehwan Kim · Greg Shakhnarovich · Raquel Urtasun 
2010 Poster: Large Margin Multi-Task Metric Learning »
Shibin Parameswaran · Kilian Q Weinberger 
2010 Poster: Decoding Ipsilateral Finger Movements from ECoG Signals in Humans »
Yuzong Liu · Mohit Sharma · Charles M Gaona · Jonathan D Breshears · Jarod Roland · Zachary V Freudenburg · Kilian Q Weinberger · Eric C Leuthardt
2009 Poster: Learning to Hash with Binary Reconstructive Embeddings »
Brian Kulis · Trevor Darrell 
2009 Spotlight: Learning to Hash with Binary Reconstructive Embeddings »
Brian Kulis · Trevor Darrell 
2008 Poster: Large Margin Taxonomy Embedding for Document Categorization »
Kilian Q Weinberger · Olivier Chapelle 
2008 Spotlight: Large Margin Taxonomy Embedding for Document Categorization »
Kilian Q Weinberger · Olivier Chapelle 
2006 Workshop: Novel Applications of Dimensionality Reduction »
John Blitzer · Rajarshi Das · Irina Rish · Kilian Q Weinberger 
2006 Poster: Graph Regularization for Maximum Variance Unfolding with an Application to Sensor Localization »
Kilian Q Weinberger · Fei Sha · Qihui Zhu · Lawrence Saul