Timezone: »
Embedding complex objects as vectors in low dimensional spaces is a longstanding problem in machine learning. We propose in this work an extension of that approach, which consists in embedding objects as elliptical probability distributions, namely distributions whose densities have elliptical level sets. We endow these measures with the 2-Wasserstein metric, with two important benefits: (i) For such measures, the squared 2-Wasserstein metric has a closed form, equal to a weighted sum of the squared Euclidean distance between means and the squared Bures metric between covariance matrices. The latter is a Riemannian metric between positive semi-definite matrices, which turns out to be Euclidean on a suitable factor representation of such matrices, which is valid on the entire geodesic between these matrices. (ii) The 2-Wasserstein distance boils down to the usual Euclidean metric when comparing Diracs, and therefore provides a natural framework to extend point embeddings. We show that for these reasons Wasserstein elliptical embeddings are more intuitive and yield tools that are better behaved numerically than the alternative choice of Gaussian embeddings with the Kullback-Leibler divergence. In particular, and unlike previous work based on the KL geometry, we learn elliptical distributions that are not necessarily diagonal. We demonstrate the advantages of elliptical embeddings by using them for visualization, to compute embeddings of words, and to reflect entailment or hypernymy.
Author Information
Boris Muzellec (ENSAE ParisTech)
Marco Cuturi (Google Brain & CREST - ENSAE)
Marco Cuturi is a research scientist at Apple, in Paris. He received his Ph.D. in 11/2005 from the Ecole des Mines de Paris in applied mathematics. Before that he graduated from National School of Statistics (ENSAE) with a master degree (MVA) from ENS Cachan. He worked as a post-doctoral researcher at the Institute of Statistical Mathematics, Tokyo, between 11/2005 and 3/2007 and then in the financial industry between 4/2007 and 9/2008. After working at the ORFE department of Princeton University as a lecturer between 2/2009 and 8/2010, he was at the Graduate School of Informatics of Kyoto University between 9/2010 and 9/2016 as a tenured associate professor. He joined ENSAE in 9/2016 as a professor, where he is now working part-time. He was at Google between 10/2018 and 1/2022. His main employment is now with Apple, since 1/2022, as a research scientist working on fundamental aspects of machine learning.
More from the Same Authors
-
2021 : Linear-Time Gromov Wasserstein Distances using Low Rank Couplings and Costs »
Meyer Scetbon · Gabriel Peyré · Marco Cuturi -
2021 : Linear-Time Gromov Wasserstein Distances using Low Rank Couplings and Costs »
Meyer Scetbon · Gabriel Peyré · Marco Cuturi -
2023 : Simulation-based Inference for Cardiovascular Models »
Antoine Wehenkel · Jens Behrmann · Andy Miller · Guillermo Sapiro · Ozan Sener · Marco Cuturi · Joern-Henrik Jacobsen -
2023 Workshop: Optimal Transport and Machine Learning »
Anna Korba · Aram-Alexandre Pooladian · Charlotte Bunne · David Alvarez-Melis · Marco Cuturi · Ziv Goldfeld -
2023 Poster: Unbalanced Low-rank Optimal Transport Solvers »
Meyer Scetbon · Michal Klein · Giovanni Palla · Marco Cuturi -
2022 Poster: Supervised Training of Conditional Monge Maps »
Charlotte Bunne · Andreas Krause · Marco Cuturi -
2022 Poster: Efficient and Modular Implicit Differentiation »
Mathieu Blondel · Quentin Berthet · Marco Cuturi · Roy Frostig · Stephan Hoyer · Felipe Llinares-Lopez · Fabian Pedregosa · Jean-Philippe Vert -
2022 Poster: FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings »
Jean Ogier du Terrail · Samy-Safwan Ayed · Edwige Cyffers · Felix Grimberg · Chaoyang He · Regis Loeb · Paul Mangold · Tanguy Marchand · Othmane Marfoq · Erum Mushtaq · Boris Muzellec · Constantin Philippenko · Santiago Silva · Maria Teleńczuk · Shadi Albarqouni · Salman Avestimehr · Aurélien Bellet · Aymeric Dieuleveut · Martin Jaggi · Sai Praneeth Karimireddy · Marco Lorenzi · Giovanni Neglia · Marc Tommasi · Mathieu Andreux -
2022 Poster: Low-rank Optimal Transport: Approximation, Statistics and Debiasing »
Meyer Scetbon · Marco Cuturi -
2022 Poster: SecureFedYJ: a safe feature Gaussianization protocol for Federated Learning »
Tanguy Marchand · Boris Muzellec · Constance Béguier · Jean Ogier du Terrail · Mathieu Andreux -
2021 Workshop: Optimal Transport and Machine Learning »
Jason Altschuler · Charlotte Bunne · Laetitia Chapel · Marco Cuturi · Rémi Flamary · Gabriel Peyré · Alexandra Suvorikova -
2020 Poster: Projection Robust Wasserstein Distance and Riemannian Optimization »
Tianyi Lin · Chenyou Fan · Nhat Ho · Marco Cuturi · Michael Jordan -
2020 Poster: Fixed-Support Wasserstein Barycenters: Computational Hardness and Fast Algorithm »
Tianyi Lin · Nhat Ho · Xi Chen · Marco Cuturi · Michael Jordan -
2020 Spotlight: Projection Robust Wasserstein Distance and Riemannian Optimization »
Tianyi Lin · Chenyou Fan · Nhat Ho · Marco Cuturi · Michael Jordan -
2020 Poster: Learning with Differentiable Pertubed Optimizers »
Quentin Berthet · Mathieu Blondel · Olivier Teboul · Marco Cuturi · Jean-Philippe Vert · Francis Bach -
2020 Poster: Entropic Optimal Transport between Unbalanced Gaussian Measures has a Closed Form »
Hicham Janati · Boris Muzellec · Gabriel Peyré · Marco Cuturi -
2020 Poster: Linear Time Sinkhorn Divergences using Positive Features »
Meyer Scetbon · Marco Cuturi -
2020 Oral: Entropic Optimal Transport between Unbalanced Gaussian Measures has a Closed Form »
Hicham Janati · Boris Muzellec · Gabriel Peyré · Marco Cuturi -
2020 Session: Orals & Spotlights Track 21: Optimization »
Peter Richtarik · Marco Cuturi -
2019 Workshop: Optimal Transport for Machine Learning »
Marco Cuturi · Gabriel Peyré · Rémi Flamary · Alexandra Suvorikova -
2019 Poster: Subspace Detours: Building Transport Plans that are Optimal on Subspace Projections »
Boris Muzellec · Marco Cuturi -
2019 Poster: Differentiable Ranking and Sorting using Optimal Transport »
Marco Cuturi · Olivier Teboul · Jean-Philippe Vert -
2019 Spotlight: Differentiable Ranking and Sorting using Optimal Transport »
Marco Cuturi · Olivier Teboul · Jean-Philippe Vert -
2019 Poster: Tree-Sliced Variants of Wasserstein Distances »
Tam Le · Makoto Yamada · Kenji Fukumizu · Marco Cuturi -
2018 Poster: Large Scale computation of Means and Clusters for Persistence Diagrams using Optimal Transport »
Theo Lacombe · Marco Cuturi · Steve OUDOT -
2017 Workshop: Optimal Transport and Machine Learning »
Olivier Bousquet · Marco Cuturi · Gabriel Peyré · Fei Sha · Justin Solomon -
2017 Tutorial: A Primer on Optimal Transport »
Marco Cuturi · Justin Solomon -
2016 Workshop: Time Series Workshop »
Oren Anava · Marco Cuturi · Azadeh Khaleghi · Vitaly Kuznetsov · Sasha Rakhlin -
2016 Poster: Wasserstein Training of Restricted Boltzmann Machines »
Grégoire Montavon · Klaus-Robert Müller · Marco Cuturi -
2016 Poster: Stochastic Optimization for Large-scale Optimal Transport »
Aude Genevay · Marco Cuturi · Gabriel Peyré · Francis Bach -
2015 Poster: Principal Geodesic Analysis for Probability Measures under the Optimal Transport Metric »
Vivien Seguy · Marco Cuturi -
2014 Workshop: Optimal Transport and Machine Learning »
Marco Cuturi · Gabriel Peyré · Justin Solomon · Alexander Barvinok · Piotr Indyk · Robert McCann · Adam Oberman -
2013 Poster: Sinkhorn Distances: Lightspeed Computation of Optimal Transport »
Marco Cuturi -
2013 Spotlight: Sinkhorn Distances: Lightspeed Computation of Optimal Transport »
Marco Cuturi -
2009 Poster: White Functionals for Anomaly Detection in Dynamical Systems »
Marco Cuturi · Jean-Philippe Vert · Alexandre d'Aspremont -
2006 Poster: Kernels on Structured Objects Through Nested Histograms »
Marco Cuturi · Kenji Fukumizu