Stochastic Gradient Langevin Dynamics (SGLD) has emerged as a key MCMC algorithm for Bayesian learning from large-scale datasets. While SGLD with decreasing step sizes converges weakly to the posterior distribution, the algorithm is often used with a constant step size in practice and has demonstrated spectacular successes in machine learning tasks. The current practice is to set the step size inversely proportional to N, where N is the number of training samples. As N becomes large, we show that the SGLD algorithm has an invariant probability measure which significantly departs from the target posterior and behaves like Stochastic Gradient Descent (SGD). This difference is inherently due to the high variance of the stochastic gradients. Several strategies have been suggested to reduce this effect; among them, SGLD Fixed Point (SGLDFP) uses carefully designed control variates to reduce the variance of the stochastic gradients. We show that SGLDFP gives approximate samples from the posterior distribution, with an accuracy comparable to the Langevin Monte Carlo (LMC) algorithm for a computational cost sublinear in the number of data points. We provide a detailed analysis of the Wasserstein distances between LMC, SGLD, SGLDFP and SGD and explicit expressions of the means and covariance matrices of their invariant distributions. Our findings are supported by limited numerical experiments.
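To make the contrast concrete, the sketch below (not the authors' code) compares a plain SGLD update with an SGLDFP-style update that uses control variates centered at a reference point, on a toy Gaussian mean model. The model, minibatch size, step size, and the choice of the MAP as reference point are illustrative assumptions; the intent is only to show how centering the stochastic gradient at a fixed point reduces its variance.

```python
# Minimal sketch, assuming a toy model y_i ~ N(theta, 1) with prior theta ~ N(0, 1).
# Hyperparameters (N, n, gamma, number of iterations) are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
N, n = 10_000, 32                      # dataset size, minibatch size
y = rng.normal(loc=1.0, scale=1.0, size=N)

def grad_log_prior(theta):
    return -theta                      # d/dtheta log N(theta | 0, 1)

def grad_log_lik(theta, batch):
    return np.sum(batch - theta)       # d/dtheta sum_i log N(y_i | theta, 1)

theta_star = np.sum(y) / (N + 1.0)     # exact MAP for this conjugate model
full_grad_at_star = grad_log_prior(theta_star) + grad_log_lik(theta_star, y)

def sgld_step(theta, gamma):
    batch = y[rng.integers(0, N, size=n)]
    g = grad_log_prior(theta) + (N / n) * grad_log_lik(theta, batch)
    return theta + gamma * g + np.sqrt(2.0 * gamma) * rng.normal()

def sgldfp_step(theta, gamma):
    batch = y[rng.integers(0, N, size=n)]
    # control variate: reuse the full-data gradient computed once at theta_star,
    # and only estimate the difference of minibatch gradients
    g = full_grad_at_star + (N / n) * (grad_log_lik(theta, batch)
                                       - grad_log_lik(theta_star, batch))
    return theta + gamma * g + np.sqrt(2.0 * gamma) * rng.normal()

gamma = 1.0 / N                        # step size proportional to 1/N, as above
th_sgld, th_fp = theta_star, theta_star
samples_sgld, samples_fp = [], []
for _ in range(20_000):
    th_sgld = sgld_step(th_sgld, gamma)
    th_fp = sgldfp_step(th_fp, gamma)
    samples_sgld.append(th_sgld)
    samples_fp.append(th_fp)

print("exact posterior variance:", 1.0 / (N + 1.0))
print("SGLD   sample variance:  ", np.var(samples_sgld[5_000:]))
print("SGLDFP sample variance:  ", np.var(samples_fp[5_000:]))
```

In this quadratic example the log-likelihood gradient is linear in theta, so the control variate removes the minibatch noise entirely and SGLDFP coincides with LMC; in general it only reduces the gradient variance, which is the effect analyzed in the paper.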
Author Information
Nicolas Brosse (Ecole Polytechnique, Palaiseau, FRANCE)
Alain Durmus (ENS)
Eric Moulines (Ecole Polytechnique)
More from the Same Authors
- 2021 Poster: Federated-EM with heterogeneity mitigation and variance reduction
  Aymeric Dieuleveut · Gersende Fort · Eric Moulines · Geneviève Robin
- 2021 Poster: NEO: Non Equilibrium Sampling on the Orbits of a Deterministic Transform
  Achille Thin · Yazid Janati El Idrissi · Sylvain Le Corff · Charles Ollion · Eric Moulines · Arnaud Doucet · Alain Durmus · Christian X Robert
- 2021 Poster: Fast Approximation of the Sliced-Wasserstein Distance Using Concentration of Random Projections
  Kimia Nadjahi · Alain Durmus · Pierre E Jacob · Roland Badeau · Umut Simsekli
- 2021 Poster: Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize
  Alain Durmus · Eric Moulines · Alexey Naumov · Sergey Samsonov · Kevin Scaman · Hoi-To Wai
- 2020 Poster: A Stochastic Path Integral Differential EstimatoR Expectation Maximization Algorithm
  Gersende Fort · Eric Moulines · Hoi-To Wai
- 2019 Spotlight: Asymptotic Guarantees for Learning Generative Models with the Sliced-Wasserstein Distance
  Kimia Nadjahi · Alain Durmus · Umut Simsekli · Roland Badeau
- 2019 Poster: On the Global Convergence of (Fast) Incremental Expectation Maximization Methods
  Belhal Karimi · Hoi-To Wai · Eric Moulines · Marc Lavielle
- 2018 Poster: Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames
  Geneviève Robin · Hoi-To Wai · Julie Josse · Olga Klopp · Eric Moulines
- 2018 Spotlight: Low-rank Interaction with Sparse Additive Effects Model for Large Data Frames
  Geneviève Robin · Hoi-To Wai · Julie Josse · Olga Klopp · Eric Moulines