Poster in Workshop: Optimal Transport and Machine Learning
Improved Stein Variational Gradient Descent with Importance Weights
Lukang Sun · Peter Richtarik
Abstract:
Stein Variational Gradient Descent~(\algname{SVGD}) is a popular sampling algorithm used in various machine learning tasks. It is well known that \algname{SVGD} arises from a discretization of the kernelized gradient flow of the Kullback-Leibler divergence $\KL(\cdot \mid \pi)$, where $\pi$ is the target distribution. In this work, we propose to enhance \algname{SVGD} via the introduction of {\em importance weights}, which leads to a new method for which we coin the name \algname{$\beta$-SVGD}. In the continuous-time and infinite-particle regime, the time for this flow to converge to the equilibrium distribution $\pi$, quantified by the Stein Fisher information, depends only very weakly on $\rho_0$ and $\pi$. This is very different from the kernelized gradient flow of the Kullback-Leibler divergence, whose time complexity depends on $\KL(\rho_0 \mid \pi)$. Under certain assumptions, we provide a descent lemma for the population-limit \algname{$\beta$-SVGD}, which covers the descent lemma for the population-limit \algname{SVGD} when $\beta \to 0$. We also illustrate the advantages of \algname{$\beta$-SVGD} over \algname{SVGD} through experiments.
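As background for the particle update the abstract refers to, below is a minimal sketch (in Python with NumPy, not taken from the paper) of one step of standard \algname{SVGD} with an RBF kernel. The importance weights that define \algname{$\beta$-SVGD} are not reproduced here, since the abstract does not give their exact form; the function and parameter names (rbf_kernel, svgd_step, step_size, bandwidth h) are illustrative choices, not the authors' notation.

import numpy as np

def rbf_kernel(X, h=1.0):
    # Pairwise differences: diffs[j, i] = x_j - x_i
    diffs = X[:, None, :] - X[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    K = np.exp(-sq_dists / (2 * h ** 2))            # K[j, i] = k(x_j, x_i)
    grad_K = -diffs / h ** 2 * K[:, :, None]        # grad_K[j, i] = grad_{x_j} k(x_j, x_i)
    return K, grad_K

def svgd_step(X, grad_log_pi, step_size=0.1, h=1.0):
    # One SVGD update: x_i <- x_i + eps * phi(x_i), with
    # phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad log pi(x_j) + grad_{x_j} k(x_j, x_i) ]
    n = X.shape[0]
    K, grad_K = rbf_kernel(X, h)
    scores = grad_log_pi(X)                         # (n, d) array of grad log pi(x_j)
    phi = (K.T @ scores + grad_K.sum(axis=0)) / n   # attraction to high density + repulsion
    return X + step_size * phi

# Example: drive particles toward a standard Gaussian target, grad log pi(x) = -x.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) + 5.0                 # particles initialized far from pi
for _ in range(200):
    X = svgd_step(X, lambda X: -X)

Repeated application of svgd_step moves the particle cloud toward samples from $\pi$; $\beta$-SVGD, as described in the abstract, modifies this flow through importance weights so that the convergence time depends only weakly on the initial distribution $\rho_0$.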