On Symmetries in Variational Bayesian Neural Nets
Richard Kurle · Tim Januschowski · Jan Gasthaus · Bernie Wang
Event URL: https://openreview.net/forum?id=o1rbV-8wfCN

Probabilistic inference of neural network parameters is challenging due to highly multi-modal likelihood functions. Most importantly, the permutation invariance of the neurons in the hidden layers renders the likelihood function unidentifiable, with a factorial number of equivalent (symmetric) modes, independent of the data. We show that variational Bayesian methods that approximate the (multi-modal) posterior by a (uni-modal) Gaussian distribution are biased towards approximations with identical (e.g. zero-centred) weights, resulting in severe underfitting. This explains the common empirical observation that, in contrast to MCMC methods, variational approximations typically collapse most weights to the (zero-centred) prior. We propose a simple modification to the likelihood function that breaks the symmetry using fixed semi-orthogonal matrices as skip connections in each layer. Initial empirical results show improved predictive performance.
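
The permutation symmetry described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the one-hidden-layer network, the `tanh` activation, the helper names (`mlp`, `permute_hidden`, `semi_orthogonal`), and in particular the way the fixed skip matrix is added to the hidden activations are all assumptions made for illustration. The abstract only states that fixed semi-orthogonal matrices are used as skip connections in each layer.

```python
import numpy as np


def mlp(x, W1, b1, W2, skip=None):
    """One-hidden-layer MLP (illustrative only).

    `skip` is an optional FIXED (hidden x input) matrix added to the hidden
    activations. This placement is an assumption, not taken from the paper.
    """
    h = np.tanh(x @ W1.T + b1)
    if skip is not None:
        h = h + x @ skip.T
    return h @ W2.T


def permute_hidden(W1, b1, W2, perm):
    """Relabel the hidden units: permute rows of W1/b1 and columns of W2."""
    return W1[perm], b1[perm], W2[:, perm]


def semi_orthogonal(rows, cols, rng):
    """Random matrix with orthonormal columns (requires rows >= cols)."""
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q


rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 8, 2  # hypothetical sizes
x = rng.standard_normal((5, n_in))
W1 = rng.standard_normal((n_hidden, n_in))
b1 = rng.standard_normal(n_hidden)
W2 = rng.standard_normal((n_out, n_hidden))

# A fixed, non-identity relabelling of the hidden units.
perm = np.roll(np.arange(n_hidden), 1)
W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm)

# Without a skip connection, the permuted weights define the same function.
same_plain = np.allclose(mlp(x, W1, b1, W2), mlp(x, W1p, b1p, W2p))

# With a fixed skip matrix that is NOT permuted, the two weight settings
# generally produce different outputs, so they are no longer equivalent.
S = semi_orthogonal(n_hidden, n_in, rng)
same_skip = np.allclose(mlp(x, W1, b1, W2, S), mlp(x, W1p, b1p, W2p, S))

print("equivalent without skip:", same_plain)
print("equivalent with fixed skip:", same_skip)
```

Under these assumptions, the first comparison holds for any permutation, which is why a layer with 8 hidden units already has 8! = 40320 equivalent weight settings. The second comparison generally fails because the skip matrix stays fixed while the learned weights are relabelled, which is one way to read the symmetry-breaking idea stated above.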

Author Information

Richard Kurle (AWS AI Labs)
Tim Januschowski (Amazon Research)
Jan Gasthaus (Amazon / AWS)
Bernie Wang (AWS AI Labs)
