We will re-examine two popular use-cases of Bayesian approaches: model selection, and robustness to distribution shifts.
The marginal likelihood (Bayesian evidence) provides a distinctive approach to resolving foundational scientific questions --- "how can we choose between models that are entirely consistent with any data?" and "how can we learn hyperparameters or correct ground truth constraints, such as intrinsic dimensionalities, or symmetries, if our training loss doesn't select for them?". There are compelling arguments that the marginal likelihood automatically encodes Occam's razor. There are also widespread practical applications, including the variational ELBO for hyperparameter learning. However, we will discuss how the marginal likelihood is answering a fundamentally different question than "will my trained model provide good generalization?". We consider the discrepancies and their significant practical implications in detail, as well as possible resolutions.
Moreover, it is often thought that Bayesian methods, representing epistemic uncertainty, ought to have more reasonable predictive distributions under covariate shift, since these points will be far from our data manifold. However, we were surprised to find that high quality approximate Bayesian inference often leads to significantly decreased generalization performance. To understand these findings, we investigate fundamentally why Bayesian model averaging can deteriorate predictive performance under distribution and covariate shifts, and provide several remedies based on this understanding.
Andrew Gordon Wilson (New York University)
More from the Same Authors
2021 : Robust Reinforcement Learning for Shifting Dynamics During Deployment »
Samuel Stanton · Rasool Fakoor · Jonas Mueller · Andrew Gordon Wilson · Alexander Smola
2022 : On Representation Learning Under Class Imbalance »
Ravid Shwartz-Ziv · Micah Goldblum · Yucen Li · C. Bayan Bruss · Andrew Gordon Wilson
2022 : Andrew Gordon Wilson: When Bayesian Orthodoxy Can Go Wrong: Model Selection and Out-of-Distribution Generalization »
Andrew Gordon Wilson
2021 Workshop: Bayesian Deep Learning »
Yarin Gal · Yingzhen Li · Sebastian Farquhar · Christos Louizos · Eric Nalisnick · Andrew Gordon Wilson · Zoubin Ghahramani · Kevin Murphy · Max Welling
2021 : Evaluating Approximate Inference in Bayesian Deep Learning + Q&A »
Andrew Gordon Wilson · Pavel Izmailov · Matthew Hoffman · Yarin Gal · Yingzhen Li · Melanie F. Pradier · Sharad Vikram · Andrew Foong · Sanae Lotfi · Sebastian Farquhar
2019 Poster: Exact Gaussian Processes on a Million Data Points »
Ke Alexander Wang · Geoff Pleiss · Jacob Gardner · Stephen Tyree · Kilian Weinberger · Andrew Gordon Wilson
2019 Poster: Function-Space Distributions over Kernels »
Gregory Benton · Wesley Maddox · Jayson Salkey · Julio Albinati · Andrew Gordon Wilson
2019 Poster: A Simple Baseline for Bayesian Uncertainty in Deep Learning »
Wesley Maddox · Pavel Izmailov · Timur Garipov · Dmitry Vetrov · Andrew Gordon Wilson