Almost twenty years ago, Thomas Minka nicely illustrated that Bayesian model averaging (BMA) is different from model combination. Model combination works by enriching the model space, because it considers all possible linear combinations of all the models in the model class, while BMA represents the inability for knowing which is the best single model when using a limited amount data. However, twenty years later, this distinction becomes not so clear in the context of ensembles of deep neural networks: are deep ensembles performing a crude approximation of a highly multi-modal Bayesian posterior? Or, are they exploiting an enriched model space and, in consequence, they should be interpreted in terms of model combination? In this talk, we will introduce recently published theoretical analyses that will shed some light on these questions. As you will see in this talk, whether your model is wrong or not plays a crucial role in the answers to these questions.
Speaker bio: Andres R. Masegosa is an associate professor at the Department of Computer Science at Aalborg University (Copenhagen Campus-Denmark). Previously, he was an assistant professor at the University of Almería (Spain). He got his PhD in Computer Science at the University of Granada in 2009. He is broadly interested in modelling intelligent agents that learn from experience using a probabilistic approach. He has published more than sixty papers in international journals and conferences in the field of machine learning.