Learning a sequence of tasks without access to i.i.d. observations is a widely studied form of continual learning (CL) that remains challenging. In principle, Bayesian learning directly applies to this setting, since recursive and one-off Bayesian updates yield the same result. In practice, however, recursive updating often leads to poor trade-off solutions across tasks because approximate inference is necessary for most models of interest. Here, we describe an alternative Bayesian approach where task-conditioned parameter distributions are continually inferred from data. We offer a practical deep learning implementation of our framework based on probabilistic task-conditioned hypernetworks, an approach we term posterior meta-replay. Experiments on standard benchmarks show that our probabilistic hypernetworks compress sequences of posterior parameter distributions with virtually no forgetting. We obtain considerable performance gains compared to existing Bayesian CL methods, and identify task inference as our major limiting factor. This limitation has several causes that are independent of the considered sequential setting, opening up new avenues for progress in CL.
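The equivalence of recursive and one-off Bayesian updates mentioned in the abstract can be made explicit. A minimal worked equation for two tasks, assuming the datasets D1 and D2 are conditionally independent given the shared parameters θ (an assumption the abstract leaves implicit):

```latex
% Recursive update (condition on D_1, then on D_2) recovers the
% one-off update on the pooled data:
p(\theta \mid \mathcal{D}_1, \mathcal{D}_2)
  \propto p(\mathcal{D}_2 \mid \theta)\, p(\theta \mid \mathcal{D}_1)
  \propto p(\mathcal{D}_2 \mid \theta)\, p(\mathcal{D}_1 \mid \theta)\, p(\theta)
```

In practice, the exact posterior p(θ | D1) must be replaced by an approximation q(θ) before the second update, which is why recursive updating can drift toward poor trade-off solutions across tasks.

To illustrate the kind of architecture the abstract refers to, here is a minimal PyTorch sketch of a probabilistic task-conditioned hypernetwork: a network that maps a learned per-task embedding to the mean and log-variance of a Gaussian approximate posterior over the main network's weights. This is an illustrative sketch, not the authors' released implementation; the class and argument names (ProbabilisticTaskHypernetwork, target_num_params, hidden_dim, emb_dim) are hypothetical.

```python
import torch
import torch.nn as nn

class ProbabilisticTaskHypernetwork(nn.Module):
    """Maps a learned per-task embedding to the parameters (mean and
    log-variance) of a Gaussian posterior over main-network weights."""

    def __init__(self, num_tasks, emb_dim=32, hidden_dim=128,
                 target_num_params=1000):
        super().__init__()
        # One trainable embedding per task identity.
        self.task_embeddings = nn.Embedding(num_tasks, emb_dim)
        # Shared body, reused across all tasks.
        self.body = nn.Sequential(
            nn.Linear(emb_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.mean_head = nn.Linear(hidden_dim, target_num_params)
        self.logvar_head = nn.Linear(hidden_dim, target_num_params)

    def forward(self, task_id):
        h = self.body(self.task_embeddings(torch.tensor([task_id])))
        return self.mean_head(h), self.logvar_head(h)

# Usage: draw one set of main-network weights for task 0 via the
# reparameterization trick, so gradients flow through the sampling step.
hnet = ProbabilisticTaskHypernetwork(num_tasks=5)
mean, logvar = hnet(0)
weights = mean + torch.exp(0.5 * logvar) * torch.randn_like(mean)
```

Under a parameterization like this, learning a new task amounts to adding a task embedding while keeping the hypernetwork's outputs for earlier embeddings close to their previous values (the regularization strategy used in prior hypernetwork-based CL work), which is consistent with the abstract's claim that sequences of posterior distributions can be compressed with virtually no forgetting.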
Author Information
Christian Henning (ETH Zurich)
Maria Cervera (ETH Zurich)
Francesco D'Angelo (ETH Zurich)
Johannes von Oswald (ETH Zurich)
Regina Traber (University of Zurich)
Benjamin Ehret (ETH Zurich)
Seijin Kobayashi (ETH Zurich)
Benjamin F. Grewe (ETH Zurich)
João Sacramento (ETH Zurich)