Modern machine learning methods, including deep learning, have achieved great success in predictive accuracy for supervised learning tasks, but may still fall short in giving useful estimates of their predictive uncertainty. Quantifying uncertainty is especially critical in real-world settings, which often involve input distributions that are shifted from the training distribution due to a variety of factors including sample bias and non-stationarity. In such settings, well-calibrated uncertainty estimates convey information about when a model's output should (or should not) be trusted. Many probabilistic deep learning methods, including Bayesian and non-Bayesian methods, have been proposed in the literature for quantifying predictive uncertainty, but to our knowledge there has not previously been a rigorous large-scale empirical comparison of these methods under dataset shift. We present a large-scale benchmark of existing state-of-the-art methods on classification problems and investigate the effect of dataset shift on accuracy and calibration. We find that traditional post-hoc calibration does indeed fall short, as do several other previous methods. However, some methods that marginalize over models give surprisingly strong results across a broad spectrum of tasks.
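To make the notions of calibration and marginalizing over models concrete, the sketch below computes expected calibration error (ECE), a commonly used calibration metric, and averages class probabilities across several models. This is a minimal illustration and not the benchmark's own code: the function names `expected_calibration_error` and `ensemble_predict`, the equal-width binning, and the default of 10 bins are all assumptions made for the example.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Bin-weighted average gap between confidence and accuracy.

    probs:  array of shape (n_examples, n_classes), rows sum to 1.
    labels: integer class labels of shape (n_examples,).
    n_bins: number of equal-width confidence bins (an assumed default).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidences = probs.max(axis=1)       # top-class probability
    predictions = probs.argmax(axis=1)    # predicted class
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for low, high in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > low) & (confidences <= high)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap    # weight by fraction of examples in bin
    return ece


def ensemble_predict(member_probs):
    """Average class probabilities over models (axis 0).

    member_probs: array of shape (n_models, n_examples, n_classes).
    """
    return np.mean(np.asarray(member_probs, dtype=float), axis=0)
```

A model whose top-class confidence matches its accuracy in every bin has an ECE of zero. Comparing the ECE of a single model with that of the averaged prediction, on both in-distribution and shifted inputs, is one simple way to examine the kind of effect the abstract describes.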
Author Information
Jasper Snoek (Google Brain)
Yaniv Ovadia (Princeton University)
Emily Fertig (Google Research)
Balaji Lakshminarayanan (Google DeepMind)
Sebastian Nowozin (Google Research Berlin)
D. Sculley (Google Research)
Joshua Dillon (Google)
Jie Ren (Google Inc.)
Zachary Nado (Google Inc.)
More from the Same Authors
-
2021 Spotlight: Precise characterization of the prior predictive distribution of deep ReLU networks »
Lorenzo Noci · Gregor Bachmann · Kevin Roth · Sebastian Nowozin · Thomas Hofmann -
2021 : Benchmarking Bayesian Deep Learning on Diabetic Retinopathy Detection Tasks »
Neil Band · Tim G. J. Rudner · Qixuan Feng · Angelos Filos · Zachary Nado · Mike Dusenberry · Ghassen Jerfel · Dustin Tran · Yarin Gal -
2021 : Understanding and Improving Robustness of Vision Transformers through patch-based Negative Augmentation »
Yao Qin · Chiyuan Zhang · Ting Chen · Balaji Lakshminarayanan · Alex Beutel · Xuezhi Wang -
2021 : BEDS-Bench: Behavior of EHR-models under Distributional Shift - A Benchmark »
Anand Avati · Martin Seneviratne · Yuan Xue · Zhen Xu · Balaji Lakshminarayanan · Andrew Dai -
2021 : Reliable Graph Neural Networks for Drug Discovery Under Distributional Shift »
Kehang Han · Balaji Lakshminarayanan · Jeremiah Liu -
2021 : PAC^m-Bayes: Narrowing the Empirical Risk Gap in the Misspecified Bayesian Regime »
Joshua Dillon · Warren Morningstar · Alexander Alemi -
2021 : Model-embedding flows: Combining the inductive biases of model-free deep learning and explicit probabilistic modeling »
Gianluigi Silvestri · Emily Fertig · Dave Moore · Luca Ambrogioni -
2021 : Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning »
Zachary Nado · Neil Band · Mark Collier · Josip Djolonga · Mike Dusenberry · Sebastian Farquhar · Qixuan Feng · Angelos Filos · Marton Havasi · Rodolphe Jenatton · Ghassen Jerfel · Jeremiah Liu · Zelda Mariet · Jeremy Nixon · Shreyas Padhy · Jie Ren · Tim G. J. Rudner · Yeming Wen · Florian Wenzel · Kevin Murphy · D. Sculley · Balaji Lakshminarayanan · Jasper Snoek · Yarin Gal · Dustin Tran -
2021 : Deep Classifiers with Label Noise Modeling and Distance Awareness »
Vincent Fortuin · Mark Collier · Florian Wenzel · James Allingham · Jeremiah Liu · Dustin Tran · Balaji Lakshminarayanan · Jesse Berent · Rodolphe Jenatton · Effrosyni Kokiopoulou -
2022 : Out-of-Distribution Detection and Selective Generation for Conditional Language Models »
Jie Ren · Jiaming Luo · Yao Zhao · Kundan Krishna · Mohammad Saleh · Balaji Lakshminarayanan · Peter Liu -
2022 : Reliability benchmarks for image segmentation »
Estefany Kelly Buchanan · Michael Dusenberry · Jie Ren · Kevin Murphy · Balaji Lakshminarayanan · Dustin Tran -
2022 : Pushing the Accuracy-Fairness Tradeoff Frontier with Introspective Self-play »
Jeremiah Liu · Krishnamurthy Dvijotham · Jihyeon Lee · Quan Yuan · Martin Strobel · Balaji Lakshminarayanan · Deepak Ramachandran -
2022 : Contextual Squeeze-and-Excitation »
Massimiliano Patacchiola · John Bronskill · Aliaksandra Shysheya · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2022 : FiT: Parameter Efficient Few-shot Transfer Learning »
Aliaksandra Shysheya · John Bronskill · Massimiliano Patacchiola · Sebastian Nowozin · Richard Turner -
2022 : Improving Zero-shot Generalization and Robustness of Multi-modal Models »
Yunhao Ge · Jie Ren · Ming-Hsuan Yang · Yuxiao Wang · Andrew Gallagher · Hartwig Adam · Laurent Itti · Balaji Lakshminarayanan · Jiaping Zhao -
2022 : Improving the Robustness of Conditional Language Models by Detecting and Removing Input Noise »
Kundan Krishna · Yao Zhao · Jie Ren · Balaji Lakshminarayanan · Jiaming Luo · Mohammad Saleh · Peter Liu -
2023 Poster: Timewarp: Transferable Acceleration of Molecular Dynamics by Learning Time-Coarsened Dynamics »
Leon Klein · Andrew Foong · Tor Fjelde · Bruno Mlodozeniec · Marc Brockschmidt · Sebastian Nowozin · Frank Noe · Ryota Tomioka -
2023 Poster: DataPerf: Benchmarks for Data-Centric AI Development »
Mark Mazumder · Colby Banbury · Xiaozhe Yao · Bojan Karlaš · William Gaviria Rojas · Sudnya Diamos · Greg Diamos · Lynn He · Alicia Parrish · Hannah Rose Kirk · Jessica Quaye · Charvi Rastogi · Douwe Kiela · David Jurado · David Kanter · Rafael Mosquera · Will Cukierski · Juan Ciro · Lora Aroyo · Bilge Acun · Lingjiao Chen · Mehul Raje · Max Bartolo · Evan Sabri Eyuboglu · Amirata Ghorbani · Emmett Goodman · Addison Howard · Oana Inel · Tariq Kane · Christine R. Kirkpatrick · D. Sculley · Tzu-Sheng Kuo · Jonas Mueller · Tristan Thrush · Joaquin Vanschoren · Margaret Warren · Adina Williams · Serena Yeung · Newsha Ardalani · Praveen Paritosh · Ce Zhang · James Zou · Carole-Jean Wu · Cody Coleman · Andrew Ng · Peter Mattson · Vijay Janapa Reddi -
2022 : Benchmarking Training Algorithms by Zachary Nado »
Zachary Nado -
2022 Workshop: Has it Trained Yet? A Workshop for Algorithmic Efficiency in Practical Neural Network Training »
Frank Schneider · Zachary Nado · Philipp Hennig · George Dahl · Naman Agarwal -
2022 Poster: Understanding and Improving Robustness of Vision Transformers through Patch-based Negative Augmentation »
Yao Qin · Chiyuan Zhang · Ting Chen · Balaji Lakshminarayanan · Alex Beutel · Xuezhi Wang -
2022 Poster: Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification »
Massimiliano Patacchiola · John Bronskill · Aliaksandra Shysheya · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2021 : Technical Debt in ML: A Data-Centric View »
D. Sculley -
2021 Poster: Exploring the Limits of Out-of-Distribution Detection »
Stanislav Fort · Jie Ren · Balaji Lakshminarayanan -
2021 Poster: Precise characterization of the prior predictive distribution of deep ReLU networks »
Lorenzo Noci · Gregor Bachmann · Kevin Roth · Sebastian Nowozin · Thomas Hofmann -
2021 Poster: Soft Calibration Objectives for Neural Networks »
Archit Karandikar · Nicholas Cain · Dustin Tran · Balaji Lakshminarayanan · Jonathon Shlens · Michael Mozer · Becca Roelofs -
2021 Poster: Disentangling the Roles of Curation, Data-Augmentation and the Prior in the Cold Posterior Effect »
Lorenzo Noci · Kevin Roth · Gregor Bachmann · Sebastian Nowozin · Thomas Hofmann -
2021 Poster: Memory Efficient Meta-Learning with Large Images »
John Bronskill · Daniela Massiceti · Massimiliano Patacchiola · Katja Hofmann · Sebastian Nowozin · Richard Turner -
2020 Poster: Bayesian Deep Ensembles via the Neural Tangent Kernel »
Bobby He · Balaji Lakshminarayanan · Yee Whye Teh -
2020 Poster: Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness »
Jeremiah Liu · Zi Lin · Shreyas Padhy · Dustin Tran · Tania Bedrax Weiss · Balaji Lakshminarayanan -
2020 Tutorial: (Track2) Practical Uncertainty Estimation and Out-of-Distribution Robustness in Deep Learning Q&A »
Dustin Tran · Balaji Lakshminarayanan · Jasper Snoek -
2020 Tutorial: (Track2) Practical Uncertainty Estimation and Out-of-Distribution Robustness in Deep Learning »
Dustin Tran · Balaji Lakshminarayanan · Jasper Snoek -
2019 : Coffee Break and Poster Session »
Rameswar Panda · Prasanna Sattigeri · Kush Varshney · Karthikeyan Natesan Ramamurthy · Harvineet Singh · Vishwali Mhasawade · Shalmali Joshi · Laleh Seyyed-Kalantari · Matthew McDermott · Gal Yona · James Atwood · Hansa Srinivasan · Yonatan Halpern · D. Sculley · Behrouz Babaki · Margarida Carvalho · Josie Williams · Narges Razavian · Haoran Zhang · Amy Lu · Irene Y Chen · Xiaojie Mao · Angela Zhou · Nathan Kallus -
2019 Workshop: Program Transformations for ML »
Pascal Lamblin · Atilim Gunes Baydin · Alexander Wiltschko · Bart van Merriënboer · Emily Fertig · Barak Pearlmutter · David Duvenaud · Laurent Hascoet -
2019 Workshop: Learning Meaningful Representations of Life »
Elizabeth Wood · Yakir Reshef · Jonathan Bloom · Jasper Snoek · Barbara Engelhardt · Scott Linderman · Suchi Saria · Alexander Wiltschko · Casey Greene · Chang Liu · Kresten Lindorff-Larsen · Debora Marks -
2019 Poster: Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model »
Guodong Zhang · Lala Li · Zachary Nado · James Martens · Sushant Sachdeva · George Dahl · Chris Shallue · Roger Grosse -
2019 Poster: Icebreaker: Element-wise Efficient Information Acquisition with a Bayesian Deep Latent Gaussian Model »
Wenbo Gong · Sebastian Tschiatschek · Sebastian Nowozin · Richard Turner · José Miguel Hernández-Lobato · Cheng Zhang -
2019 Poster: Fast and Flexible Multi-Task Classification using Conditional Neural Adaptive Processes »
James Requeima · Jonathan Gordon · John Bronskill · Sebastian Nowozin · Richard Turner -
2019 Spotlight: Fast and Flexible Multi-Task Classification using Conditional Neural Adaptive Processes »
James Requeima · Jonathan Gordon · John Bronskill · Sebastian Nowozin · Richard Turner -
2019 Poster: Likelihood Ratios for Out-of-Distribution Detection »
Jie Ren · Peter Liu · Emily Fertig · Jasper Snoek · Ryan Poplin · Mark Depristo · Joshua Dillon · Balaji Lakshminarayanan -
2019 Poster: DppNet: Approximating Determinantal Point Processes with Deep Networks »
Zelda Mariet · Yaniv Ovadia · Jasper Snoek -
2018 : On Avoiding Tragedy of the Commons in the Peer Review Process »
D. Sculley -
2018 : Sebastian Nowozin »
Sebastian Nowozin -
2018 : TBC 8 »
Balaji Lakshminarayanan -
2018 : Poster Session 1 (note there are numerous missing names here, all papers appear in all poster sessions) »
Akhilesh Gotmare · Kenneth Holstein · Jan Brabec · Michal Uricar · Kaleigh Clary · Cynthia Rudin · Sam Witty · Andrew Ross · Shayne O'Brien · Babak Esmaeili · Jessica Forde · Massimo Caccia · Ali Emami · Scott Jordan · Bronwyn Woods · D. Sculley · Rebekah Overdorf · Nicolas Le Roux · Peter Henderson · Brandon Yang · Tzu-Yu Liu · David Jensen · Niccolo Dalmasso · Weitang Liu · Paul Marc TRICHELAIR · Jun Ki Lee · Akanksha Atrey · Matt Groh · Yotam Hechtlinger · Emma Tosch -
2018 : InclusiveImages: Competitor Presentations »
Yonatan Halpern · Pallavi Baljekar · D. Sculley · Pavel Ostyakov · Nawazuddin Mohammed · Weimin Wang · David Austin -
2018 Workshop: Smooth Games Optimization and Machine Learning »
Simon Lacoste-Julien · Ioannis Mitliagkas · Gauthier Gidel · Vasilis Syrgkanis · Eva Tardos · Leon Bottou · Sebastian Nowozin -
2018 Workshop: Workshop on Ethical, Social and Governance Issues in AI »
Chloe Bakalar · Sarah Bird · Tiberio Caetano · Edward W Felten · Dario Garcia · Isabel Kloumann · Finnian Lattimore · Sendhil Mullainathan · D. Sculley -
2017 Workshop: Machine Learning for Health (ML4H) - What Parts of Healthcare are Ripe for Disruption by Machine Learning Right Now? »
Jason Fries · Alex Wiltschko · Andrew Beam · Isaac S Kohane · Jasper Snoek · Peter Schulam · Madalina Fiterau · David Kale · Rajesh Ranganath · Bruno Jedynak · Michael Hughes · Tristan Naumann · Natalia Antropova · Adrian Dalca · SHUBHI ASTHANA · Prateek Tandon · Jaz Kandola · Uri Shalit · Marzyeh Ghassemi · Tim Althoff · Alexander Ratner · Jumana Dakka -
2017 Poster: The Numerics of GANs »
Lars Mescheder · Sebastian Nowozin · Andreas Geiger -
2017 Poster: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles »
Balaji Lakshminarayanan · Alexander Pritzel · Charles Blundell -
2017 Spotlight: Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles »
Balaji Lakshminarayanan · Alexander Pritzel · Charles Blundell -
2017 Spotlight: The Numerics of GANs »
Lars Mescheder · Sebastian Nowozin · Andreas Geiger -
2017 Poster: Stabilizing Training of Generative Adversarial Networks through Regularization »
Kevin Roth · Aurelien Lucchi · Sebastian Nowozin · Thomas Hofmann -
2016 : TensorFlow Debugger: Debugging Dataflow Graphs for Machine Learning »
D. Sculley -
2016 : Discussion panel »
Ian Goodfellow · Soumith Chintala · Arthur Gretton · Sebastian Nowozin · Aaron Courville · Yann LeCun · Emily Denton -
2016 : Training Generative Neural Samplers using Variational Divergence »
Sebastian Nowozin -
2016 : What's your ML Test Score? A rubric for ML production systems »
D. Sculley -
2016 Poster: f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization »
Sebastian Nowozin · Botond Cseke · Ryota Tomioka -
2016 Poster: DISCO Nets : DISsimilarity COefficients Networks »
Diane Bouchacourt · Pawan K Mudigonda · Sebastian Nowozin -
2015 : Mondrian Forests for Large-Scale regression when uncertainty matters »
Balaji Lakshminarayanan -
2015 Poster: Spectral Representations for Convolutional Neural Networks »
Oren Rippel · Jasper Snoek · Ryan Adams -
2015 Poster: Hidden Technical Debt in Machine Learning Systems »
D. Sculley · Gary Holt · Daniel Golovin · Eugene Davydov · Todd Phillips · Dietmar Ebner · Vinay Chaudhary · Michael Young · Jean-François Crespo · Dan Dennison -
2014 Workshop: Discrete Optimization in Machine Learning »
Jeffrey A Bilmes · Andreas Krause · Stefanie Jegelka · S Thomas McCormick · Sebastian Nowozin · Yaron Singer · Dhruv Batra · Volkan Cevher -
2014 Workshop: Bayesian Optimization in Academia and Industry »
Zoubin Ghahramani · Ryan Adams · Matthew Hoffman · Kevin Swersky · Jasper Snoek -
2014 Poster: Distributed Bayesian Posterior Sampling via Moment Sharing »
Minjie Xu · Balaji Lakshminarayanan · Yee Whye Teh · Jun Zhu · Bo Zhang -
2014 Session: Oral Session 2 »
D. Sculley -
2014 Poster: Mondrian Forests: Efficient Online Random Forests »
Balaji Lakshminarayanan · Daniel Roy · Yee Whye Teh -
2013 Workshop: Bayesian Optimization in Theory and Practice »
Matthew Hoffman · Jasper Snoek · Nando de Freitas · Michael A Osborne · Ryan Adams · Sebastien Bubeck · Philipp Hennig · Remi Munos · Andreas Krause -
2013 Poster: Decision Jungles: Compact and Rich Models for Classification »
Jamie Shotton · Toby Sharp · Pushmeet Kohli · Sebastian Nowozin · John Winn · Antonio Criminisi -
2013 Poster: Multi-Task Bayesian Optimization »
Kevin Swersky · Jasper Snoek · Ryan Adams -
2013 Poster: A Determinantal Point Process Latent Variable Model for Inhibition in Neural Spiking Data »
Jasper Snoek · Richard Zemel · Ryan Adams -
2012 Poster: Practical Bayesian Optimization of Machine Learning Algorithms »
Jasper Snoek · Hugo Larochelle · Ryan Adams -
2011 Workshop: Optimization for Machine Learning »
Suvrit Sra · Stephen Wright · Sebastian Nowozin -
2011 Poster: Higher-Order Correlation Clustering for Image Segmentation »
Sungwoong Kim · Sebastian Nowozin · Pushmeet Kohli · Chang D. D Yoo -
2010 Workshop: Optimization for Machine Learning »
Suvrit Sra · Sebastian Nowozin · Stephen Wright -
2009 Workshop: Optimization for Machine Learning »
Sebastian Nowozin · Suvrit Sra · S.V.N Vishwanathan · Stephen Wright -
2008 Workshop: Optimization for Machine Learning »
Suvrit Sra · Sebastian Nowozin · Vishwanathan S V N