Workshop
(Almost) 50 shades of Bayesian Learning: PAC-Bayesian trends and insights
Benjamin Guedj · Pascal Germain · Francis Bach

Sat Dec 9th 08:00 AM -- 06:30 PM @ 101 A
Event URL: https://bguedj.github.io/nips2017/50shadesbayesian.html

The industry-wide successes of machine learning at the dawn of the (so-called) big data era have led to an increasing gap between practitioners and theoreticians. The former use off-the-shelf statistical and machine learning methods, while the latter design such algorithms and study their mathematical properties. The tradeoff between those two movements is somewhat addressed in Bayesian research, where sound mathematical guarantees often meet efficient implementation and provide model selection criteria. In the late 90s, a new paradigm emerged in the statistical learning community, used to derive probably approximately correct (PAC) bounds on Bayesian-flavored estimators. This PAC-Bayesian theory was pioneered by Shawe-Taylor and Williamson (1997) and McAllester (1998, 1999). It was extensively formalized by Catoni (2004, 2007) and has triggered, slowly but surely, increasing research efforts over the last decades.

We believe it is time to pinpoint the current PAC-Bayesian trends relative to other modern approaches in the (statistical) machine learning community. Indeed, we observe that, while the field grows on its own, it has taken some undesirable distance from related areas. Firstly, it seems to us that the relation to Bayesian methods has been forsaken in numerous works, despite the potential of PAC-Bayesian theory to bring new insights to the Bayesian community and to go beyond the classical Bayesian/frequentist divide. Secondly, PAC-Bayesian methods share similarities with other quasi-Bayesian (or pseudo-Bayesian) methods that study Bayesian practices from a frequentist standpoint, such as the Minimum Description Length (MDL) principle (Grünwald, 2007). Last but not least, even though some practical, theoretically grounded learning algorithms have emerged from PAC-Bayesian works, they are almost unused for real-world problems.

In short, this workshop aims to gather statisticians and machine learning researchers to discuss current trends and the future of {PAC,quasi}-Bayesian learning. From a broader perspective, we aim to bridge the gap between several communities that can all benefit from sharper statistical guarantees and sound theory-driven learning algorithms.

References
[1] J. Shawe-Taylor and R. Williamson. A PAC analysis of a Bayes estimator. In Proceedings of COLT, 1997.
[2] D. A. McAllester. Some PAC-Bayesian theorems. In Proceedings of COLT, 1998.
[3] D. A. McAllester. PAC-Bayesian model averaging. In Proceedings of COLT, 1999.
[4] O. Catoni. Statistical Learning Theory and Stochastic Optimization. Saint-Flour Summer School on Probability Theory 2001 (Jean Picard ed.), Lecture Notes in Mathematics. Springer, 2004.
[5] O. Catoni. PAC-Bayesian supervised classification: the thermodynamics of statistical learning. Institute of Mathematical Statistics Lecture Notes—Monograph Series, 56. Institute of Mathematical Statistics, 2007.
[6] P. D. Grünwald. The Minimum Description Length Principle. The MIT Press, 2007.

08:30 AM François Laviolette - A Tutorial on PAC-Bayesian Theory (Talk) Francois Laviolette
08:30 AM Overture (Opening remarks) Benjamin Guedj, Francis Bach, Pascal Germain
09:30 AM Peter Grünwald - A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity (Talk) Peter Grünwald
11:00 AM Jean-Michel Marin - Some recent advances on Approximate Bayesian Computation techniques (Talk) Jean-Michel Marin
11:45 AM Contributed talk 1 - A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks (Talk) Behnam Neyshabur
02:00 PM Olivier Catoni - Dimension-free PAC-Bayesian Bounds (Talk) Olivier Catoni
02:40 PM Contributed talk 2 - Dimension free PAC-Bayesian bounds for the estimation of the mean of a random vector (Talk) Olivier Catoni
03:30 PM Yevgeny Seldin - A Strongly Quasiconvex PAC-Bayesian Bound (Talk) Yevgeny Seldin
04:15 PM John Shawe-Taylor - Distribution Dependent Priors for Stable Learning (Talk) John Shawe-Taylor
05:00 PM Daniel Roy - Deep Neural Networks: From Flat Minima to Numerically Nonvacuous Generalization Bounds via PAC-Bayes (Talk) Dan Roy
05:30 PM Neil Lawrence, Francis Bach and François Laviolette (Discussion) Neil Lawrence, Francis Bach, Francois Laviolette
06:25 PM Concluding remarks Francis Bach, Benjamin Guedj, Pascal Germain

Author Information

Benjamin Guedj (Inria & University College London)

Benjamin Guedj has been a tenured research scientist at Inria since 2014 and is a member of the MODAL project-team (MOdels for Data Analysis and Learning) of the Lille - Nord Europe research centre in France. He is also affiliated with the mathematics department of the University of Lille. He obtained his Ph.D. in mathematics in 2013 from UPMC (Université Pierre & Marie Curie, France) under the supervision of Gérard Biau and Éric Moulines. Prior to that, he was a research assistant at DTU Compute (Denmark). His main line of research is statistical machine learning, from both theoretical and algorithmic perspectives. He is primarily interested in the design, analysis and implementation of statistical machine learning methods for high-dimensional problems, mainly using PAC-Bayesian theory.

Pascal Germain (INRIA Paris)
Francis Bach (Inria)

Francis Bach is a researcher at INRIA, where he has led the SIERRA project-team since 2011. The team is part of the Computer Science Department at Ecole Normale Supérieure in Paris, France. After completing his Ph.D. in Computer Science at U.C. Berkeley, he spent two years at Ecole des Mines, and joined INRIA and Ecole Normale Supérieure in 2007. He is interested in statistical machine learning, and especially in convex optimization, combinatorial optimization, sparse methods, kernel-based learning, vision and signal processing. He has given numerous courses on optimization at summer schools in the last few years. He was program co-chair for the International Conference on Machine Learning in 2015.
