Timezone: »

 
Poster
PAC-Bayes Un-Expected Bernstein Inequality
Zakaria Mhammedi · Peter Grünwald · Benjamin Guedj

Thu Dec 12 05:00 PM -- 07:00 PM (PST) @ East Exhibition Hall B + C #234
We present a new PAC-Bayesian generalization bound. Standard bounds contain a $\sqrt{L_n \cdot \KL/n}$ complexity term which dominates unless $L_n$, the empirical error of the learning algorithm's randomized predictions, vanishes. We manage to replace $L_n$ by a term which vanishes in many more situations, essentially whenever the employed learning algorithm is sufficiently stable on the dataset at hand. Our new bound consistently beats state-of-the-art bounds both on a toy example and on UCI datasets (with large enough $n$). Theoretically, unlike existing bounds, our new bound can be expected to converge to $0$ faster whenever a Bernstein/Tsybakov condition holds, thus connecting PAC-Bayesian generalization and {\em excess risk\/} bounds---for the latter it has long been known that faster convergence can be obtained under Bernstein conditions. Our main technical tool is a new concentration inequality which is like Bernstein's but with $X^2$ taken outside its expectation.

Author Information

Zakaria Mhammedi (The Australian National University)
Peter Grünwald (CWI and Leiden University)
Benjamin Guedj (Inria & University College London)

Benjamin Guedj is a tenured research scientist at Inria since 2014, member of the MODAL project-team (MOdels for Data Analysis and Learning) of the Lille - Nord Europe research centre in France. He is also affiliated with the mathematics department of the University of Lille. He obtained his Ph.D. in mathematics in 2013 from UPMC (Université Pierre & Marie Curie, France) under the supervision of Gérard Biau and Éric Moulines. Prior to that, he was a research assistant at DTU Compute (Denmark). His main line of research is in statistical machine learning, both from theoretical and algorithmic perspectives. He is primarily interested in the design, analysis and implementation of statistical machine learning methods for high dimensional problems, mainly using the PAC-Bayesian theory.

More from the Same Authors