Poster
On Margins and Generalisation for Voting Classifiers
Felix Biggs · Valentina Zantedeschi · Benjamin Guedj

Wed Nov 30 09:00 AM -- 11:00 AM (PST) @ Hall J #935

We study the generalisation properties of majority voting on finite ensembles of classifiers, proving margin-based generalisation bounds via the PAC-Bayes theory. These provide state-of-the-art guarantees on a number of classification tasks. Our central results leverage the Dirichlet posteriors studied recently by Zantedeschi et al. (2021) for training voting classifiers; in contrast to that work our bounds apply to non-randomised votes via the use of margins. Our contributions add perspective to the debate on the "margins theory" proposed by Schapire et al. (1998) for the generalisation of ensemble classifiers.
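To make the central quantity concrete, here is a minimal sketch of the margin of a weighted majority vote and its empirical margin error. It is an illustration only, not the authors' method or bound. It assumes binary labels in {-1, +1}, and the names `vote_margins` and `margin_error`, the toy predictions, and the weights are all hypothetical.

```python
def vote_margins(predictions, labels, weights):
    """Margin of a weighted majority vote on each example.

    predictions[i][j] is the {-1, +1} output of voter i on example j.
    The margin is y * (normalised weighted vote), so it lies in [-1, 1];
    a positive margin means the non-randomised vote is correct.
    """
    total = sum(weights)
    margins = []
    for j, y in enumerate(labels):
        score = sum(w * preds[j] for w, preds in zip(weights, predictions)) / total
        margins.append(y * score)
    return margins


def margin_error(margins, gamma=0.0):
    """Fraction of examples whose margin is at most gamma."""
    return sum(m <= gamma for m in margins) / len(margins)


# Toy data (hypothetical): three voters, four examples.
predictions = [
    [+1, +1, -1, +1],
    [+1, -1, -1, -1],
    [-1, +1, -1, +1],
]
labels = [+1, +1, -1, -1]
weights = [0.5, 0.25, 0.25]  # a point on the simplex

margins = vote_margins(predictions, labels, weights)
print(margins)                     # [0.5, 0.5, 1.0, -0.5]
print(margin_error(margins, 0.0))  # 0.25
print(margin_error(margins, 0.5))  # 0.75
```

With `gamma = 0` the margin error is the ordinary error of the deterministic vote; raising `gamma` also counts examples that are classified correctly but with low confidence, which is the kind of quantity margin-based bounds are stated in terms of.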

#### Author Information

##### Felix Biggs (University College London)

PhD student with Benjamin Guedj, focusing on PAC-Bayes and its application to neural networks.

##### Benjamin Guedj (Inria & University College London)

Benjamin Guedj has been a tenured research scientist at Inria since 2014, affiliated with the Lille - Nord Europe research centre in France. He is also affiliated with the mathematics department of the University of Lille. Since 2018, he has been a Principal Research Fellow at the Centre for Artificial Intelligence and Department of Computer Science at University College London. He is also a visiting researcher at The Alan Turing Institute. Since 2020, he has been the founder and scientific director of The Inria London Programme, a strategic partnership between Inria and UCL as part of a France-UK scientific initiative. He obtained his Ph.D. in mathematics in 2013 from UPMC (Université Pierre & Marie Curie, France) under the supervision of Gérard Biau and Éric Moulines. Prior to that, he was a research assistant at DTU Compute (Denmark). His main line of research is in statistical machine learning, from both theoretical and algorithmic perspectives. He is primarily interested in the design, analysis and implementation of statistical machine learning methods for high-dimensional problems, mainly using PAC-Bayesian theory.