Kernel Stein Tests for Multiple Model Comparison
Jen Ning Lim · Makoto Yamada · Bernhard Schölkopf · Wittawat Jitkrittum

Tue Dec 10th 10:45 AM -- 12:45 PM @ East Exhibition Hall B + C #60
We address the problem of non-parametric multiple model comparison: given $l$ candidate models, decide whether each candidate is as good as the best one(s) or worse. We propose two statistical tests, each controlling a different notion of decision error. The first test, building on the post-selection inference framework, provably controls the number of best models that are wrongly declared worse (false positive rate). The second test is based on multiple testing correction, and controls the proportion of models declared worse that are in fact as good as the best (false discovery rate). We prove that under appropriate conditions the first test can yield a higher true positive rate than the second. Experimental results on toy and real (CelebA, Chicago Crime data) problems show that the two tests have high true positive rates with well-controlled error rates. By contrast, the naive approach of choosing the model with the lowest score without correction leads to more false positives.
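To make the idea of false discovery rate control concrete, here is a minimal sketch of the Benjamini-Hochberg step-up procedure applied to per-model p-values. This is an illustration only: the abstract does not say which correction the second test uses, and the function name `benjamini_hochberg`, the level `alpha`, and the p-values below are all hypothetical.

```python
import numpy as np


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of rejected hypotheses (illustrative BH procedure).

    Assumption: each p-value tests "this candidate is as good as the best";
    rejecting means the candidate is declared worse.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices of p-values, ascending
    thresholds = alpha * np.arange(1, m + 1) / m   # BH thresholds alpha * k / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()             # largest rank passing its threshold
        reject[order[: k + 1]] = True              # reject everything up to that rank
    return reject


# Hypothetical p-values for five candidate models.
p_vals = [0.001, 0.20, 0.012, 0.74, 0.045]

declared_worse = benjamini_hochberg(p_vals, alpha=0.05)
naive_worse = np.asarray(p_vals) < 0.05            # uncorrected per-model thresholding

print(declared_worse.tolist())  # [True, False, True, False, False]
print(naive_worse.tolist())     # [True, False, True, False, True]
```

In this made-up example the uncorrected rule declares one more model worse than the corrected procedure does, which is the kind of extra false positive the abstract attributes to the naive approach.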

Author Information

Jen Ning Lim (University College London)
Makoto Yamada (Kyoto University / RIKEN AIP)
Bernhard Schölkopf (MPI for Intelligent Systems)

Bernhard Schölkopf received degrees in mathematics (London) and physics (Tübingen), and a doctorate in computer science from the Technical University Berlin. He has researched at AT&T Bell Labs, at GMD FIRST, Berlin, at the Australian National University, Canberra, and at Microsoft Research Cambridge (UK). In 2001, he was appointed scientific member of the Max Planck Society and director at the MPI for Biological Cybernetics; in 2010 he founded the Max Planck Institute for Intelligent Systems. For further information, see

Wittawat Jitkrittum (Max Planck Institute for Intelligent Systems)

Wittawat Jitkrittum is a postdoctoral researcher at the Max Planck Institute for Intelligent Systems, Germany. He earned his PhD from the Gatsby Unit, University College London, with a thesis on informative features for comparing distributions. He received a best paper award at NeurIPS 2017 and the ELLIS PhD award 2019 for outstanding dissertation. Wittawat has broad research interests covering kernel methods, deep generative models, and approximate Bayesian inference. He served as a publication chair for AISTATS 2016 and as a program committee member for NeurIPS, ICML, and AISTATS, among others. He is a co-organizer of the first Southeast Asia Machine Learning School (SEAMLS 2019) in Indonesia and of the first Machine Learning Research School (MLRS 2019) in Thailand.

More from the Same Authors