Optimal kernel choice for large-scale two-sample tests
Arthur Gretton · Bharath Sriperumbudur · Dino Sejdinovic · Heiko Strathmann · Sivaraman Balakrishnan · Massimiliano Pontil · Kenji Fukumizu

Mon Dec 3rd 07:00 PM -- 12:00 AM @ Harrah’s Special Events Center 2nd Floor #None
Abstract Given samples from distributions $p$ and $q$, a two-sample test determines whether to reject the null hypothesis that $p=q$, based on the value of a test statistic measuring the distance between the samples. One choice of test statistic is the maximum mean discrepancy (MMD), which is a distance between embeddings of the probability distributions in a reproducing kernel Hilbert space. The kernel used in obtaining these embeddings is thus critical in ensuring the test has high power, and correctly distinguishes unlike distributions with high probability. A means of parameter selection for the two-sample test based on the MMD is proposed. For a given test level (an upper bound on the probability of making a Type I error), the kernel is chosen so as to maximize the test power, and minimize the probability of making a Type II error. The test statistic, test threshold, and optimization over the kernel parameters are obtained with cost linear in the sample size. These properties make the kernel selection and test procedures suited to data streams, where the observations cannot all be stored in memory. In experiments, the new kernel selection approach yields a more powerful test than earlier kernel selection heuristics.

Author Information

Arthur Gretton (Gatsby Unit, UCL)

Arthur Gretton is a Professor with the Gatsby Computational Neuroscience Unit at UCL. He received degrees in Physics and Systems Engineering from the Australian National University, and a PhD with Microsoft Research and the Signal Processing and Communications Laboratory at the University of Cambridge. He previously worked at the MPI for Biological Cybernetics, and at the Machine Learning Department, Carnegie Mellon University. Arthur's recent research interests in machine learning include the design and training of generative models, both implicit (e.g. GANs) and explicit (high/infinite dimensional exponential family models), nonparametric hypothesis testing, and kernel methods. He has been an associate editor at IEEE Transactions on Pattern Analysis and Machine Intelligence from 2009 to 2013, an Action Editor for JMLR since April 2013, an Area Chair for NeurIPS in 2008 and 2009, a Senior Area Chair for NeurIPS in 2018, an Area Chair for ICML in 2011 and 2012, and a member of the COLT Program Committee in 2013. Arthur was program chair for AISTATS in 2016 (with Christian Robert), tutorials chair for ICML 2018 (with Ruslan Salakhutdinov), workshops chair for ICML 2019 (with Honglak Lee), program chair for the Dali workshop in 2019 (with Krikamol Muandet and Shakir Mohammed), and co-organsier of the Machine Learning Summer School 2019 in London (with Marc Deisenroth).

Bharath Sriperumbudur (The Pennsylvania State University)
Dino Sejdinovic (University of Oxford)
Heiko Strathmann (UCL)
Sivaraman Balakrishnan (CMU)
Massimiliano Pontil (IIT & UCL)
Kenji Fukumizu (Institute of Statistical Mathematics / Preferred Networks / RIKEN AIP)

More from the Same Authors