NeurIPS Poster Kernel-Based Tests for Likelihood-Free Hypothesis Testing


Patrik Robert Gerber · Tianze Jiang · Yury Polyanskiy · Rui Sun

Great Hall & Hall B1+B2 (level 1) #1008


Abstract: Given $n$ observations from two balanced classes, consider the task of labeling an additional $m$ inputs that are known to all belong to one of the two classes. Special cases of this problem are well-known: with complete knowledge of class distributions ($n=\infty$) the problem is solved optimally by the likelihood-ratio test; when $m=1$ it corresponds to binary classification; and when $m\approx n$ it is equivalent to two-sample testing. The intermediate settings occur in the field of likelihood-free inference, where labeled samples are obtained by running forward simulations and the unlabeled sample is collected experimentally. In recent work it was discovered that there is a fundamental trade-off between $m$ and $n$: increasing the data sample $m$ reduces the amount $n$ of training/simulation data needed. In this work we (a) introduce a generalization where unlabeled samples come from a mixture of the two classes, a case often encountered in practice; (b) study the minimax sample complexity for non-parametric classes of densities under maximum mean discrepancy (MMD) separation; and (c) investigate the empirical performance of kernels parameterized by neural networks on two tasks: detection of the Higgs boson and detection of planted DDPM generated images amidst CIFAR-10 images. For both problems we confirm the existence of the theoretically predicted asymmetric $m$ vs $n$ trade-off.
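To make the setting concrete, the following is a minimal sketch of an MMD-based rule for labeling an unlabeled batch. It is an illustration under stated assumptions, not the test statistic from the paper. The fixed Gaussian kernel, the bandwidth, the biased (V-statistic) estimator, the synthetic Gaussian data, and the helper names `gaussian_kernel`, `mmd2_biased`, and `label_unlabeled_batch` are all hypothetical choices made here. The abstract's experiments use kernels parameterized by neural networks instead.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    kxx = gaussian_kernel(x, x, bandwidth).mean()
    kyy = gaussian_kernel(y, y, bandwidth).mean()
    kxy = gaussian_kernel(x, y, bandwidth).mean()
    return kxx + kyy - 2.0 * kxy


def label_unlabeled_batch(x0, x1, z, bandwidth=1.0):
    """Return 0 or 1: the class whose simulated sample is closer to z in MMD."""
    d0 = mmd2_biased(x0, z, bandwidth)
    d1 = mmd2_biased(x1, z, bandwidth)
    return 0 if d0 <= d1 else 1


# Synthetic stand-in for simulations (assumption): two Gaussian classes.
rng = np.random.default_rng(0)
n, m, dim = 500, 100, 2                    # n simulated points per class, m unlabeled
x0 = rng.normal(0.0, 1.0, size=(n, dim))   # labeled sample from class 0
x1 = rng.normal(1.0, 1.0, size=(n, dim))   # labeled sample from class 1
z = rng.normal(1.0, 1.0, size=(m, dim))    # unlabeled batch, drawn from class 1
decision = label_unlabeled_batch(x0, x1, z)
```

Varying `n` and `m` in this toy setup is one way to get a feel for the trade-off the abstract describes, although the toy data says nothing about the minimax rates studied in the paper.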
