NeurIPS 2021 Separation Results between Fixed-Kernel and Feature-Learning Probability Metrics Oral

Carles Domingo i Enrich · Youssef Mroueh

Oral Session 1: Theory

Abstract: Several works in implicit and explicit generative modeling have empirically observed that feature-learning discriminators outperform fixed-kernel discriminators in terms of the sample quality of the models. We provide separation results between probability metrics with fixed-kernel and feature-learning discriminators using the function classes $\mathcal{F}_2$ and $\mathcal{F}_1$ respectively, which were developed to study overparametrized two-layer neural networks. In particular, we construct pairs of distributions over hyper-spheres that cannot be discriminated by the fixed-kernel ($\mathcal{F}_2$) integral probability metric (IPM) and Stein discrepancy (SD) in high dimensions, but that can be discriminated by their feature-learning ($\mathcal{F}_1$) counterparts. To further study the separation, we provide links between the $\mathcal{F}_1$ and $\mathcal{F}_2$ IPMs and sliced Wasserstein distances. Our work suggests that fixed-kernel discriminators perform worse than their feature-learning counterparts because their corresponding metrics are weaker.
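To make the contrast between the two metrics concrete, here is a rough numerical sketch, not the authors' code. It assumes ReLU features $\sigma(\langle \theta, x \rangle)$ with $\theta$ on the unit sphere and a uniform base measure over $\theta$; the abstract does not fix these choices. Under those assumptions, the $\mathcal{F}_1$ IPM is the largest gap in mean feature value over directions $\theta$, while the $\mathcal{F}_2$ IPM is the root-mean-square gap over $\theta$. The function names `sample_sphere` and `ipm_estimates` are hypothetical, and the maximum over randomly drawn directions only lower-bounds the true supremum.

```python
import numpy as np


def sample_sphere(n, d, rng):
    """Draw n points uniformly from the unit sphere in R^d."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def ipm_estimates(x, y, n_dirs=2000, seed=0):
    """Monte Carlo estimates of F1-type and F2-type IPMs between samples x, y.

    Assumptions (not taken from the abstract): ReLU activation and
    directions theta drawn uniformly from the unit sphere.
    """
    rng = np.random.default_rng(seed)
    theta = sample_sphere(n_dirs, x.shape[1], rng)
    # Mean ReLU feature under each sample set, one value per direction.
    mean_x = np.maximum(x @ theta.T, 0.0).mean(axis=0)
    mean_y = np.maximum(y @ theta.T, 0.0).mean(axis=0)
    gap = mean_x - mean_y
    # Feature learning: pick the single most discriminating direction
    # (max over sampled directions, a lower bound on the supremum).
    f1 = np.abs(gap).max()
    # Fixed kernel: average the squared gap over all directions.
    f2 = np.sqrt(np.mean(gap ** 2))
    return f1, f2


# Illustrative usage: uniform samples versus samples pushed toward one axis.
rng = np.random.default_rng(0)
d = 20
x = sample_sphere(1000, d, rng)
shifted = sample_sphere(1000, d, rng) + 0.5 * np.eye(d)[0]
y = shifted / np.linalg.norm(shifted, axis=1, keepdims=True)
f1, f2 = ipm_estimates(x, y)
print(f"F1-type estimate: {f1:.4f}, F2-type estimate: {f2:.4f}")
```

By construction the first estimate is never smaller than the second, since a maximum dominates a root mean square. The separation results in the abstract concern how much smaller the fixed-kernel quantity can become in high dimensions, which this toy example does not attempt to reproduce.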
