On Divergence Measures for Bayesian Pseudocoresets
Balhae Kim · Jungwon Choi · Seanie Lee · Yoonho Lee · Jung-Woo Ha · Juho Lee

Tue Nov 29 02:00 PM -- 04:00 PM (PST) @ Hall J #710

A Bayesian pseudocoreset is a small synthetic dataset for which the posterior over parameters approximates that of the original dataset. While promising, Bayesian pseudocoresets have not yet been validated on large-scale problems such as image classification with deep neural networks. On the other hand, dataset distillation methods similarly construct a small dataset such that optimization on the synthetic dataset converges to a solution similar to that obtained with the full data. Although dataset distillation has been empirically verified in large-scale settings, the framework is restricted to point estimates, and its adaptation to Bayesian inference has not been explored. This paper casts two representative dataset distillation algorithms as approximations to methods for constructing pseudocoresets by minimizing specific divergence measures: reverse KL divergence and Wasserstein distance. Furthermore, we provide a unifying view of such divergence measures in Bayesian pseudocoreset construction. Finally, we propose a novel Bayesian pseudocoreset algorithm based on minimizing forward KL divergence. Our empirical results demonstrate that the pseudocoresets constructed from these methods reflect the true posterior even in large-scale Bayesian inference problems.
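
To make the forward KL objective concrete: writing pi_x for the posterior given the full data and pi_u for the posterior given the pseudocoreset u, the gradient of D_KL(pi_x || pi_u) with respect to u reduces to E_{pi_u}[grad_u log p(u | theta)] - E_{pi_x}[grad_u log p(u | theta)], a difference of expected pseudo-likelihood gradients under the two posteriors. The sketch below is a minimal Monte Carlo illustration of that update on a toy conjugate Gaussian model, not the paper's actual algorithm; the function names (log_lik, forward_kl_grad), the N/M tempering of the pseudo-likelihood, and the use of exact conjugate posterior samples are all assumptions made for the example.

```python
import torch

# Illustrative sketch: forward-KL pseudocoreset update for a toy model with
# likelihood p(x | theta) = N(x; theta, 1) and prior theta ~ N(0, 1), so both
# posteriors are conjugate Gaussians we can sample exactly. (Assumed setup,
# not the paper's method.)

def log_lik(theta, data):
    # Gaussian log-likelihood, up to an additive constant.
    return -0.5 * ((data - theta) ** 2).sum()

def forward_kl_grad(u, thetas_full, thetas_pseudo):
    # Monte Carlo estimate of grad_u KL(pi_x || pi_u), i.e.
    # E_{pi_u}[grad_u log_lik] - E_{pi_x}[grad_u log_lik]
    # (up to the constant tempering factor, absorbed into the step size).
    u = u.detach().requires_grad_(True)
    ll_pseudo = torch.stack([log_lik(t, u) for t in thetas_pseudo]).mean()
    ll_full = torch.stack([log_lik(t, u) for t in thetas_full]).mean()
    (grad,) = torch.autograd.grad(ll_pseudo - ll_full, u)
    return grad

# Toy usage: distill 100 observations into a 5-point pseudocoreset.
x = torch.randn(100) + 2.0  # full data, true mean 2
u = torch.randn(5)          # initial pseudocoreset
s = len(x) / len(u)         # temper the pseudo-likelihood by N/M
for _ in range(200):
    post_full = torch.distributions.Normal(
        x.sum() / (len(x) + 1.0), (len(x) + 1.0) ** -0.5)
    post_pseudo = torch.distributions.Normal(
        s * u.sum() / (s * len(u) + 1.0), (s * len(u) + 1.0) ** -0.5)
    u = u - 0.1 * forward_kl_grad(
        u, post_full.sample((64,)), post_pseudo.sample((64,)))
```

The N/M tempering is what lets a 5-point posterior match the concentration of the 100-point posterior in this toy example; without it, pi_u would remain too diffuse no matter how u is chosen, and the KL could not be driven near zero.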

Author Information

Balhae Kim (Korea Advanced Institute of Science & Technology)
Jungwon Choi (KAIST)
Seanie Lee (Korea Advanced Institute of Science & Technology)
Yoonho Lee (Stanford University)
Jung-Woo Ha (NAVER CLOVA AI Lab)
- Head, AI Innovation, NAVER Cloud
- Research Fellow, NAVER AI Lab
- Datasets and Benchmarks Co-Chair, NeurIPS 2023
- Socials Co-Chair, ICML 2023
- Socials Co-Chair, NeurIPS 2022
- BS, Seoul National University
- PhD, Seoul National University

Juho Lee (KAIST)

