Abstract
Optimization is a key component of training machine learning models and has a strong impact on their generalization. In this paper, we consider a particular optimization method, the stochastic gradient Langevin dynamics (SGLD) algorithm, and investigate the generalization of models trained by SGLD. We derive a new generalization bound by connecting SGLD with Gaussian channels found in information and communication theory. Our bound can be computed from the training data and incorporates the variance of gradients, which quantifies a particular kind of "sharpness" of the loss landscape. We also consider an algorithm closely related to SGLD, namely differentially private SGD (DP-SGD). We prove that the generalization capability of DP-SGD can be amplified by iteration: our bound can be sharpened by a time-decaying factor if the DP-SGD algorithm outputs only the last iterate while keeping all other iterates hidden. This decay factor, established via strong data processing inequalities, a fundamental tool in information theory, causes the contribution of early iterations to the bound to diminish over time. We demonstrate our bound through numerical experiments, showing that it can predict the behavior of the true generalization gap.
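As background for the two algorithms discussed above, here is a minimal NumPy sketch of their standard update rules (the SGLD update of Welling and Teh, and the clip-average-perturb update of DP-SGD in the style of Abadi et al.). The function names, hyperparameters (`eta`, `sigma`, `clip`), and the toy quadratic loss are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def sgld_step(w, grad_fn, batch, eta=0.05, sigma=0.1, rng=None):
    """One SGLD update: a stochastic-gradient step plus isotropic Gaussian noise.

    The injected Gaussian noise is what connects SGLD to the Gaussian-channel
    view mentioned in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    g = grad_fn(w, batch)  # stochastic gradient on a mini-batch
    return w - eta * g + sigma * np.sqrt(2.0 * eta) * rng.normal(size=w.shape)

def dp_sgd_step(w, per_example_grads, eta=0.05, clip=1.0, sigma=1.0, rng=None):
    """One DP-SGD update: clip per-example gradients, average, add Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(per_example_grads)
    clipped = [g / max(1.0, np.linalg.norm(g) / clip) for g in per_example_grads]
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(scale=sigma * clip, size=w.shape)
    return w - eta * noisy_sum / n

# Toy usage: quadratic loss f(w) = ||w||^2 / 2, whose gradient is w itself.
w = np.ones(5)
for _ in range(200):
    w = sgld_step(w, grad_fn=lambda w, _: w, batch=None)
```

Note that SGLD ties the noise scale to the step size (the sqrt(2*eta) factor), while DP-SGD ties it to the clipping norm; in both cases the injected Gaussian randomness is what the paper's information-theoretic analysis leverages.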
Author Information
Hao Wang (Harvard University)
Yizhe Huang (University of Texas at Austin)
Rui Gao (University of Texas at Austin)
Flavio Calmon (Harvard University)