Timezone: »

Daniel Roy - Deep Neural Networks: From Flat Minima to Numerically Nonvacuous Generalization Bounds via PAC-Bayes
Daniel Roy
Event URL: http://danroy.org »

One of the defining properties of deep learning is that models are chosen to have many more parameters than available training data. In light of this capacity for overfitting, it is remarkable that simple algorithms like SGD reliably return solutions with low test error. One roadblock to explaining these phenomena in terms of implicit regularization, structural properties of the solution, and/or easiness of the data is that many learning bounds are quantitatively vacuous when applied to networks learned by SGD in this "deep learning" regime. Logically, in order to explain generalization, we need nonvacuous bounds.

I will discuss recent work using PAC-Bayesian bounds and optimization to arrive at nonvacuous generalization bounds for neural networks with millions of parameters trained on only tens of thousands of examples. We connect our findings to recent and old work on flat minima and MDL-based explanations of generalization, as well as to variational inference for deep learning. Time permitting, I'll discuss new work interpreting Entropy-SGD as a PAC-Bayesian method.

Joint work with Gintare Karolina Dziugaite, based on https://arxiv.org/abs/1703.11008

Author Information

Daniel Roy (Univ of Toronto & Vector)

More from the Same Authors