Minimizing the inclusive Kullback-Leibler (KL) divergence with stochastic gradient descent (SGD) is challenging since its gradient is defined as an integral over the posterior. Recently, multiple methods have been proposed to run SGD with biased gradient estimates obtained from a Markov chain. This paper provides the first non-asymptotic convergence analysis of these methods by establishing their mixing rate and gradient variance. To do this, we demonstrate that these methods—which we collectively refer to as Markov chain score ascent (MCSA) methods—can be cast as special cases of the Markov chain gradient descent framework. Furthermore, by leveraging this new understanding, we develop a novel MCSA scheme, parallel MCSA (pMCSA), that achieves a tighter bound on the gradient variance. We demonstrate that this improved theoretical result translates to superior empirical performance.
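As a rough illustration of the idea in the abstract: the gradient of the inclusive KL satisfies ∇_λ KL(p‖q_λ) = E_p[−∇_λ log q_λ(x)], so a score ascent step only needs (approximate) samples from the posterior p, which a Markov chain can supply at the cost of bias and correlation. The sketch below is not the paper's pMCSA algorithm; it assumes a toy 1-D Gaussian target, a Gaussian variational family, and a single Metropolis-Hastings chain, with all names (`log_p`, `score`, the step size) chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "posterior": N(3, 1), known only up to its unnormalized log-density,
# so we sample it with a Markov chain rather than directly.
def log_p(x):
    return -0.5 * (x - 3.0) ** 2

# Variational family q_lambda = N(mu, exp(log_sigma)^2). Score ascent on the
# inclusive KL(p || q) uses grad_lambda KL = E_p[-grad_lambda log q_lambda(x)],
# i.e. only the score of q evaluated at (approximate) posterior samples.
def score(x, mu, log_sigma):
    sigma = np.exp(log_sigma)
    d_mu = (x - mu) / sigma**2
    d_log_sigma = (x - mu) ** 2 / sigma**2 - 1.0
    return d_mu, d_log_sigma

mu, log_sigma = 0.0, 0.0  # variational parameters
x = 0.0                   # current Markov chain state
step = 1e-2

for t in range(20000):
    # One Metropolis-Hastings step targeting p: the resulting samples are
    # correlated and (transiently) biased, which is exactly the setting MCSA
    # methods analyze.
    prop = x + rng.normal(scale=1.0)
    if np.log(rng.uniform()) < log_p(prop) - log_p(x):
        x = prop
    # SGD ascent on log q at the chain state (descends KL(p || q)).
    d_mu, d_ls = score(x, mu, log_sigma)
    mu += step * d_mu
    log_sigma += step * d_ls

print(mu, np.exp(log_sigma))  # mu and sigma should approach 3 and 1
```

The parallel variant studied in the paper runs multiple such chains to reduce the variance of the averaged gradient estimate; this single-chain loop only shows the basic biased-gradient SGD setup.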
Author Information
Kyurae Kim (University of Pennsylvania)
I am a Ph.D. student advised by Professor Jacob R. Gardner at the University of Pennsylvania, working on Bayesian machine learning, Bayesian inference, and Bayesian optimization. I received my Bachelor of Engineering degree from Sogang University, South Korea. Previously, I worked at Samsung Medical Center, South Korea, as an undergraduate researcher; at Kangbuk Samsung Hospital, South Korea, as a visiting researcher; and at the University of Liverpool as a research associate. I also worked part-time as an embedded software engineer at Hansono, South Korea. I hold memberships in both the ACM and the IEEE.
Jisu Oh (Sogang University)
Jacob Gardner (University of Pennsylvania)
Adji Bousso Dieng (Princeton University & Google AI)
Hongseok Kim (Sogang University)
More from the Same Authors
- 2022 : Efficient Variational Gaussian Processes Initialization via Kernel-based Least Squares Fitting »
  Xinran Zhu · David Bindel · Jacob Gardner
- 2022 Workshop: Learning Meaningful Representations of Life »
  Elizabeth Wood · Adji Bousso Dieng · Aleksandrina Goeva · Alex X Lu · Anshul Kundaje · Chang Liu · Debora Marks · Ed Boyden · Eli N Weinstein · Lorin Crawford · Mor Nitzan · Rebecca Boiarsky · Romain Lopez · Tamara Broderick · Ray Jones · Wouter Boomsma · Yixin Wang · Stephen Ra
- 2022 : Q & A »
  Jacob Gardner · Virginia Aglietti · Janardhan Rao Doppa
- 2022 Tutorial: Advances in Bayesian Optimization »
  Janardhan Rao Doppa · Virginia Aglietti · Jacob Gardner
- 2022 : Tutorial part 1 »
  Jacob Gardner · Virginia Aglietti · Janardhan Rao Doppa
- 2022 Workshop: Machine Learning and the Physical Sciences »
  Atilim Gunes Baydin · Adji Bousso Dieng · Emine Kucukbenli · Gilles Louppe · Siddharth Mishra-Sharma · Benjamin Nachman · Brian Nord · Savannah Thais · Anima Anandkumar · Kyle Cranmer · Lenka Zdeborová · Rianne van den Berg
- 2022 : Panel Discussion »
  Jacob Gardner · Marta Blangiardo · Viacheslav Borovitskiy · Jasper Snoek · Paula Moraga · Carolina Osorio
- 2022 Poster: Local Bayesian optimization via maximizing probability of descent »
  Quan Nguyen · Kaiwen Wu · Jacob Gardner · Roman Garnett
- 2022 Poster: Local Latent Space Bayesian Optimization over Structured Inputs »
  Natalie Maus · Haydn Jones · Juston Moore · Matt Kusner · John Bradshaw · Jacob Gardner
- 2021 Workshop: Learning Meaningful Representations of Life (LMRL) »
  Elizabeth Wood · Adji Bousso Dieng · Aleksandrina Goeva · Anshul Kundaje · Barbara Engelhardt · Chang Liu · David Van Valen · Debora Marks · Edward Boyden · Eli N Weinstein · Lorin Crawford · Mor Nitzan · Romain Lopez · Tamara Broderick · Ray Jones · Wouter Boomsma · Yixin Wang
- 2021 Poster: Consistency Regularization for Variational Auto-Encoders »
  Samarth Sinha · Adji Bousso Dieng
- 2021 Poster: Scaling Gaussian Processes with Derivative Information Using Variational Inference »
  Misha Padidar · Xinran Zhu · Leo Huang · Jacob Gardner · David Bindel
- 2020 Workshop: Machine Learning and the Physical Sciences »
  Anima Anandkumar · Kyle Cranmer · Shirley Ho · Mr. Prabhat · Lenka Zdeborová · Atilim Gunes Baydin · Juan Carrasquilla · Adji Bousso Dieng · Karthik Kashinath · Gilles Louppe · Brian Nord · Michela Paganini · Savannah Thais
- 2020 Workshop: Learning Meaningful Representations of Life (LMRL.org) »
  Elizabeth Wood · Debora Marks · Ray Jones · Adji Bousso Dieng · Alan Aspuru-Guzik · Anshul Kundaje · Barbara Engelhardt · Chang Liu · Edward Boyden · Kresten Lindorff-Larsen · Mor Nitzan · Smita Krishnaswamy · Wouter Boomsma · Yixin Wang · David Van Valen · Orr Ashenberg
- 2020 Poster: Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization »
  Geoff Pleiss · Martin Jankowiak · David Eriksson · Anil Damle · Jacob Gardner
- 2020 Poster: Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees »
  Shali Jiang · Daniel Jiang · Maximilian Balandat · Brian Karrer · Jacob Gardner · Roman Garnett
- 2019 : Surya Ganguli, Yasaman Bahri, Florent Krzakala moderated by Lenka Zdeborova »
  Florent Krzakala · Yasaman Bahri · Surya Ganguli · Lenka Zdeborová · Adji Bousso Dieng · Joan Bruna
- 2017 Poster: Variational Inference via $\chi$ Upper Bound Minimization »
  Adji Bousso Dieng · Dustin Tran · Rajesh Ranganath · John Paisley · David Blei