Poster
Thompson Sampling and Approximate Inference
My Phan · Yasin Abbasi Yadkori · Justin Domke
Thu Dec 12 10:45 AM -- 12:45 PM (PST) @ East Exhibition Hall B + C #45
We study the effects of approximate inference on the performance of Thompson sampling in $k$-armed bandit problems. Thompson sampling is a successful algorithm for online decision-making but requires posterior inference, which often must be approximated in practice. We show that even a small constant inference error (in $\alpha$-divergence) can lead to poor performance (linear regret) due to under-exploration (for $\alpha<1$) or over-exploration (for $\alpha>0$) by the approximation. While for $\alpha > 0$ this is unavoidable, for $\alpha \leq 0$ the regret can be improved by adding a small amount of forced exploration even when the inference error is a large constant.
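As a rough illustration of the setting in the abstract, the sketch below runs Thompson sampling on a Bernoulli $k$-armed bandit with a Beta posterior and an optional forced-exploration step. It is not the authors' algorithm or experimental setup. The function name `thompson_sampling` and the parameters `explore_prob` and `sharpen` are hypothetical; in particular, `sharpen` is only a crude stand-in for an overconfident approximate posterior and is not tied to any $\alpha$-divergence bound.

```python
import random


def thompson_sampling(true_means, horizon, explore_prob=0.0, sharpen=1.0, seed=0):
    """Bernoulli Thompson sampling with optional forced uniform exploration.

    true_means:   success probability of each arm (unknown to the learner).
    explore_prob: probability of pulling a uniformly random arm each round
                  (hypothetical forced-exploration rate; 0.0 disables it).
    sharpen:      hypothetical distortion of the posterior. 1.0 samples from
                  the exact Beta posterior under a uniform prior; values > 1
                  scale both Beta parameters, shrinking the variance.
    Returns (pull_counts, cumulative_regret).
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    best_mean = max(true_means)
    regret = 0.0

    for _ in range(horizon):
        if rng.random() < explore_prob:
            # Forced exploration: ignore the posterior and pick any arm.
            arm = rng.randrange(k)
        else:
            # Draw one sample per arm from the (possibly distorted) posterior
            # and play the arm whose sample is largest.
            samples = [
                rng.betavariate(sharpen * (successes[i] + 1),
                                sharpen * (failures[i] + 1))
                for i in range(k)
            ]
            arm = max(range(k), key=lambda i: samples[i])

        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        # Regret is measured against always playing the best arm.
        regret += best_mean - true_means[arm]

    return pulls, regret


if __name__ == "__main__":
    pulls, regret = thompson_sampling([0.1, 0.9], horizon=2000)
    print("exact posterior:", pulls, round(regret, 1))
    pulls, regret = thompson_sampling([0.1, 0.9], horizon=2000,
                                      explore_prob=0.05, sharpen=5.0)
    print("sharpened + forced exploration:", pulls, round(regret, 1))
```

Comparing runs with different `sharpen` and `explore_prob` values gives a feel for how a distorted posterior and forced exploration interact, but a toy simulation like this does not reproduce the regret results stated in the abstract.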
Author Information
My Phan (University of Massachusetts Amherst)
Yasin Abbasi Yadkori (VinAI Research / VinTech JSC)
Justin Domke (University of Massachusetts, Amherst)
More from the Same Authors
- 2021 Poster: MCMC Variational Inference via Uncorrected Hamiltonian Annealing
  Tomas Geffner · Justin Domke
- 2021 Poster: Amortized Variational Inference for Simple Hierarchical Models
  Abhinav Agrawal · Justin Domke
- 2020 Poster: Advances in Black-Box VI: Normalizing Flows, Importance Weighting, and Optimization
  Abhinav Agrawal · Daniel Sheldon · Justin Domke
- 2020 Poster: Model Selection in Contextual Stochastic Bandit Problems
  Aldo Pacchiano · My Phan · Yasin Abbasi Yadkori · Anup Rao · Julian Zimmert · Tor Lattimore · Csaba Szepesvari
- 2020 Poster: Approximation Based Variance Reduction for Reparameterization Gradients
  Tomas Geffner · Justin Domke
- 2019 Poster: Bootstrapping Upper Confidence Bound
  Botao Hao · Yasin Abbasi Yadkori · Zheng Wen · Guang Cheng
- 2019 Poster: Provable Gradient Variance Guarantees for Black-Box Variational Inference
  Justin Domke
- 2019 Poster: Divide and Couple: Using Monte Carlo Variational Objectives for Posterior Approximation
  Justin Domke · Daniel Sheldon
- 2019 Spotlight: Divide and Couple: Using Monte Carlo Variational Objectives for Posterior Approximation
  Justin Domke · Daniel Sheldon
- 2018 Poster: Using Large Ensembles of Control Variates for Variational Inference
  Tomas Geffner · Justin Domke
- 2018 Poster: Scalar Posterior Sampling with Applications
  Georgios Theocharous · Zheng Wen · Yasin Abbasi Yadkori · Nikos Vlassis
- 2018 Poster: Importance Weighting and Variational Inference
  Justin Domke · Daniel Sheldon
- 2017 Poster: Near Minimax Optimal Players for the Finite-Time 3-Expert Prediction Problem
  Yasin Abbasi Yadkori · Peter Bartlett · Victor Gabillon
- 2017 Poster: Conservative Contextual Linear Bandits
  Abbas Kazerouni · Mohammad Ghavamzadeh · Yasin Abbasi · Benjamin Van Roy
- 2015 Workshop: Machine Learning From and For Adaptive User Technologies: From Active Learning & Experimentation to Optimization & Personalization
  Joseph Jay Williams · Yasin Abbasi Yadkori · Finale Doshi-Velez
- 2015 Poster: Minimax Time Series Prediction
  Wouter Koolen · Alan Malek · Peter Bartlett · Yasin Abbasi Yadkori
- 2014 Workshop: Large-scale reinforcement learning and Markov decision problems
  Benjamin Van Roy · Mohammad Ghavamzadeh · Peter Bartlett · Yasin Abbasi Yadkori · Ambuj Tewari
- 2013 Workshop: Resource-Efficient Machine Learning
  Yevgeny Seldin · Yasin Abbasi Yadkori · Yacov Crammer · Ralf Herbrich · Peter Bartlett
- 2013 Poster: Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions
  Yasin Abbasi Yadkori · Peter Bartlett · Varun Kanade · Yevgeny Seldin · Csaba Szepesvari
- 2011 Poster: Improved Algorithms for Linear Stochastic Bandits
  Yasin Abbasi Yadkori · David Pal · Csaba Szepesvari
- 2011 Spotlight: Improved Algorithms for Linear Stochastic Bandits
  Yasin Abbasi Yadkori · David Pal · Csaba Szepesvari