Latent Bandits Revisited
Joey Hong · Branislav Kveton · Manzil Zaheer · Yinlam Chow · Amr Ahmed · Craig Boutilier

Tue Dec 08 09:00 PM -- 11:00 PM (PST) @ Poster Session 2 #589

A latent bandit is a bandit problem where the learning agent knows the reward distributions of arms conditioned on an unknown discrete latent state. The goal of the agent is to identify the latent state, after which it can act optimally. This setting is a natural midpoint between online and offline learning, where complex models can be learned offline and the agent identifies the latent state online. This is of high practical relevance, for instance in recommender systems. In this work, we propose general algorithms for latent bandits, based on both upper confidence bounds and Thompson sampling. The algorithms are contextual, and aware of model uncertainty and misspecification. We provide a unified theoretical analysis of our algorithms, which have lower regret than classic bandit policies when the number of latent states is smaller than the number of actions. A comprehensive empirical study showcases the advantages of our approach.
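To make the setting concrete, the following is a minimal sketch of Thompson sampling over latent states. It is not the algorithm from the paper: it assumes Bernoulli rewards, no context, and a perfectly known model, and it ignores model uncertainty and misspecification. The function names and the parameters `mu`, `true_state`, `horizon`, and `seed` are hypothetical and chosen only for illustration.

```python
import math
import random


def normalize(log_weights):
    """Turn unnormalized log-weights into probabilities (numerically stable)."""
    m = max(log_weights)
    weights = [math.exp(x - m) for x in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def latent_thompson_sampling(mu, true_state, horizon, seed=0):
    """Illustrative latent-state Thompson sampling (assumed Bernoulli rewards).

    mu[s][a] is the known mean reward of arm a under latent state s.
    true_state is hidden from the agent and used only to simulate rewards.
    Returns the final posterior over latent states and the cumulative regret.
    """
    rng = random.Random(seed)
    n_states = len(mu)
    log_post = [0.0] * n_states  # uniform prior, kept in log space
    best_mean = max(mu[true_state])
    regret = 0.0

    for _ in range(horizon):
        # Sample a latent state from the current posterior.
        post = normalize(log_post)
        s = rng.choices(range(n_states), weights=post)[0]

        # Act optimally as if the sampled state were the true one.
        a = max(range(len(mu[s])), key=lambda j: mu[s][j])

        # Simulate a Bernoulli reward from the true (hidden) state.
        p = mu[true_state][a]
        reward = 1 if rng.random() < p else 0

        # Bayes update: multiply each state's weight by its reward likelihood.
        for k in range(n_states):
            lik = mu[k][a] if reward else 1.0 - mu[k][a]
            log_post[k] += math.log(max(lik, 1e-12))

        regret += best_mean - p

    return normalize(log_post), regret
```

In this sketch the agent maintains a posterior over only the latent states rather than over every arm's mean, which is the intuition behind the lower regret when latent states are fewer than actions. For example, calling `latent_thompson_sampling([[0.9, 0.1, 0.1], [0.1, 0.9, 0.1]], true_state=1, horizon=200)` returns a posterior over the two states together with the accumulated regret.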

Author Information

Joey Hong (Google AI)
Branislav Kveton (Google Research)
Manzil Zaheer (Google)
Yinlam Chow (Google Research)
Amr Ahmed (Google Research)

Amr Ahmed is a Senior Staff Research Scientist at Google. He received his M.Sc. and PhD degrees from the School of Computer Science, Carnegie Mellon University, in 2009 and 2011, respectively. He received the best paper award at KDD 2014, the best paper award at WSDM 2014, the 2012 ACM SIGKDD Doctoral Dissertation Award, and a best paper award (runner-up) at WSDM 2012. He co-chaired the WWW'18 track on Web Content Analysis and served as an Area Chair for IJCAI 2019, SIGIR 2019, SIGIR 2018, ICML 2018, ICML 2017, KDD 2016, WSDM 2015, ICML 2014, and ICDM 2014. His research interests include large-scale machine learning, data/web mining, user modeling, personalization, social networks, and content analysis.

Craig Boutilier (Google)

More from the Same Authors