Contextual policies are used in many settings to customize system parameters and actions to the specifics of a particular setting. In some real-world settings, such as randomized controlled trials or A/B tests, it may not be possible to measure policy outcomes at the level of context—we observe only aggregate rewards across a distribution of contexts. This makes policy optimization much more difficult because we must solve a high-dimensional optimization problem over the entire space of contextual policies, for which existing optimization methods are not suitable. We develop effective models that leverage the structure of the search space to enable contextual policy optimization directly from the aggregate rewards using Bayesian optimization. We use a collection of simulation studies to characterize the performance and robustness of the models, and show that our approach of inferring a low-dimensional context embedding performs best. Finally, we show successful contextual policy optimization in a real-world video bitrate policy problem.
Author Information
Qing Feng (Facebook)
Ben Letham (Facebook)
Hongzi Mao (MIT)
Eytan Bakshy (Facebook)
Related Events (a corresponding poster, oral, or spotlight)
- 2020 Poster: High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization
  Wed Dec 9th 05:00 -- 07:00 AM Room Poster Session 2
More from the Same Authors
- 2020 Poster: Differentiable Expected Hypervolume Improvement for Parallel Multi-Objective Bayesian Optimization
  Samuel Daulton · Maximilian Balandat · Eytan Bakshy
- 2020 Poster: BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization
  Maximilian Balandat · Brian Karrer · Daniel Jiang · Samuel Daulton · Ben Letham · Andrew Wilson · Eytan Bakshy
- 2020 Poster: Re-Examining Linear Embeddings for High-Dimensional Bayesian Optimization
  Ben Letham · Roberto Calandra · Akshara Rai · Eytan Bakshy
- 2019 Poster: Park: An Open Platform for Learning-Augmented Computer Systems
  Hongzi Mao · Parimarjan Negi · Akshay Narayan · Hanrui Wang · Jiacheng Yang · Haonan Wang · Ryan Marcus · Ravichandra Addanki · Mehrdad Khani Shirkoohi · Songtao He · Vikram Nathan · Frank Cangialosi · Shaileshh Venkatakrishnan · Wei-Hung Weng · Song Han · Tim Kraska · Dr. Mohammad Alizadeh
- 2019 Poster: Learning Generalizable Device Placement Algorithms for Distributed Machine Learning
  Ravichandra Addanki · Shaileshh Bojja Venkatakrishnan · Shreyan Gupta · Hongzi Mao · Mohammad Alizadeh