Timezone: »
Poster
Non-Stationary Bandits under Recharging Payoffs: Improved Planning with Sublinear Regret
Orestis Papadigenopoulos · Constantine Caramanis · Sanjay Shakkottai
The stochastic multi-armed bandit setting has been recently studied in the non-stationary regime, where the mean payoff of each action is a non-decreasing function of the number of rounds passed since it was last played. This model captures natural behavioral aspects of the users which crucially determine the performance of recommendation platforms, ad placement systems, and more. Even assuming prior knowledge of the mean payoff functions, computing an optimal planning in the above model is NP-hard, while the state-of-the-art is a $1/4$-approximation algorithm for the case where at most one arm can be played per round. We first focus on the setting where the mean payoff functions are known. In this setting, we significantly improve the best-known guarantees for the planning problem by developing a polynomial-time $(1-{1}/{e})$-approximation algorithm (asymptotically and in expectation), based on a novel combination of randomized LP rounding and a time-correlated (interleaved) scheduling method. Furthermore, our algorithm achieves improved guarantees -- compared to prior work -- for the case where more than one arms can be played at each round. Moving to the bandit setting, when the mean payoff functions are initially unknown, we show how our algorithm can be transformed into a bandit algorithm with sublinear regret.
Author Information
Orestis Papadigenopoulos (Columbia University)
Constantine Caramanis (UT Austin)
Sanjay Shakkottai (University of Texas at Austin)
More from the Same Authors
-
2021 Spotlight: RL for Latent MDPs: Regret Guarantees and a Lower Bound »
Jeongyeol Kwon · Yonathan Efroni · Constantine Caramanis · Shie Mannor -
2021 : Reinforcement Learning in Reward-Mixing MDPs »
Jeongyeol Kwon · Yonathan Efroni · Constantine Caramanis · Shie Mannor -
2022 : Learning Certifiably Robust Controllers Using Fragile Perception »
Dawei Sun · Negin Musavi · Geir Dullerud · Sanjay Shakkottai · Sayan Mitra -
2022 : Learning Certifiably Robust Controllers Using Fragile Perception »
Dawei Sun · Negin Musavi · Geir Dullerud · Sanjay Shakkottai · Sayan Mitra -
2022 Poster: Tractable Optimality in Episodic Latent MABs »
Jeongyeol Kwon · Yonathan Efroni · Constantine Caramanis · Shie Mannor -
2022 Poster: Minimax Regret for Cascading Bandits »
Daniel Vial · Sujay Sanghavi · Sanjay Shakkottai · R. Srikant -
2022 Poster: FedAvg with Fine Tuning: Local Updates Lead to Representation Learning »
Liam Collins · Hamed Hassani · Aryan Mokhtari · Sanjay Shakkottai -
2021 Poster: Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators »
Zaiwei Chen · Siva Theja Maguluri · Sanjay Shakkottai · Karthikeyan Shanmugam -
2021 Poster: RL for Latent MDPs: Regret Guarantees and a Lower Bound »
Jeongyeol Kwon · Yonathan Efroni · Constantine Caramanis · Shie Mannor -
2021 Poster: Recurrent Submodular Welfare and Matroid Blocking Semi-Bandits »
Orestis Papadigenopoulos · Constantine Caramanis -
2021 Poster: Reinforcement Learning in Reward-Mixing MDPs »
Jeongyeol Kwon · Yonathan Efroni · Constantine Caramanis · Shie Mannor -
2020 Poster: Task-Robust Model-Agnostic Meta-Learning »
Liam Collins · Aryan Mokhtari · Sanjay Shakkottai -
2020 Poster: Second Order Optimality in Decentralized Non-Convex Optimization via Perturbed Gradient Tracking »
Isidoros Tziotis · Constantine Caramanis · Aryan Mokhtari -
2020 Poster: Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions »
Matthew Faw · Rajat Sen · Karthikeyan Shanmugam · Constantine Caramanis · Sanjay Shakkottai -
2020 Poster: Applications of Common Entropy for Causal Inference »
Murat Kocaoglu · Sanjay Shakkottai · Alex Dimakis · Constantine Caramanis · Sriram Vishwanath -
2020 Poster: Finite-Sample Analysis of Contractive Stochastic Approximation Using Smooth Convex Envelopes »
Zaiwei Chen · Siva Theja Maguluri · Sanjay Shakkottai · Karthikeyan Shanmugam -
2020 Poster: Robust compressed sensing using generative models »
Ajil Jalal · Liu Liu · Alex Dimakis · Constantine Caramanis -
2019 Poster: Primal-Dual Block Generalized Frank-Wolfe »
Qi Lei · JIACHENG ZHUO · Constantine Caramanis · Inderjit Dhillon · Alex Dimakis -
2019 Poster: Blocking Bandits »
Soumya Basu · Rajat Sen · Sujay Sanghavi · Sanjay Shakkottai -
2017 Poster: Model-Powered Conditional Independence Test »
Rajat Sen · Ananda Theertha Suresh · Karthikeyan Shanmugam · Alex Dimakis · Sanjay Shakkottai -
2016 Poster: Fast Algorithms for Robust PCA via Gradient Descent »
Xinyang Yi · Dohyung Park · Yudong Chen · Constantine Caramanis -
2016 Poster: More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning »
Xinyang Yi · Zhaoran Wang · Zhuoran Yang · Constantine Caramanis · Han Liu -
2016 Poster: Regret of Queueing Bandits »
Subhashini Krishnasamy · Rajat Sen · Ramesh Johari · Sanjay Shakkottai -
2015 Poster: Optimal Linear Estimation under Unknown Nonlinear Transform »
Xinyang Yi · Zhaoran Wang · Constantine Caramanis · Han Liu -
2015 Poster: Regularized EM Algorithms: A Unified Framework and Statistical Guarantees »
Xinyang Yi · Constantine Caramanis -
2014 Poster: Greedy Subspace Clustering »
Dohyung Park · Constantine Caramanis · Sujay Sanghavi -
2013 Poster: Memory Limited, Streaming PCA »
Ioannis Mitliagkas · Constantine Caramanis · Prateek Jain