The Multi-Armed Bandits (MAB) framework highlights the trade-off between acquiring new knowledge (Exploration) and leveraging available knowledge (Exploitation). In the classical MAB problem, a decision maker must choose an arm at each time step, upon which she receives a reward. The decision maker's objective is to maximize her cumulative expected reward over the time horizon. The MAB problem has been studied extensively, in particular under the assumption that the arms' reward distributions are stationary, or quasi-stationary, over time. We consider a variant of the MAB framework, which we term Rotting Bandits, where each arm's expected reward decays as a function of the number of times it has been pulled. This variant is motivated by many real-world scenarios such as online advertising, content recommendation, and crowdsourcing. We present algorithms, accompanied by simulations, and derive theoretical guarantees.
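To make the setting concrete, the following is a minimal simulation sketch of a bandit whose arms "rot" as they are pulled. It is an illustration only, not the authors' algorithm: the polynomial decay model in `expected_reward`, the Gaussian noise, the round-robin warm-up, and the greedy rule on a recent-sample average are all assumptions chosen for this example, as are the names `simulate`, `bases`, `decays`, and `window`.

```python
import random


def expected_reward(base, decay, pulls):
    """Hypothetical decay model: the mean shrinks polynomially with the pull count."""
    return base * (1 + pulls) ** (-decay)


def simulate(bases, decays, horizon, window=5, noise=0.1, seed=0):
    """Play a rotting bandit with a greedy rule on recent-sample averages.

    Assumed heuristic (not taken from the paper): pull every arm `window`
    times in round-robin order, then always pull the arm whose last
    `window` observed rewards have the highest average.
    """
    rng = random.Random(seed)
    k = len(bases)
    pulls = [0] * k                    # number of times each arm was pulled
    history = [[] for _ in range(k)]   # observed rewards per arm
    total = 0.0

    for t in range(horizon):
        if t < k * window:
            arm = t % k  # warm-up: round-robin over the arms
        else:
            # Recent averages track the current (decayed) mean of each arm.
            arm = max(range(k), key=lambda i: sum(history[i][-window:]) / window)

        # The arm's mean depends on how often it has already been pulled.
        mean = expected_reward(bases[arm], decays[arm], pulls[arm])
        reward = mean + rng.gauss(0.0, noise)

        pulls[arm] += 1
        history[arm].append(reward)
        total += reward

    return total, pulls


if __name__ == "__main__":
    total, pulls = simulate(bases=[1.0, 0.8], decays=[0.5, 0.1], horizon=200)
    print("cumulative reward:", round(total, 2))
    print("pulls per arm:", pulls)
```

Averaging only recent samples is used here because, under decay, old observations overestimate an arm's current mean; a full-history average would keep favoring an arm that has already rotted.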
Author Information
Nir Levine (DeepMind)
Yacov Crammer (Technion)
Shie Mannor (Technion)
More from the Same Authors
-
2019 Workshop: Safety and Robustness in Decision-making »
Mohammad Ghavamzadeh · Shie Mannor · Yisong Yue · Marek Petrik · Yinlam Chow -
2019 Poster: Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies »
Yonathan Efroni · Nadav Merlis · Mohammad Ghavamzadeh · Shie Mannor -
2019 Spotlight: Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies »
Yonathan Efroni · Nadav Merlis · Mohammad Ghavamzadeh · Shie Mannor -
2018 Poster: Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning »
Yonathan Efroni · Gal Dalal · Bruno Scherrer · Shie Mannor -
2018 Spotlight: Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning »
Yonathan Efroni · Gal Dalal · Bruno Scherrer · Shie Mannor -
2018 Poster: Efficient Loss-Based Decoding on Graphs for Extreme Classification »
Itay Evron · Edward Moroshko · Yacov Crammer -
2017 Poster: Shallow Updates for Deep Reinforcement Learning »
Nir Levine · Tom Zahavy · Daniel J Mankowitz · Aviv Tamar · Shie Mannor -
2016 Poster: Adaptive Skills Adaptive Partitions (ASAP) »
Daniel J Mankowitz · Timothy A Mann · Shie Mannor -
2015 Workshop: Machine Learning for (e-)Commerce »
Esteban Arcaute · Mohammad Ghavamzadeh · Shie Mannor · Georgios Theocharous -
2015 Poster: Online Learning for Adversaries with Memory: Price of Past Mistakes »
Oren Anava · Elad Hazan · Shie Mannor -
2015 Poster: Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach »
Yinlam Chow · Aviv Tamar · Shie Mannor · Marco Pavone -
2015 Poster: Linear Multi-Resource Allocation with Semi-Bandit Feedback »
Tor Lattimore · Yacov Crammer · Csaba Szepesvari -
2015 Poster: Policy Gradient for Coherent Risk Measures »
Aviv Tamar · Yinlam Chow · Mohammad Ghavamzadeh · Shie Mannor -
2015 Poster: Community Detection via Measure Space Embedding »
Mark Kozdoba · Shie Mannor -
2014 Workshop: From Bad Models to Good Policies (Sequential Decision Making under Uncertainty) »
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor · Jeremie Mary · Laurent Orseau · Thomas Dietterich · Ronald Ortner · Peter Grünwald · Joelle Pineau · Raphael Fonteneau · Georgios Theocharous · Esteban D Arcaute · Christos Dimitrakakis · Nan Jiang · Doina Precup · Pierre-Luc Bacon · Marek Petrik · Aviv Tamar -
2014 Poster: Learning Multiple Tasks in Parallel with a Shared Annotator »
Haim Cohen · Yacov Crammer -
2014 Poster: "How hard is my MDP?" The distribution-norm to the rescue »
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor -
2014 Poster: Robust Logistic Regression and Classification »
Jiashi Feng · Huan Xu · Shie Mannor · Shuicheng Yan -
2014 Oral: "How hard is my MDP?" The distribution-norm to the rescue »
Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor -
2013 Workshop: Resource-Efficient Machine Learning »
Yevgeny Seldin · Yasin Abbasi Yadkori · Yacov Crammer · Ralf Herbrich · Peter Bartlett -
2013 Poster: Reinforcement Learning in Robust Markov Decision Processes »
Shiau Hong Lim · Huan Xu · Shie Mannor -
2013 Poster: Online PCA for Contaminated Data »
Jiashi Feng · Huan Xu · Shie Mannor · Shuicheng Yan -
2013 Poster: Learning Multiple Models via Regularized Weighting »
Daniel Vainsencher · Shie Mannor · Huan Xu -
2012 Workshop: Multi-Trade-offs in Machine Learning »
Yevgeny Seldin · Guy Lever · John Shawe-Taylor · Nicolò Cesa-Bianchi · Yacov Crammer · Francois Laviolette · Gabor Lugosi · Peter Bartlett -
2012 Poster: The Perturbed Variation »
Maayan Harel · Shie Mannor -
2012 Poster: Volume Regularization for Binary Classification »
Yacov Crammer · Tal Wagner -
2012 Spotlight: Volume Regularization for Binary Classification »
Yacov Crammer · Tal Wagner -
2012 Poster: Learning Multiple Tasks using Shared Hypotheses »
Yacov Crammer · Yishay Mansour -
2011 Workshop: New Frontiers in Model Order Selection »
Yevgeny Seldin · Yacov Crammer · Nicolò Cesa-Bianchi · Francois Laviolette · John Shawe-Taylor -
2011 Poster: From Bandits to Experts: On the Value of Side-Observations »
Shie Mannor · Ohad Shamir -
2011 Spotlight: From Bandits to Experts: On the Value of Side-Observations »
Shie Mannor · Ohad Shamir -
2011 Poster: Committing Bandits »
Loc X Bui · Ramesh Johari · Shie Mannor -
2010 Spotlight: Online Classification with Specificity Constraints »
Andrey Bernstein · Shie Mannor · Nahum Shimkin -
2010 Poster: Online Classification with Specificity Constraints »
Andrey Bernstein · Shie Mannor · Nahum Shimkin -
2010 Poster: Distributionally Robust Markov Decision Processes »
Huan Xu · Shie Mannor -
2010 Poster: Learning via Gaussian Herding »
Yacov Crammer · Daniel Lee -
2010 Poster: New Adaptive Algorithms for Online Classification »
Francesco Orabona · Yacov Crammer -
2009 Workshop: Advances in Ranking »
Shivani Agarwal · Chris J Burges · Yacov Crammer -
2009 Poster: Adaptive Regularization of Weight Vectors »
Yacov Crammer · Alex Kulesza · Mark Dredze -
2009 Spotlight: Adaptive Regularization of Weight Vectors »
Yacov Crammer · Alex Kulesza · Mark Dredze -
2008 Session: Oral session 6: Neural Coding »
Yacov Crammer -
2008 Poster: Exact Convex Confidence-Weighted Learning »
Yacov Crammer · Mark Dredze · Fernando Pereira -
2008 Spotlight: Exact Convex Confidence-Weighted Learning »
Yacov Crammer · Mark Dredze · Fernando Pereira -
2007 Poster: Learning Bounds for Domain Adaptation »
John Blitzer · Yacov Crammer · Alex Kulesza · Fernando Pereira · Jennifer Wortman Vaughan -
2006 Poster: Learning from Multiple Sources »
Yacov Crammer · Michael Kearns · Jennifer Wortman Vaughan -
2006 Poster: Analysis of Representations for Domain Adaptation »
John Blitzer · Shai Ben-David · Yacov Crammer · Fernando Pereira