Timezone: »
The combinatorial stochastic semi-bandit problem is an extension of the classical multi-armed bandit problem in which an algorithm pulls more than one arm at each stage and the rewards of all pulled arms are revealed. One difference with the single arm variant is that the dependency structure of the arms is crucial. Previous works on this setting either used a worst-case approach or imposed independence of the arms. We introduce a way to quantify the dependency structure of the problem and design an algorithm that adapts to it. The algorithm is based on linear regression and the analysis uses techniques from the linear bandit literature. By comparing its performance to a new lower bound, we prove that it is optimal, up to a poly-logarithmic factor in the number of arms pulled.
Author Information
Rémy Degenne (Université Paris Diderot)
Vianney Perchet (Ensae - Criteo Labs)
More from the Same Authors
-
2021 Spotlight: Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits »
Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet -
2021 Spotlight: Decentralized Learning in Online Queuing Systems »
Flore Sentenac · Etienne Boursier · Vianney Perchet -
2022 Poster: Top Two Algorithms Revisited »
Marc Jourdan · Rémy Degenne · Dorian Baudry · Rianne de Heide · Emilie Kaufmann -
2022 Poster: On Elimination Strategies for Bandit Fixed-Confidence Identification »
Andrea Tirinzoni · Rémy Degenne -
2022 Poster: Active Labeling: Streaming Stochastic Gradients »
Vivien Cabannes · Francis Bach · Vianney Perchet · Alessandro Rudi -
2021 Poster: Local Differential Privacy for Regret Minimization in Reinforcement Learning »
Evrard Garcelon · Vianney Perchet · Ciara Pike-Burke · Matteo Pirotta -
2021 Poster: ROI Maximization in Stochastic Online Decision-Making »
Nicolò Cesa-Bianchi · Tom Cesari · Yishay Mansour · Vianney Perchet -
2021 Poster: Making the most of your day: online learning for optimal allocation of time »
Etienne Boursier · Tristan Garrec · Vianney Perchet · Marco Scarsini -
2021 Poster: Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge »
Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet -
2021 Poster: Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits »
Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet -
2021 Poster: Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm »
Nathan Noiry · Vianney Perchet · Flore Sentenac -
2021 Poster: Decentralized Learning in Online Queuing Systems »
Flore Sentenac · Etienne Boursier · Vianney Perchet -
2020 Poster: Robustness of Community Detection to Random Geometric Perturbations »
Sandrine Peche · Vianney Perchet -
2020 Poster: Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits »
Pierre Perrault · Etienne Boursier · Michal Valko · Vianney Perchet -
2019 Poster: Categorized Bandits »
Matthieu Jedor · Vianney Perchet · Jonathan Louedec -
2019 Poster: Pure Exploration with Multiple Correct Answers »
Rémy Degenne · Wouter Koolen -
2019 Poster: SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits »
Etienne Boursier · Vianney Perchet -
2019 Spotlight: SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits »
Etienne Boursier · Vianney Perchet -
2019 Poster: Non-Asymptotic Pure Exploration by Solving Games »
Rémy Degenne · Wouter Koolen · Pierre Ménard -
2017 Poster: Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe »
Quentin Berthet · Vianney Perchet -
2017 Spotlight: Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe »
Quentin Berthet · Vianney Perchet