We consider the problem of robust and adaptive model predictive control (MPC) of a linear system, with unknown parameters that are learned along the way (adaptive), in a critical setting where failures must be prevented (robust). This problem has been studied from different perspectives by different communities. However, the existing theory deals only with the case of quadratic costs (the LQ problem), which limits applications to stabilisation and tracking tasks. In order to handle more general (non-convex) costs that naturally arise in many practical problems, we carefully select and bring together several tools from different communities, namely non-asymptotic linear regression, recent results in interval prediction, and tree-based planning. Combining and adapting the theoretical guarantees at each layer is nontrivial, and we provide the first end-to-end suboptimality analysis for this setting. Interestingly, our analysis naturally adapts to handle many models and combines with a data-driven robust model selection strategy, which makes it possible to relax the modelling assumptions. Finally, we strive to preserve tractability at every stage of the method, which we illustrate on two challenging simulated environments.
Author Information
Edouard Leurent (INRIA)
PhD student in Reinforcement Learning at the INRIA SequeL project (sequential learning), the INRIA Non-A project (finite-time control), and Renault Group.
Odalric-Ambrym Maillard (INRIA)
Denis Efimov (Inria)
Related Events (a corresponding poster, oral, or spotlight)
- 2020 Oral: Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs
  Thu. Dec 10th 02:00 -- 02:15 PM Room Orals & Spotlights: Reinforcement Learning
More from the Same Authors
- 2021 Spotlight: Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits
  Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet
- 2022 Poster: IMED-RL: Regret optimal learning of ergodic Markov decision processes
  Fabien Pesquerel · Odalric-Ambrym Maillard
- 2022 Poster: Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits
  Lilian Besson · Emilie Kaufmann · Odalric-Ambrym Maillard · Julien Seznec
- 2021 Poster: Stochastic bandits with groups of similar arms.
  Fabien Pesquerel · Hassan SABER · Odalric-Ambrym Maillard
- 2021 Poster: Indexed Minimum Empirical Divergence for Unimodal Bandits
  Hassan SABER · Pierre Ménard · Odalric-Ambrym Maillard
- 2021 Poster: Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge
  Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet
- 2021 Poster: From Optimality to Robustness: Adaptive Re-Sampling Strategies in Stochastic Bandits
  Dorian Baudry · Patrick Saux · Odalric-Ambrym Maillard
- 2021 Poster: Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits
  Reda Ouhamma · Odalric-Ambrym Maillard · Vianney Perchet
- 2020 Poster: Sub-sampling for Efficient Non-Parametric Bandit Exploration
  Dorian Baudry · Emilie Kaufmann · Odalric-Ambrym Maillard
- 2020 Spotlight: Sub-sampling for Efficient Non-Parametric Bandit Exploration
  Dorian Baudry · Emilie Kaufmann · Odalric-Ambrym Maillard
- 2020 Poster: Planning in Markov Decision Processes with Gap-Dependent Sample Complexity
  Anders Jonsson · Emilie Kaufmann · Pierre Menard · Omar Darwiche Domingues · Edouard Leurent · Michal Valko
- 2019 : Coffee + Posters
  Changhao Chen · Nils Gählert · Edouard Leurent · Johannes Lehner · Apratim Bhattacharyya · Harkirat Singh Behl · Teck Yian Lim · Shiho Kim · Jelena Novosel · Błażej Osiński · Arindam Das · Ruobing Shen · Jeffrey Hawke · Joachim Sicking · Babak Shahian Jahromi · Theja Tulabandhula · Claudio Michaelis · Evgenia Rusak · WENHANG BAO · Hazem Rashed · JP Chen · Amin Ansari · Jaekwang Cha · Mohamed Zahran · Daniele Reda · Jinhyuk Kim · Kim Dohyun · Ho Suk · Junekyo Jhung · Alexander Kister · Matthias Fahrland · Adam Jakubowski · Piotr Miłoś · Jean Mercat · Bruno Arsenali · Silviu Homoceanu · Xiao-Yang Liu · Philip Torr · Ahmad El Sallab · Ibrahim Sobh · Anurag Arnab · Krzysztof Galias
- 2019 Poster: Budgeted Reinforcement Learning in Continuous State Space
  Nicolas Carrara · Edouard Leurent · Romain Laroche · Tanguy Urvoy · Odalric-Ambrym Maillard · Olivier Pietquin
- 2019 Poster: Learning Multiple Markov Chains via Adaptive Allocation
  Mohammad Sadegh Talebi · Odalric-Ambrym Maillard
- 2019 Poster: Regret Bounds for Learning State Representations in Reinforcement Learning
  Ronald Ortner · Matteo Pirotta · Alessandro Lazaric · Ronan Fruit · Odalric-Ambrym Maillard
- 2018 : Poster Session
  Zihan Ding · David Mguni · Yuzheng Zhuang · Edouard Leurent · Takuma Oda · Yulia Tachibana · Paweł Gora · Neema Davis · Nemanja Djuric · Fang-Chieh Chou · elmira amirloo
- 2014 Workshop: From Bad Models to Good Policies (Sequential Decision Making under Uncertainty)
  Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor · Jeremie Mary · Laurent Orseau · Thomas Dietterich · Ronald Ortner · Peter Grünwald · Joelle Pineau · Raphael Fonteneau · Georgios Theocharous · Esteban D Arcaute · Christos Dimitrakakis · Nan Jiang · Doina Precup · Pierre-Luc Bacon · Marek Petrik · Aviv Tamar
- 2014 Poster: "How hard is my MDP?" The distribution-norm to the rescue
  Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor
- 2014 Oral: "How hard is my MDP?" The distribution-norm to the rescue
  Odalric-Ambrym Maillard · Timothy A Mann · Shie Mannor
- 2012 Poster: Online allocation and homogeneous partitioning for piecewise constant mean-approximation
  Alexandra Carpentier · Odalric-Ambrym Maillard
- 2012 Poster: Hierarchical Optimistic Region Selection driven by Curiosity
  Odalric-Ambrym Maillard
- 2011 Poster: Selecting the State-Representation in Reinforcement Learning
  Odalric-Ambrym Maillard · Remi Munos · Daniil Ryabko
- 2011 Poster: Sparse Recovery with Brownian Sensing
  Alexandra Carpentier · Odalric-Ambrym Maillard · Remi Munos
- 2010 Spotlight: LSTD with Random Projections
  Mohammad Ghavamzadeh · Alessandro Lazaric · Odalric-Ambrym Maillard · Remi Munos
- 2010 Poster: LSTD with Random Projections
  Mohammad Ghavamzadeh · Alessandro Lazaric · Odalric-Ambrym Maillard · Remi Munos
- 2010 Poster: Scrambled Objects for Least-Squares Regression
  Odalric-Ambrym Maillard · Remi Munos
- 2009 Poster: Compressed Least-Squares Regression
  Odalric-Ambrym Maillard · Remi Munos