Timezone: »
Poster
Making Non-Stochastic Control (Almost) as Easy as Stochastic
Max Simchowitz
Recent literature has made much progress in understanding \emph{online LQR}: a modern learning-theoretic take on the classical control problem where a learner attempts to optimally control an unknown linear dynamical system with fully observed state, perturbed by i.i.d. Gaussian noise. \iftoggle{nips}{The}{It is now understood that the} optimal regret over time horizon $T$ against the optimal control law scales as $\widetilde{\Theta}(\sqrt{T})$. In this paper, we show that the same regret rate (against a suitable benchmark) is attainable even in the considerably more general non-stochastic control model, where the system is driven by \emph{arbitrary adversarial} noise \citep{agarwal2019online}.
We attain the optimal $\widetilde{\mathcal{O}}(\sqrt{T})$ regret when the dynamics are unknown to the learner, and $\mathrm{poly}(\log T)$ regret when known, provided that the cost functions are strongly convex (as in LQR). Our algorithm is based on a novel variant of online Newton step \citep{hazan2007logarithmic}, which adapts to the geometry induced by adversarial disturbances, and our analysis hinges on generic regret bounds for certain structured losses in the OCO-with-memory framework \citep{anava2015online}.
Author Information
Max Simchowitz (Berkeley)
More from the Same Authors
-
2021 Spotlight: Bayesian decision-making under misspecified priors with applications to meta-learning »
Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire -
2021 : Exploration and Incentives in Reinforcement Learning »
Max Simchowitz · Aleksandrs Slivkins -
2021 : Exploration and Incentives in Reinforcement Learning »
Max Simchowitz · Aleksandrs Slivkins -
2022 : Learning to Extrapolate: A Transductive Approach »
Aviv Netanyahu · Abhishek Gupta · Max Simchowitz · Kaiqing Zhang · Pulkit Agrawal -
2022 Poster: Efficient and Near-Optimal Smoothed Online Learning for Generalized Linear Functions »
Adam Block · Max Simchowitz -
2022 Poster: Globally Convergent Policy Search for Output Estimation »
Jack Umenberger · Max Simchowitz · Juan Perdomo · Kaiqing Zhang · Russ Tedrake -
2021 : Spotlight 1: Exploration and Incentives in Reinforcement Learning »
Max Simchowitz · Aleksandrs Slivkins -
2021 Poster: Online Control of Unknown Time-Varying Dynamical Systems »
Edgar Minasyan · Paula Gradu · Max Simchowitz · Elad Hazan -
2021 Poster: Stabilizing Dynamical Systems via Policy Gradient Methods »
Juan Perdomo · Jack Umenberger · Max Simchowitz -
2021 Poster: Bayesian decision-making under misspecified priors with applications to meta-learning »
Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire -
2020 Poster: Learning the Linear Quadratic Regulator from Nonlinear Observations »
Zakaria Mhammedi · Dylan Foster · Max Simchowitz · Dipendra Misra · Wen Sun · Akshay Krishnamurthy · Alexander Rakhlin · John Langford -
2020 Poster: Constrained episodic reinforcement learning in concave-convex and knapsack settings »
Kianté Brantley · Miro Dudik · Thodoris Lykouris · Sobhan Miryoosefi · Max Simchowitz · Aleksandrs Slivkins · Wen Sun -
2019 Poster: Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs »
Max Simchowitz · Kevin Jamieson