Online Adaptive Policy Selection in Time-Varying Systems: No-Regret via Contractive Perturbations
Yiheng Lin · James A. Preiss · Emile Anand · Yingying Li · Yisong Yue · Adam Wierman

Thu Dec 14 08:45 AM -- 10:45 AM (PST) @ Great Hall & Hall B1+B2 #1113

We study online adaptive policy selection in systems with time-varying costs and dynamics. We develop the Gradient-based Adaptive Policy Selection (GAPS) algorithm together with a general analytical framework for online policy selection via online optimization. Under our proposed notion of contractive policy classes, we show that GAPS approximates the behavior of an ideal online gradient descent algorithm on the policy parameters while requiring less information and computation. When convexity holds, our algorithm is the first to achieve optimal policy regret. When convexity does not hold, we provide the first local regret bound for online policy selection. Our numerical experiments show that GAPS can adapt to changing environments more quickly than existing benchmarks.
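The idea of approximating online gradient descent on the policy parameters with limited information can be illustrated on a toy problem. The sketch below is not the authors' implementation of GAPS. It is a minimal illustration under assumptions that are not in the abstract: a hypothetical scalar system x_{t+1} = a_t x_t + b u_t + w, a linear policy u_t = -theta_t x_t, a quadratic stage cost, and a gradient estimate that sums sensitivities to only the last `buffer_len` parameters along the actual trajectory. The function name, all parameter values, and the projection interval are made up for illustration. The interval is chosen so that the closed loop is contractive for every admissible parameter.

```python
def gaps_style_sketch(a_seq, b=1.0, r=0.1, w=1.0, eta=0.01,
                      buffer_len=10, theta0=0.5, x0=1.0,
                      theta_bounds=(0.4, 1.6)):
    """Toy truncated-gradient policy update (illustrative assumptions only).

    System: x_{t+1} = a_t * x_t + b * u_t + w, with policy u_t = -theta_t * x_t.
    Stage cost: c_t = x_t**2 + r * u_t**2.
    """
    lo, hi = theta_bounds
    theta, x = theta0, x0
    # sens[i] approximates d x_t / d theta_{t-1-i} along the actual trajectory.
    sens = []
    thetas, costs = [], []
    for a in a_seq:
        u = -theta * x
        thetas.append(theta)
        costs.append(x * x + r * u * u)

        # Partial derivatives of the stage cost c(x, theta) = x^2 (1 + r theta^2).
        dc_dx = 2.0 * x * (1.0 + r * theta * theta)
        dc_dtheta = 2.0 * r * theta * x * x
        # Truncated gradient estimate: direct effect of theta_t plus the
        # effect of the last buffer_len parameters through the state.
        grad = dc_dtheta + dc_dx * sum(sens)

        # Partial derivatives of the closed-loop step x_next = (a - b*theta)*x + w.
        dx_dx = a - b * theta
        dx_dtheta = -b * x
        # Propagate sensitivities one step and keep only the newest entries.
        sens = ([dx_dtheta] + [dx_dx * s for s in sens])[:buffer_len]

        x = dx_dx * x + w
        # Projected gradient step keeps theta inside the assumed interval.
        theta = min(hi, max(lo, theta - eta * grad))
    return thetas, costs


# Example: the dynamics coefficient changes halfway through the run.
thetas, costs = gaps_style_sketch([0.9] * 100 + [1.2] * 100)
```

With these assumed values, |a_t - b * theta| is at most 0.8 for every theta in the interval, so the influence of older parameters on the current state decays geometrically. That decay is what makes truncating the sensitivity buffer a reasonable approximation in this toy setting.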

Author Information

Yiheng Lin (California Institute of Technology)
James A. Preiss (Caltech)
Emile Anand (California Institute of Technology)
Emile Anand

Hello there! My name is Emile, and I am a research fellow at CMU for the academic year 2023-2024, hosted by Guannan Qu. My research interests lie in theoretical computer science and theoretical machine learning. Recently, I've been thinking about densely networked multiagent reinforcement learning, Markov decision processes, and online balanced descent. I'm also interested in problems related to fast fine-grained algorithms, Fourier analysis, and pseudorandomness. Feel free to connect with me. I welcome any comments or questions you might have about my research, and am always eager to discuss and share ideas! Previously, I did my undergrad at Caltech, where I worked with the Theory of Computing Group and the Rigorous Systems Research Group. I was fortunate to be mentored by Professors Chris Umans and Adam Wierman, and advised by Katie Bouman.

Yingying Li (California Institute of Technology)
Yisong Yue (Caltech)
Adam Wierman (Caltech)

