Timezone: »
We explore reinforcement learning methods for finding the optimal policy in the nearly linear-quadratic control systems. In particular, we consider a dynamic system composed of the summation of a linear and a nonlinear components, which is governed by a policy with the same structure. Assuming that the nonlinear part consists of kernels with small Lipschitz coefficients, we characterize the optimization landscape of the cost function. While the resulting landscape is generally nonconvex, we show local strong convexity and smoothness of the cost function around the global optimizer. In addition, we design a policy gradient algorithm with a carefully chosen initialization and prove that the algorithm is guaranteed to converge to the globally optimal policy with a linear rate.
Author Information
Yinbin Han (University of Southern California)
Meisam Razaviyayn (University of Southern California)
Renyuan Xu (University of Southern California)
More from the Same Authors
-
2021 : Private Federated Learning Without a Trusted Server: Optimal Algorithms for Convex Losses »
Andrew Lowy · Meisam Razaviyayn -
2022 : Private Stochastic Optimization With Large Worst-Case Lipschitz Parameter: Optimal Rates for (Non-Smooth) Convex Losses & Extension to Non-Convex Losses »
Andrew Lowy · Meisam Razaviyayn -
2022 : A Stochastic Optimization Framework for Fair Risk Minimization »
Andrew Lowy · Sina Baharlouei · Rakesh Pavan · Meisam Razaviyayn · Ahmad Beirami -
2022 : Improving Adversarial Robustness via Joint Classification and Multiple Explicit Detection Classes »
Sina Baharlouei · Fatemeh Sheikholeslami · Meisam Razaviyayn · J. Zico Kolter -
2022 : Poster Session 2 »
Jinwuk Seok · Bo Liu · Ryotaro Mitsuboshi · David Martinez-Rubio · Weiqiang Zheng · Ilgee Hong · Chen Fan · Kazusato Oko · Bo Tang · Miao Cheng · Aaron Defazio · Tim G. J. Rudner · Gabriele Farina · Vishwak Srinivasan · Ruichen Jiang · Peng Wang · Jane Lee · Nathan Wycoff · Nikhil Ghosh · Yinbin Han · David Mueller · Liu Yang · Amrutha Varshini Ramesh · Siqi Zhang · Kaifeng Lyu · David Yunis · Kumar Kshitij Patel · Fangshuo Liao · Dmitrii Avdiukhin · Xiang Li · Sattar Vakili · Jiaxin Shi -
2022 : Stochastic Differentially Private and Fair Learning »
Andrew Lowy · Devansh Gupta · Meisam Razaviyayn -
2020 Poster: Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems »
Songtao Lu · Meisam Razaviyayn · Bo Yang · Kejun Huang · Mingyi Hong -
2020 Spotlight: Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems »
Songtao Lu · Meisam Razaviyayn · Bo Yang · Kejun Huang · Mingyi Hong -
2019 Poster: Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods »
Maher Nouiehed · Maziar Sanjabi · Tianjian Huang · Jason Lee · Meisam Razaviyayn -
2018 Poster: On the Convergence and Robustness of Training GANs with Regularized Optimal Transport »
Maziar Sanjabi · Jimmy Ba · Meisam Razaviyayn · Jason Lee -
2017 Poster: On Optimal Generalizability in Parametric Learning »
Ahmad Beirami · Meisam Razaviyayn · Shahin Shahrampour · Vahid Tarokh