NeurIPS Poster Global Convergence of Direct Policy Search for State-Feedback $\mathcal{H}_\infty$ Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential

Xingang Guo · Bin Hu

Hall J (level 1) #1034

Keywords: [ policy gradient ] [ Reinforcement Learning ] [ robust control ] [ nonsmooth optimization ]

Abstract: Direct policy search has been widely applied in modern reinforcement learning and continuous control. However, the theoretical properties of direct policy search on nonsmooth robust control synthesis have not been fully understood. The optimal $\mathcal{H}_\infty$ control framework aims at designing a policy to minimize the closed-loop $\mathcal{H}_\infty$ norm, and is arguably the most fundamental robust control paradigm. In this work, we show that direct policy search is guaranteed to find the global solution of the robust $\mathcal{H}_\infty$ state-feedback control design problem. Notice that policy search for optimal $\mathcal{H}_\infty$ control leads to a constrained nonconvex nonsmooth optimization problem, where the nonconvex feasible set consists of all the policies stabilizing the closed-loop dynamics. We show that for this nonsmooth optimization problem, all Clarke stationary points are global minima. Next, we identify the coerciveness of the closed-loop $\mathcal{H}_\infty$ objective function, and prove that all the sublevel sets of the resultant policy search problem are compact. Based on these properties, we show that Goldstein's subgradient method and its implementable variants are guaranteed to stay in the nonconvex feasible set and eventually find the globally optimal solution of the $\mathcal{H}_\infty$ state-feedback synthesis problem. Our work builds a new connection between nonconvex nonsmooth optimization theory and robust control, leading to an interesting global convergence result for direct policy search on optimal $\mathcal{H}_\infty$ synthesis.
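
To make the objective and the feasible set in the abstract concrete, here is a minimal sketch of how the closed-loop $\mathcal{H}_\infty$ cost of a state-feedback gain could be evaluated numerically. Everything specific in it is an assumption for illustration and is not taken from the abstract: a discrete-time system $x_{t+1} = A x_t + B u_t + w_t$ with policy $u_t = -K x_t$, weights $Q \succ 0$ and $R \succeq 0$, the hypothetical function name `hinf_cost`, and a plain frequency grid that only yields a lower estimate of the true norm. Returning infinity for non-stabilizing gains mirrors the constraint that policies must stabilize the closed loop.

```python
import numpy as np

def hinf_cost(K, A, B, Q, R, n_grid=2000):
    """Grid estimate of the closed-loop H-infinity norm for u = -K x.

    Assumed setup (illustrative, not from the abstract): discrete-time
    dynamics x_{t+1} = A x_t + B u_t + w_t, with Q positive definite.
    """
    n = A.shape[0]
    A_cl = A - B @ K
    # Feasible set: gains that make the closed loop Schur stable.
    if np.max(np.abs(np.linalg.eigvals(A_cl))) >= 1.0:
        return np.inf
    # Factor C^T C = Q + K^T R K so that the output is z = C x.
    C = np.linalg.cholesky(Q + K.T @ R @ K).T
    worst = 0.0
    for omega in np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False):
        # Frequency response from disturbance w to output z.
        G = C @ np.linalg.inv(np.exp(1j * omega) * np.eye(n) - A_cl)
        # Largest singular value at this frequency.
        worst = max(worst, np.linalg.norm(G, 2))
    return worst

# Scalar example with made-up numbers.
A = np.array([[0.5]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
print(hinf_cost(np.array([[0.2]]), A, B, Q, R))  # stabilizing gain, finite cost
print(hinf_cost(np.array([[2.0]]), A, B, Q, R))  # destabilizing gain, infinite cost
```

The maximum over frequencies is what makes this cost nonsmooth in $K$: the worst-case frequency can switch as the gain changes. The sketch only evaluates the cost; it does not implement Goldstein's subgradient method.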
