Poster

Robust Reinforcement Learning with General Utility

Ziyi Chen · Yan Wen · Zhengmian Hu · Heng Huang

West Ballroom A-D #6409
Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Reinforcement learning (RL) with general utility is a powerful decision-making framework that covers standard RL with cumulative cost, exploration problems, and learning from demonstrations. However, existing works on RL with general utility do not consider robustness under environmental perturbation, which is important for adapting an RL system to a real-world environment that differs from the training environment. To train a robust policy, we propose a robust RL framework with general utility. For popular convex utility functions, which yield a nonconvex-nonconcave minimax optimization problem, we design a two-phase stochastic policy gradient based algorithm and establish its sample complexity for gradient convergence. Furthermore, for convex utilities on a widely used polyhedral ambiguity set, we design an algorithm and obtain its convergence rate to a globally optimal solution. Finally, we also design algorithms with provable gradient convergence for concave utilities and for utilities that satisfy the weak Minty variational inequality.
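
The abstract gives no pseudocode, so the following is only a minimal sketch of the kind of minimax problem it describes, not the paper's two-phase algorithm. It builds a toy tabular MDP in JAX, takes the entropy of the discounted occupancy measure as an illustrative general utility, and runs plain simultaneous gradient descent-ascent: the policy ascends the utility while an adversarial perturbation of the transition kernel descends it within a penalty-regularized neighborhood of the nominal kernel. All names here (S, A, GAMMA, RHO, ETA, the entropy utility, the softmax-parameterized ambiguity set) are assumptions chosen for the example.

```python
import jax
import jax.numpy as jnp

# Toy problem sizes (illustrative, not from the paper).
S, A, GAMMA = 4, 2, 0.9
key = jax.random.PRNGKey(0)
logits_P0 = jax.random.normal(key, (S, A, S))     # nominal kernel logits
mu0 = jnp.ones(S) / S                             # initial state distribution

def occupancy(theta, xi):
    """Discounted state-action occupancy measure lambda^{pi, P}."""
    pi = jax.nn.softmax(theta, axis=-1)           # policy pi(a|s)
    P = jax.nn.softmax(logits_P0 + xi, axis=-1)   # perturbed kernel P(s'|s,a)
    P_pi = jnp.einsum('sa,sat->st', pi, P)        # state-to-state chain under pi
    # d = (1 - gamma) * (I - gamma * P_pi^T)^{-1} mu0, via a linear solve.
    d = jnp.linalg.solve(jnp.eye(S) - GAMMA * P_pi.T, (1.0 - GAMMA) * mu0)
    return d[:, None] * pi                        # lambda(s, a) = d(s) * pi(a|s)

def objective(theta, xi):
    """An illustrative general utility: entropy of the occupancy measure
    (concave in lambda, but nonconvex-nonconcave in (theta, xi))."""
    lam = occupancy(theta, xi)
    return -jnp.sum(lam * jnp.log(lam + 1e-12))

RHO, ETA = 1.0, 0.1                               # penalty weight, step size
# The adversary minimizes the utility plus a penalty that keeps the
# perturbed kernel near the nominal one (a soft ambiguity set).
adv_loss = lambda theta, xi: objective(theta, xi) + RHO * jnp.sum(xi ** 2)
grad_theta = jax.grad(objective, argnums=0)
grad_xi = jax.grad(adv_loss, argnums=1)

theta = jnp.zeros((S, A))                         # policy logits
xi = jnp.zeros((S, A, S))                         # kernel perturbation logits
for _ in range(200):
    theta = theta + ETA * grad_theta(theta, xi)   # ascent: policy maximizes f
    xi = xi - ETA * grad_xi(theta, xi)            # descent: adversary minimizes f

print("robust utility estimate:", float(objective(theta, xi)))
```

Simultaneous descent-ascent of this kind can cycle or stall on nonconvex-nonconcave problems; the two-phase scheme and the weak Minty analysis described in the abstract target exactly such failure modes.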
