Poster in Workshop: Safe and Robust Control of Uncertain Systems

Safe Online Exploration with Nonlinear Constraints

Eleanor Quint ⋅ Garrett Wirka ⋅ Stephen Scott

Abstract

Safe exploration is critical to using reinforcement learning in complex, hazardous, real-world environments for which offline data are not available. We propose a nonlinear safety layer that, unlike prior work, places no restrictions on the policy or environment and requires no offline training. We demonstrate that a nonlinear model has higher prediction accuracy than a comparable linear model, and that in Safety Gym environments a linear safety layer fails to learn a non-conservative policy, whereas the nonlinear safety layer succeeds.
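To make the idea of a safety layer concrete, the following is a minimal sketch of the general pattern: a constraint model predicts the cost of a proposed action, and the action is adjusted only when that prediction exceeds a threshold. It is not the authors' method. The cost function `predicted_cost`, the hazard at position 1.0, the threshold, and the gradient-based correction are all assumptions chosen for illustration; in practice the constraint model would be learned online rather than written by hand.

```python
import math


def predicted_cost(state, action):
    """Hypothetical nonlinear constraint model (stand-in for a learned one).

    The cost grows exponentially as the next position approaches a hazard
    assumed to sit at x = 1.0.
    """
    next_x = state + action
    return math.exp(4.0 * (next_x - 1.0))


def numerical_grad(f, state, action, eps=1e-5):
    """Central finite-difference gradient of f with respect to the action."""
    return (f(state, action + eps) - f(state, action - eps)) / (2.0 * eps)


def safe_action(state, action, threshold=0.5, step=0.01, max_iters=200):
    """Return the proposed action, corrected only if it is predicted unsafe.

    While the predicted cost exceeds the threshold, the action is moved a
    small fixed step in the direction that lowers the predicted cost.
    """
    a = action
    for _ in range(max_iters):
        if predicted_cost(state, a) <= threshold:
            break
        g = numerical_grad(predicted_cost, state, a)
        if g == 0.0:
            break  # no descent direction available; give up on correcting
        a -= step * (g / abs(g))
    return a


if __name__ == "__main__":
    # A proposal that is already predicted safe passes through unchanged.
    print(safe_action(0.0, 0.3))
    # A proposal predicted unsafe is pulled back toward the safe region.
    print(safe_action(0.5, 0.6))
```

Because the correction follows the gradient of an arbitrary differentiable cost model, the same loop works whether the model is linear or nonlinear. A linear model admits a closed-form correction instead, which is cheaper but only as good as the linear approximation of the constraint.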
