

Poster in Workshop: System-2 Reasoning at Scale

Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents

Quentin Delfosse ⋅ Sebastian Sztwiertnia ⋅ Mark Rothermel ⋅ Wolfgang Stammer ⋅ Kristian Kersting

Abstract
