Poster Session in Workshop: NeurIPS 2025 Workshop on Embodied and Safe-Assured Robotic Systems
Sun, Nov 30, 2025 • 3:50 PM – 3:55 PM PST

Poster 2 The Horcrux: Mechanistically Interpretable Task Decomposition for Reward Hacking Detection

