

Poster

When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback

Leon Lang · Davis Foote · Stuart J Russell · Anca Dragan · Erik Jenner · Scott Emmons
2024 Poster

