Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking

Paria Rashidinejad · Yuandong Tian

Abstract