What Makes a Reward Model a Good Teacher? An Optimization Perspective

Noam Razin ⋅ Zixuan Wang ⋅ Hubert Strauss ⋅ Stanley Wei ⋅ Jason Lee ⋅ Sanjeev Arora

Abstract
