

Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization

Noam Razin ⋅ Sadhika Malladi ⋅ Adithya Bhaskar ⋅ Danqi Chen ⋅ Sanjeev Arora ⋅ Boris Hanin

Abstract
