

Poster

Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling

Nguyen Phuc ⋅ Ngoc-Hieu Nguyen ⋅ Duy M. H. Nguyen ⋅ Anji Liu ⋅ An Mai ⋅ Thanh Binh Nguyen ⋅ Daniel Sonntag ⋅ Khoa D Doan
2025 Poster


