Self-Play Preference Optimization for Language Model Alignment

Yue Wu ⋅ Zhiqing Sun ⋅ Huizhuo Yuan ⋅ Kaixuan Ji ⋅ Yiming Yang ⋅ Quanquan Gu
