Poster Thu, Dec 4, 2025 • 11:00 AM – 2:00 PM PST

Offline RL by Reward-Weighted Fine-Tuning for Conversation Optimization

Subhojyoti Mukherjee ⋅ Viet Lai ⋅ Raghavendra Addanki ⋅ Ryan Rossi ⋅ Seunghyun Yoon ⋅ Trung Bui ⋅ Anup B. Rao ⋅ Jayakumar Subramanian ⋅ Branislav Kveton

Abstract

