Poster Thu, Dec 4, 2025 • 11:00 AM – 2:00 PM PST

Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning

Wenlin Zhang ⋅ Xiangyang Li ⋅ Kuicai Dong ⋅ Yichao Wang ⋅ Pengyue Jia ⋅ Xiaopeng Li ⋅ Yingyi Zhang ⋅ Derong Xu ⋅ Zhaocheng Du ⋅ Huifeng Guo ⋅ Ruiming Tang ⋅ Xiangyu Zhao

Abstract
