Spotlight Poster

DAPO: Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage-Based Policy Optimization

Jiacai Liu ⋅ Chaojie Wang ⋅ Chris Liu ⋅ Liang Zeng ⋅ Rui Yan ⋅ Yiwen Sun ⋅ Yang Liu
2025 Spotlight Poster
