One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling

Yiyuan Li ⋅ Zhen Huang ⋅ Yanan Wu ⋅ Weixun Wang ⋅ Xuefeng Li ⋅ Yijia Luo ⋅ Pengfei Liu ⋅ Wenbo Su ⋅ Bo Zheng

Abstract
