Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts

Maxime Heuillet ⋅ Yufei CUI ⋅ Boxing Chen ⋅ Audrey Durand ⋅ Prasanna Parthasarathi

Abstract
