PAG: Multi-Turn Reinforced LLM Self-Correction with Policy as Generative Verifier

Yuhua Jiang · Yuwen Xiong · Yufeng Yuan · Chao Xin · Wenyuan Xu · Yu Yue · Qianchuan Zhao · Lin Yan

Abstract
