

Poster Wed, Dec 3, 2025 • 11:00 AM – 2:00 PM PST

SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning

Zhongwei Wan ⋅ Zhihao Dou ⋅ Che Liu ⋅ Yu Zhang ⋅ Dongfei Cui ⋅ Qinjian Zhao ⋅ Hui Shen ⋅ Jing Xiong ⋅ Yi Xin ⋅ Yifan Jiang ⋅ Chaofan Tao ⋅ Yangfan He ⋅ Mi Zhang ⋅ Shen Yan

Abstract
