

Poster

Enhancing the Outcome Reward-based RL Training of MLLMs with Self-Consistency Sampling

Jiahao Wang ⋅ Weiye Xu ⋅ Aijun Yang ⋅ Wengang Zhou ⋅ Lewei Lu ⋅ Houqiang Li ⋅ Xiaohua Wang ⋅ Jinguo Zhu
2025 Poster

Abstract
