Poster

Online Iterative Reinforcement Learning from Human Feedback with General Preference Model

Chenlu Ye · Wei Xiong · Yuheng Zhang · Hanze Dong · Nan Jiang · Tong Zhang
2024 Poster
