Search All 2024 Events
 

35 Results

Workshop
Cream: Consistency Regularized Self-Rewarding Language Models
Zhaoyang Wang · Weilei He · Zhiyuan Liang · Xuchao Zhang · Chetan Bansal · Ying Wei · Weitong Zhang · Huaxiu Yao
Workshop
Generative Verifiers: Reward Modeling as Next-Token Prediction
Lunjun Zhang · Arian Hosseini · Hritik Bansal · Mehran Kazemi · Aviral Kumar · Rishabh Agarwal
Workshop
Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack
Leo McKee-Reid · Joe Needham · Maria Martinez · Christoph Sträter · Mikita Balesni
Poster
Fri 11:00 Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
Rafael Rafailov · Yaswanth Chittepu · Ryan Park · Harshit Sushil Sikchi · Joey Hejna · Brad Knox · Chelsea Finn · Scott Niekum
Workshop
Beyond the Binary: Capturing Diverse Preferences With Reward Regularization
Vishakh Padmakumar · Chuanyang Jin · Hannah Rose Kirk · He He
Poster
Fri 16:30 Rule Based Rewards for Language Model Safety
Tong Mu · Alec Helyar · Johannes Heidecke · Joshua Achiam · Andrea Vallone · Ian Kivlichan · Molly Lin · Alex Beutel · John Schulman · Lilian Weng
Workshop
Enhancing Multi-Agent Multi-Modal Collaboration with Fine-Grained Reward Modeling
Qian Yang · Weixiang Yan · Aishwarya Agrawal
Poster
Thu 11:00 ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization
Luca Eyring · Shyamgopal Karthik · Karsten Roth · Alexey Dosovitskiy · Zeynep Akata
Workshop
Optimizing Reward Models with Proximal Policy Exploration in Preference-Based Reinforcement Learning
Yiwen Zhu · Jinyi Liu · Yifu Yuan · Wenya Wei · Zhenxing Ge · Qianyi Fu · Zhou Fang · Yujing Hu · Bo An
Poster
Thu 16:30 HelpSteer 2: Open-source dataset for training top-performing reward models
Zhilin Wang · Yi Dong · Olivier Delalleau · Jiaqi Zeng · Gerald Shen · Daniel Egert · Jimmy Zhang · Makesh Narsimhan Sreedhar · Oleksii Kuchaiev