35 Results
Workshop
|
Cream: Consistency Regularized Self-Rewarding Language Models Zhaoyang Wang · Weilei He · Zhiyuan Liang · Xuchao Zhang · Chetan Bansal · Ying Wei · Weitong Zhang · Huaxiu Yao |
||
Workshop
|
Generative Verifiers: Reward Modeling as Next-Token Prediction Lunjun Zhang · Arian Hosseini · Hritik Bansal · Mehran Kazemi · Aviral Kumar · Rishabh Agarwal |
||
Workshop
|
Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack Leo McKee-Reid · Joe Needham · Maria Martinez · Christoph Sträter · Mikita Balesni |
||
Poster
|
Fri 11:00 |
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms Rafael Rafailov · Yaswanth Chittepu · Ryan Park · Harshit Sushil Sikchi · Joey Hejna · Brad Knox · Chelsea Finn · Scott Niekum |
|
Workshop
|
Beyond the Binary: Capturing Diverse Preferences With Reward Regularization Vishakh Padmakumar · Chuanyang Jin · Hannah Rose Kirk · He He |
||
Poster
|
Fri 16:30 |
Rule Based Rewards for Language Model Safety Tong Mu · Alec Helyar · Johannes Heidecke · Joshua Achiam · Andrea Vallone · Ian Kivlichan · Molly Lin · Alex Beutel · John Schulman · Lilian Weng |
|
Workshop
|
Enhancing Multi-Agent Multi-Modal Collaboration with Fine-Grained Reward Modeling Qian Yang · Weixiang Yan · Aishwarya Agrawal |
||
Poster
|
Thu 11:00 |
ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization Luca Eyring · Shyamgopal Karthik · Karsten Roth · Alexey Dosovitskiy · Zeynep Akata |
|
Workshop
|
Optimizing Reward Models with Proximal Policy Exploration in Preference-Based Reinforcement Learning Yiwen Zhu · Jinyi Liu · Yifu Yuan · Wenya Wei · Zhenxing Ge · Qianyi Fu · Zhou Fang · Yujing Hu · Bo An |
||
Poster
|
Thu 16:30 |
HelpSteer 2: Open-source dataset for training top-performing reward models Zhilin Wang · Yi Dong · Olivier Delalleau · Jiaqi Zeng · Gerald Shen · Daniel Egert · Jimmy Zhang · Makesh Narsimhan Sreedhar · Oleksii Kuchaiev |