Search All 2024 Events
 

35 Results

Page 2 of 3
Workshop
Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically
Kefan Dong · Arvind Mahankali · Tengyu Ma
Workshop
Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack
Leo McKee-Reid · Christoph Sträter · Maria Martinez · Joe Needham · Mikita Balesni
Poster
Thu 11:00 Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs
Rui Yang · Ruomeng Ding · Yong Lin · Huan Zhang · Tong Zhang
Workshop
Improving LLM Generation with Inverse and Forward Alignment: Reward Modeling, Prompting, Fine-Tuning, and Inference-Time Optimization
Hao Sun · Thomas Pouplin · Nicolás Astorga · Tennison Liu · Mihaela van der Schaar
Workshop
Mechanism Design for LLM Fine-tuning with Multiple Reward Models
Haoran Sun · Yurong Chen · Siwei Wang · Wei Chen · Xiaotie Deng
Poster
Thu 11:00 Learning Goal-Conditioned Representations for Language Reward Models
Vaskar Nath · Dylan Slack · Jeff Da · Yuntao Ma · Hugh Zhang · Spencer Whitehead · Sean Hendryx
Poster
Thu 16:30 Calibrated Self-Rewarding Vision Language Models
Yiyang Zhou · Zhiyuan Fan · Dongjie Cheng · Sihan Yang · Zhaorun Chen · Chenhang Cui · Xiyao Wang · Yun Li · Linjun Zhang · Huaxiu Yao
Workshop
Linear Probe Penalties Reduce LLM Sycophancy
Henry Papadatos · Rachel Freedman
Workshop
Critique-out-Loud Reward Models
Zachary Ankner · Mansheej Paul · Brandon Cui · Jonathan Chang · Prithviraj Ammanabrolu
Workshop
S2L-RM: Short-to-Long Reward Modeling
Changyu CHEN · Zichen Liu · Haonan Wang · Chao Du · Tianyu Pang · Qian Liu · Arunesh Sinha · Pradeep Varakantham · Min Lin
Poster
Fri 11:00 Group Robust Preference Optimization in Reward-free RLHF
Shyam Sundhar Ramesh · Yifan Hu · Iason Chaimalas · Viraj Mehta · Pier Giuseppe Sessa · Haitham Bou Ammar · Ilija Bogunovic