29 Results
Workshop
|
Reward Model Aggregation Zihao Wang · Chirag Nagpal · Alexander D'Amour · Victor Veitch · Sanmi Koyejo |
||
Workshop
|
Disclosing the Biases in Large Language Models via Reward Based Questioning Ezgi Korkmaz |
||
Workshop
|
Fri 14:00 |
Eureka: Human-Level Reward Design via Coding Large Language Models Jason Ma |
|
Workshop
|
Reward Model Ensembles Help Mitigate Overoptimization Thomas Coste · Usman Anwar · Robert Kirk · David Krueger |
||
Poster
|
Tue 8:45 |
MIMEx: Intrinsic Rewards from Masked Input Modeling Toru Lin · Allan Jabri |
|
Workshop
|
Confronting Reward Model Overoptimization with Constrained RLHF Ted Moskovitz · Aaditya Singh · DJ Strouse · Tuomas Sandholm · Russ Salakhutdinov · Anca Dragan · Stephen McAleer |
||
Workshop
|
An Emulator for Fine-tuning Large Language Models using Small Language Models Eric Mitchell · Rafael Rafailov · Archit Sharma · Chelsea Finn · Christopher D Manning |
||
Poster
|
Wed 8:45 |
Extracting Reward Functions from Diffusion Models Felipe Nuti · Tim Franzmeyer · João Henriques |
|
Workshop
|
FoMo rewards: Casting foundation models as generic reward functions Ekdeep S Lubana · Pim de Haan · Taco Cohen · Johann Brehmer |
||
Workshop
|
Fri 12:50 |
#28: Canonical Design for Language Agents using Natural Language Reward Models Silviu Pitis · Ziang Xiao · Alessandro Sordoni |