

Search All 2023 Events
 

29 Results

Page 1 of 3
Workshop
Reward Model Aggregation
Zihao Wang · Chirag Nagpal · Alexander D'Amour · Victor Veitch · Sanmi Koyejo
Workshop
Disclosing the Biases in Large Language Models via Reward Based Questioning
Ezgi Korkmaz
Workshop
Fri 14:00 Eureka: Human-Level Reward Design via Coding Large Language Models
Jason Ma
Workshop
Reward Model Ensembles Help Mitigate Overoptimization
Thomas Coste · Usman Anwar · Robert Kirk · David Krueger
Poster
Tue 8:45 MIMEx: Intrinsic Rewards from Masked Input Modeling
Toru Lin · Allan Jabri
Workshop
Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz · Aaditya Singh · DJ Strouse · Tuomas Sandholm · Russ Salakhutdinov · Anca Dragan · Stephen McAleer
Workshop
An Emulator for Fine-tuning Large Language Models using Small Language Models
Eric Mitchell · Rafael Rafailov · Archit Sharma · Chelsea Finn · Christopher D Manning
Poster
Wed 8:45 Extracting Reward Functions from Diffusion Models
Felipe Nuti · Tim Franzmeyer · João Henriques
Workshop
FoMo rewards: Casting foundation models as generic reward functions
Ekdeep S Lubana · Pim de Haan · Taco Cohen · Johann Brehmer
Workshop
Fri 12:50 #28: Canonical Design for Language Agents using Natural Language Reward Models
Silviu Pitis · Ziang Xiao · Alessandro Sordoni