Search All 2022 Events
 

10 Results

Workshop
Disclosing the Biases in Large Language Models via Reward Structured Questions
Ezgi Korkmaz
Workshop
Revealing the Bias in Large Language Models via Reward Structured Questions
Ezgi Korkmaz
Workshop
Revealing the Bias in Large Language Models via Reward Structured Questions
Ezgi Korkmaz
Poster
Thu 9:00 On Reinforcement Learning and Distribution Matching for Fine-Tuning Language Models with no Catastrophic Forgetting
Tomasz Korbak · Hady Elsahar · Germán Kruszewski · Marc Dymetman
Poster
Wed 14:00 Hedging as Reward Augmentation in Probabilistic Graphical Models
Debarun Bhattacharjya · Radu Marinescu
Poster
Tue 9:00 Fine-tuning language models to find agreement among humans with diverse preferences
Michiel Bakker · Martin Chadwick · Hannah Sheahan · Michael Tessler · Lucy Campbell-Gillingham · Jan Balaguer · Nat McAleese · Amelia Glaese · John Aslanides · Matt Botvinick · Christopher Summerfield
Poster
Thu 9:00 Defining and Characterizing Reward Gaming
Joar Skalse · Nikolaus Howe · Dmitrii Krasheninnikov · David Krueger
Poster
Wed 14:00 Non-Markovian Reward Modelling from Trajectory Labels via Interpretable Multiple Instance Learning
Joseph Early · Tom Bewley · Christine Evers · Sarvapali Ramchurn
Poster
Wed 9:00 Trade-off between Payoff and Model Rewards in Shapley-Fair Collaborative Machine Learning
Quoc Phong Nguyen · Bryan Kian Hsiang Low · Patrick Jaillet
Poster
Thu 9:00 Learning General World Models in a Handful of Reward-Free Deployments
Yingchen Xu · Jack Parker-Holder · Aldo Pacchiano · Philip Ball · Oleh Rybkin · S Roberts · Tim Rocktäschel · Edward Grefenstette