54 Results
Poster
|
Fri 11:00 |
Active preference learning for ordering items in- and out-of-sample Herman Bergström · Emil Carlsson · Devdatt Dubhashi · Fredrik Johansson |
|
Poster
|
Wed 11:00 |
Preference Learning of Latent Decision Utilities with a Human-like Model of Preferential Choice Sebastiaan De Peuter · Shibei Zhu · Yujia Guo · Andrew Howes · Samuel Kaski |
|
Workshop
|
PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences Daiwei Chen · Yi Chen · Aniket Rege · Ramya Korlakai Vinayak |
||
Workshop
|
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback Hamish Ivison · Yizhong Wang · Jiacheng Liu · Zeqiu Wu · Valentina Pyatkin · Nathan Lambert · Noah Smith · Yejin Choi · Hannaneh Hajishirzi |
||
Poster
|
Wed 16:30 |
Optimal Design for Human Preference Elicitation Subhojyoti Mukherjee · Anusha Lalitha · Kousha Kalantari · Aniket Anand Deshmukh · Ge Liu · Yifei Ma · Branislav Kveton |
|
Poster
|
Thu 16:30 |
Preference Learning Algorithms Do Not Learn Preference Rankings Angelica Chen · Sadhika Malladi · Lily Zhang · Xinyi Chen · Qiuyi (Richard) Zhang · Rajesh Ranganath · Kyunghyun Cho |
|
Workshop
|
Optimizing Reward Models with Proximal Policy Exploration in Preference-Based Reinforcement Learning Yiwen Zhu · Jinyi Liu · Yifu Yuan · Wenya Wei · Zhenxing Ge · qianyi fu · Zhou Fang · Yujing Hu · Bo An |
||
Poster
|
Thu 16:30 |
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback Hamish Ivison · Yizhong Wang · Jiacheng Liu · Zeqiu Wu · Valentina Pyatkin · Nathan Lambert · Noah Smith · Yejin Choi · Hanna Hajishirzi |
|
Poster
|
Thu 16:30 |
Aligning LLM Agents by Learning Latent Preference from User Edits Ge Gao · Alexey Taymanov · Eduardo Salinas · Paul Mineiro · Dipendra Misra |
|
Workshop
|
Sat 10:05 |
Estimating Effects of Tokens in Preference Learning Hsiao-Ru Pan · Maximilian Mordig · Bernhard Schölkopf |