Poster

Would I Lie To You? Inference Time Alignment of Language Models using Direct Preference Heads

Avelina Hadji-Kyriacou · Ognjen Arandjelovic
2024 Poster
