Oral

OpenAssistant Conversations - Democratizing Large Language Model Alignment

Andreas Köpf ⋅ Yannic Kilcher ⋅ Dimitri von Rütte ⋅ Sotiris Anagnostidis ⋅ Zhi Rui Tam ⋅ Keith Stevens ⋅ Abdullah Barhoum ⋅ Duc Nguyen ⋅ Oliver Stanley ⋅ Richárd Nagyfi ⋅ Shahul ES ⋅ Sameer Suri ⋅ David Glushkov ⋅ Arnav Dantuluri ⋅ Andrew Maguire ⋅ Christoph Schuhmann ⋅ Huu Nguyen ⋅ Alexander Mattick

2023 Oral

[ OpenReview]

Abstract

Aligning large language models (LLMs) with human preferences has proven to drastically improve usability and has driven rapid adoption as demonstrated by ChatGPT.Alignment techniques such as supervised fine-tuning (\textit{SFT}) and reinforcement learning from human feedback (\textit{RLHF}) greatly reduce the required skill and domain knowledge to effectively harness the capabilities of LLMs, increasing their accessibility and utility across various domains.However, state-of-the-art alignment techniques like \textit{RLHF} rely on high-quality human feedback data, which is expensive to create and often remains proprietary.In an effort to democratize research on large-scale alignment, we release OpenAssistant Conversations, a human-generated, human-annotated assistant-style conversation corpus consisting of 161,443 messages in 35 different languages, annotated with 461,292 quality ratings, resulting in over 10,000 complete and fully annotated conversation trees.The corpus is a product of a worldwide crowd-sourcing effort involving over 13,500 volunteers.Models trained on OpenAssistant Conversations show consistent improvements on standard benchmarks over respective base models.We release our code\footnote{\git} and data\footnote{\data} under a fully permissive licence.

Video

Chat is not available.