

Poster

Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment

Jiaxiang Li · Siliang Zeng · Hoi-To Wai · Chenliang Li · Alfredo Garcia · Mingyi Hong
2024 Poster

