Poster

The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models

Conglong Li ⋅ Minjia Zhang ⋅ Yuxiong He
2022 Poster