

Poster

The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models

Conglong Li · Minjia Zhang · Yuxiong He
2022 Poster

Abstract
