LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers

Dacheng Li · Rulin Shao · Anze Xie · Eric Xing · Joseph Gonzalez · Ion Stoica · Xuezhe Ma · Hao Zhang
