Poster in Workshop: ML for Systems

Learning to Shard: RL for Co-optimizing the Parallelism Degrees and Per-operator Sharding Dimensions in Distributed LLM Inference

Ruokai Yin ⋅ Sattwik Mishra ⋅ Xuan Zuo ⋅ Hokchhay Tann ⋅ Apala Guha

Abstract
