Learning to Shard: RL for Co-optimizing the Parallelism Degrees and Per-operator Sharding Dimensions in Distributed LLM Inference (Spotlight Paper)

2025 Spotlight in Workshop: ML for Systems
