A Theoretical Framework for Auxiliary-Loss-Free Load-Balancing of Sparse Mixture-of-Experts in Large-Scale AI Models

X.Y. Han · Yuan Zhong

Abstract
