Poster
AMP: Automatically Finding Model Parallel Strategies with Heterogeneity Awareness
Dacheng Li · Hongyi Wang · Eric Xing · Hao Zhang
Scaling up model sizes can lead to fundamentally new capabilities in many machine learning (ML) tasks. However, training big models requires strong distributed-systems expertise to carefully design model-parallel execution strategies that suit the model architecture and cluster setup. In this paper, we develop AMP, a framework that automatically derives such strategies. AMP identifies a valid space of model-parallel strategies and efficiently searches that space for high-performing strategies by leveraging a cost model designed to capture the heterogeneity of the model and cluster specifications. Unlike existing methods, AMP is specifically tailored to support complex models composed of uneven layers and cluster setups with heterogeneous accelerators and bandwidth. We evaluate AMP on popular models and cluster setups from public clouds and show that AMP returns parallel strategies that match the expert-tuned strategies on typical cluster setups. On heterogeneous clusters or models with heterogeneous architectures, AMP finds strategies with 1.54$\times$ and 1.77$\times$ higher throughput than state-of-the-art model-parallel systems, respectively.
Author Information
Dacheng Li (Carnegie Mellon University)
Hongyi Wang (Carnegie Mellon University)
Eric Xing (Petuum Inc.)
Hao Zhang (University of California, Berkeley)
More from the Same Authors
- 2021: Geometric Question Answering Towards Multimodal Numerical Reasoning
  Jiaqi Chen · Jianheng Tang · Jinghui Qin · Xiaodan Liang · Lingbo Liu · Eric Xing · Liang Lin
- 2022: The Impact of Symbolic Representations on In-context Learning for Few-shot Reasoning
  Hanlin Zhang · Yifan Zhang · Li Erran Li · Eric Xing
- 2022: Betty: An Automatic Differentiation Library for Multilevel Optimization
  Sang Keun Choe · Willie Neiswanger · Pengtao Xie · Eric Xing
- 2022 Spotlight: Masked Generative Adversarial Networks are Data-Efficient Generation Learners
  Jiaxing Huang · Kaiwen Cui · Dayan Guan · Aoran Xiao · Fangneng Zhan · Shijian Lu · Shengcai Liao · Eric Xing
- 2022 Poster: Rare Gems: Finding Lottery Tickets at Initialization
  Kartik Sreenivasan · Jy-yong Sohn · Liu Yang · Matthew Grinde · Alliot Nagle · Hongyi Wang · Eric Xing · Kangwook Lee · Dimitris Papailiopoulos
- 2022 Poster: Masked Generative Adversarial Networks are Data-Efficient Generation Learners
  Jiaxing Huang · Kaiwen Cui · Dayan Guan · Aoran Xiao · Fangneng Zhan · Shijian Lu · Shengcai Liao · Eric Xing
- 2019 Poster: Specific and Shared Causal Relation Modeling and Mechanism-Based Clustering
  Biwei Huang · Kun Zhang · Pengtao Xie · Mingming Gong · Eric Xing · Clark Glymour