
On Model Parallelization and Scheduling Strategies for Distributed Machine Learning
Seunghak Lee · Jin Kyu Kim · Xun Zheng · Qirong Ho · Garth Gibson · Eric Xing

Thu Dec 11 11:00 AM -- 03:00 PM (PST) @ Level 2, room 210D

Distributed machine learning has typically been approached from a data-parallel perspective, where big data are partitioned across multiple workers and an algorithm is executed concurrently over different data subsets under various synchronization schemes to ensure speed-up and/or correctness. A sibling problem that has received relatively less attention is how to ensure efficient and correct model-parallel execution of ML algorithms, where the parameters of an ML program are partitioned across different workers and undergo concurrent iterative updates. We argue that model and data parallelism impose rather different challenges for system design, algorithmic adjustment, and theoretical analysis. In this paper, we develop a system for model parallelism, STRADS, that provides a programming abstraction for scheduling parameter updates by discovering and leveraging changing structural properties of ML programs. STRADS enables a flexible tradeoff between scheduling efficiency and fidelity to intrinsic dependencies within the models, and improves the memory efficiency of distributed ML. We demonstrate the efficacy of model-parallel algorithms implemented on STRADS versus popular implementations for topic modeling, matrix factorization, and Lasso.
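To make the model-parallel idea concrete, the sketch below shows dependency-aware scheduling for Lasso coordinate descent, one of the applications mentioned in the abstract. This is an illustrative toy, not the actual STRADS system: the scheduler, the 0.5 correlation threshold, and the function names (`schedule`, `model_parallel_lasso`) are all assumptions introduced here. The key notion it illustrates is that coordinates within a scheduled block are nearly uncorrelated, so in a real system they could be updated by different workers concurrently without violating the model's intrinsic dependencies.

```python
import numpy as np

def soft_threshold(z, lam):
    # Proximal operator for the L1 penalty.
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def schedule(X, k, rng):
    # Toy scheduler (an assumption, not the STRADS algorithm): visit
    # coordinates in random order and greedily skip any coordinate that is
    # highly correlated with one already chosen, so that near-dependent
    # parameters are not updated in the same parallel block.
    chosen = []
    for j in rng.permutation(X.shape[1]):
        if all(abs(X[:, j] @ X[:, i]) < 0.5 for i in chosen):
            chosen.append(j)
        if len(chosen) == k:
            break
    return chosen

def model_parallel_lasso(X, y, lam=0.1, iters=200, block=4, seed=0):
    # Coordinate descent for: min_beta 0.5*||y - X beta||^2 + lam*||beta||_1,
    # with columns normalized so each coordinate update is a closed-form prox step.
    X = X / np.linalg.norm(X, axis=0)
    beta = np.zeros(X.shape[1])
    r = y.copy()  # residual y - X @ beta, maintained incrementally
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        # Coordinates in one scheduled block interact only weakly, so in a
        # distributed setting each could be assigned to a different worker.
        for j in schedule(X, block, rng):
            old = beta[j]
            beta[j] = soft_threshold(old + X[:, j] @ r, lam)
            r -= X[:, j] * (beta[j] - old)
    return beta
```

Because each block contains only weakly correlated coordinates, concurrent updates within a block closely approximate sequential ones; this is the efficiency-versus-fidelity tradeoff the abstract refers to, here controlled by the correlation threshold and block size.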

Author Information

Seunghak Lee (Carnegie Mellon University)
Jin Kyu Kim (Carnegie Mellon University)
Xun Zheng (Carnegie Mellon University)
Qirong Ho (Petuum, Inc.)
Garth Gibson (Vector Institute and CMU)
Eric Xing (Petuum Inc. / Carnegie Mellon University)
