`

Timezone: »

 
Poster
Coded Sequential Matrix Multiplication For Straggler Mitigation
Nikhil Krishnan Muralee Krishnan · Seyederfan Hosseini · Ashish Khisti

Thu Dec 10 09:00 AM -- 11:00 AM (PST) @ Poster Session 5 #1449
In this work, we consider a sequence of $J$ matrix multiplication jobs which needs to be distributed by a master across multiple worker nodes. For $i\in \{1,2,\ldots,J\}$, job-$i$ begins in round-$i$ and has to be completed by round-$(i+T)$. Previous works consider only the special case of $T=0$ and focus on coding across workers. We propose here two schemes with $T>0$, which feature coding across workers as well as the dimension of time. Our first scheme is a modification of the polynomial coding scheme introduced by Yu et al. and places no assumptions on the straggler model. Exploitation of the temporal dimension helps the scheme handle a larger set of straggler patterns than the polynomial coding scheme, for a given computational load per worker per round. The second scheme assumes a particular straggler model to further improve performance (in terms of encoding/decoding complexity). We develop theoretical results establishing (i) optimality of our proposed schemes for a certain class of straggler patterns and (ii) improved performance for the case of i.i.d. stragglers. These are further validated by experiments, where we implement our schemes to train neural networks.

Author Information

Nikhil Krishnan Muralee Krishnan (University of Toronto)
Erfan Hosseini (University of Toronto)
Ashish Khisti (University of Toronto)

More from the Same Authors