Abstract
---
We consider distributed gradient descent in the presence of stragglers. Recent work on gradient coding and approximate gradient coding has shown how to add redundancy in distributed gradient descent to guarantee convergence even if some workers are slow or non-responsive. In this work we propose a new type of approximate gradient coding which we call Stochastic Gradient Coding (SGC). The idea of SGC is very simple: we distribute data points redundantly to workers according to a good combinatorial design. We prove that the convergence rate of SGC mirrors that of batched Stochastic Gradient Descent (SGD) for the $\ell_2$ loss function, and show how the convergence rate can improve with the redundancy. We show empirically that SGC requires a small amount of redundancy to handle a large number of stragglers and that it can outperform existing approximate gradient codes when the number of stragglers is large.
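The mechanism the abstract describes lends itself to a short simulation. Below is a minimal sketch of the SGC idea for least-squares ($\ell_2$) regression. Two assumptions stand in for details the abstract does not specify: data points are replicated to `d` random workers rather than via the paper's combinatorial design, and stragglers are i.i.d. Bernoulli. All names and parameters are illustrative, not from the paper.

```python
# Minimal sketch of Stochastic Gradient Coding (SGC) for least-squares
# regression. Assumptions (not from the paper): random d-fold replication
# in place of a combinatorial design, and i.i.d. Bernoulli stragglers.
import numpy as np

rng = np.random.default_rng(0)
n, dim, n_workers, d = 200, 5, 10, 3   # d = replication factor

X = rng.standard_normal((n, dim))
w_true = rng.standard_normal(dim)
y = X @ w_true + 0.1 * rng.standard_normal(n)

# Replicate each data point to d distinct, randomly chosen workers.
points_of = [[] for _ in range(n_workers)]
for i in range(n):
    for j in rng.choice(n_workers, size=d, replace=False):
        points_of[j].append(i)

def sgc_step(w, lr=0.05, p_straggle=0.3):
    """One iteration: the master simply ignores straggling workers."""
    alive = rng.random(n_workers) > p_straggle
    grad = np.zeros(dim)
    for j in np.flatnonzero(alive):
        for i in points_of[j]:
            # Gradient of (x_i @ w - y_i)^2, divided by the replication
            # factor d so each point contributes once in expectation
            # when all workers respond.
            grad += 2.0 * (X[i] @ w - y[i]) * X[i] / d
    return w - lr * grad / n

w = np.zeros(dim)
for _ in range(500):
    w = sgc_step(w)
print("distance to w_true:", np.linalg.norm(w - w_true))
```

In this toy model the aggregated gradient equals the full gradient scaled by $(1 - p)$ in expectation, so each round behaves like a rescaled batched SGD step; that SGD-like behavior under random stragglers is what the abstract's convergence claim formalizes.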
Year | DOI | Venue
---|---|---|
2019 | 10.1109/ITW44776.2019.8989328 | 2019 IEEE Information Theory Workshop (ITW)

Keywords | Field | DocType
---|---|---|
approximate gradient coding, convergence rate, stochastic gradient descent, flexible straggler mitigation, stochastic gradient coding, distributed gradient descent-based machine learning algorithm, parallelizable gradient descent | Convergence (routing), Data point, Gradient descent, Stochastic gradient descent, Computer science, Algorithm, Theoretical computer science, Coding (social sciences), Redundancy (engineering), Combinatorial design, Rate of convergence | Conference

ISSN | ISBN | Citations
---|---|---|
2475-420X | 978-1-5386-6901-3 | 1

PageRank | References | Authors
---|---|---|
0.47 | 5 | 3
Name | Order | Citations | PageRank |
---|---|---|---|
Rawad Bitar | 1 | 1 | 0.47 |
Mary Wootters | 2 | 172 | 25.99 |
Salim El Rouayheb | 3 | 13 | 3.43 |