GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism - Citegraph

Paper Info

Title
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

Abstract
Scaling up deep neural network capacity has been known as an effective approach to improving model quality for several different machine learning tasks. In many cases, increasing model capacity beyond the memory limit of a single accelerator has required developing special algorithms or infrastructure. These solutions are often architecture-specific and do not transfer to other tasks. To address the need for efficient and task-independent model parallelism, we introduce GPipe, a pipeline parallelism library that allows scaling any network that can be expressed as a sequence of layers. By pipelining different sub-sequences of layers on separate accelerators, GPipe provides the flexibility of scaling a variety of different networks to gigantic sizes efficiently. Moreover, GPipe utilizes a novel batch-splitting pipelining algorithm, resulting in almost linear speedup when a model is partitioned across multiple accelerators. We demonstrate the advantages of GPipe by training large-scale neural networks on two different tasks with distinct network architectures: (i) Image Classification: We train a 557-million-parameter AmoebaNet model and attain a top-1 accuracy of 84.4% on ImageNet-2012, (ii) Multilingual Neural Machine Translation: We train a single 6-billion-parameter, 128-layer Transformer model on a corpus spanning over 100 languages and achieve better quality than all bilingual models.
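
The abstract names a batch-splitting pipelining algorithm but gives no details. As a rough illustration only, the sketch below assumes a mini-batch is split into micro-batches that flow through the accelerator stages one clock tick apart, forward pass only. The function names and the tick-based model are hypothetical and are not taken from the GPipe library.

```python
def split_into_micro_batches(batch, num_micro_batches):
    """Split a mini-batch (a list of examples) into nearly equal micro-batches."""
    size, extra = divmod(len(batch), num_micro_batches)
    micro_batches, start = [], 0
    for i in range(num_micro_batches):
        # The first `extra` micro-batches take one additional example.
        end = start + size + (1 if i < extra else 0)
        micro_batches.append(batch[start:end])
        start = end
    return micro_batches


def forward_schedule(num_stages, num_micro_batches):
    """For each clock tick, list the (stage, micro_batch) pairs that run in parallel.

    Assumption: stage s can start micro-batch m at tick s + m, i.e. as soon as
    stage s - 1 has finished the same micro-batch.
    """
    ticks = []
    for t in range(num_stages + num_micro_batches - 1):
        active = [(s, t - s) for s in range(num_stages)
                  if 0 <= t - s < num_micro_batches]
        ticks.append(active)
    return ticks


def idle_fraction(num_stages, num_micro_batches):
    """Fraction of (stage, tick) slots left idle under the schedule above."""
    total_slots = num_stages * (num_stages + num_micro_batches - 1)
    busy_slots = num_stages * num_micro_batches
    return 1 - busy_slots / total_slots


if __name__ == "__main__":
    for tick, active in enumerate(forward_schedule(num_stages=2, num_micro_batches=3)):
        print(tick, active)
    print(idle_fraction(num_stages=4, num_micro_batches=8))
```

Under this simplified model the idle fraction works out to (K - 1) / (M + K - 1) for K stages and M micro-batches, so it shrinks as the mini-batch is split more finely. That is consistent with the near-linear speedup the abstract reports, though the model ignores the backward pass and communication costs.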

Year	Field	DocType	Citations	PageRank	References	Authors
2019	Computer science, Artificial intelligence, Artificial neural network, Machine learning	Conference	7	0.42	0	11

Authors (11 rows)

Name	Order	Citations	PageRank
Yanping Huang	1	210	9.80
Youlong Cheng	2	19	1.28
Ankur Bapna	3	36	8.45
Orhan Firat	4	281	29.13
Dehao Chen	5	17	1.57
Xu Chen	6	30	5.73
HyoukJoong Lee	7	414	17.71
Jiquan Ngiam	8	297	19.56
Quoc V. Le	9	8501	366.59
Yonghui Wu	10	1065	72.78
Zhifeng Chen	11	2747	106.75