GeePS: scalable deep learning on distributed GPUs with a GPU-specialized parameter server. - Citegraph

Paper Info

Title
GeePS: scalable deep learning on distributed GPUs with a GPU-specialized parameter server.

Abstract
Large-scale deep learning requires huge computational resources to train a multi-layer neural network. Recent systems propose using 100s to 1000s of machines to train networks with tens of layers and billions of connections. While the computation involved can be done more efficiently on GPUs than on more traditional CPU cores, training such networks on a single GPU is too slow and training on distributed GPUs can be inefficient, due to data movement overheads, GPU stalls, and limited GPU memory. This paper describes a new parameter server, called GeePS, that supports scalable deep learning across GPUs distributed among multiple machines, overcoming these obstacles. We show that GeePS enables a state-of-the-art single-node GPU implementation to scale well, such as to 13 times the number of training images processed per second on 16 machines (relative to the original optimized single-node code). Moreover, GeePS achieves a higher training throughput with just four GPU machines than a state-of-the-art CPU-only system achieves with 108 machines.

Year	DOI	Venue
2016	10.1145/2901318.2901323	EuroSys
Field	DocType	Citations
Virtual machine,CUDA,Computer science,Parallel computing,Caffe,Real-time computing,Artificial intelligence,Throughput,Deep learning,Artificial neural network,Multi-core processor,Scalability	Conference	75
PageRank	References	Authors
2.09	24	5

Authors (5 rows)

Name	Order	Citations	PageRank
Henggang Cui	1	307	11.66
Hao Zhang	2	276	13.13
Gregory R. Ganger	3	4560	383.16
Phillip B. Gibbons	4	6863	624.14
Eric P. Xing	5	7332	471.43