Title
The Hierarchical Memory Machine Model for GPUs
Abstract
The Discrete Memory Machine (DMM) and the Unified Memory Machine (UMM) are theoretical parallel computing models that capture the essence of the shared memory access and the global memory access of GPUs. The main contribution of this paper is to introduce the Hierarchical Memory Machine (HMM), which consists of multiple DMMs and a single UMM. The HMM is a more practical parallel computing model that reflects the architecture of current GPUs. We present several fundamental algorithms on the HMM. First, we show that the sum of n numbers can be computed in O(n/w + nl/p + l + log n) time units using p threads on the HMM with width w and latency l, and prove that this computing time is optimal. We also show that the direct convolution of m and n numbers, which produces m + n - 1 numbers, can be done in O(n/w + mn/dw + nl/p + l + log m) time units using p threads on the HMM with d DMMs, width w, and latency l. Finally, we prove that our implementation of the direct convolution is time optimal.
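For reference, the sketch below is a minimal, hypothetical CUDA kernel for the direct convolution named in the abstract: given a sequence a of m numbers and a sequence b of n numbers, it computes the m + n - 1 outputs c[i] = sum of a[j]*b[i-j]. It is not the paper's HMM algorithm and it omits the shared-memory (DMM) staging that the paper's O(n/w + mn/dw + nl/p + l + log m) analysis depends on; the identifier names are illustrative only.

// Hypothetical sketch (not the paper's HMM implementation): a plain CUDA
// kernel for the direct convolution of a[0..m-1] and b[0..n-1] into
// c[0..m+n-2]. One thread computes one output element; the shared-memory
// (DMM) staging analyzed in the paper is deliberately omitted.
__global__ void directConvolution(const float *a, int m,
                                  const float *b, int n,
                                  float *c)            // c has m + n - 1 elements
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;     // output index
    if (i >= m + n - 1) return;

    // c[i] = sum of a[j] * b[i - j] over all j with 0 <= j < m and 0 <= i - j < n
    int jLo = (i - n + 1 > 0) ? (i - n + 1) : 0;
    int jHi = (i < m - 1) ? i : (m - 1);
    float sum = 0.0f;
    for (int j = jLo; j <= jHi; ++j)
        sum += a[j] * b[i - j];
    c[i] = sum;
}

// Example launch: one thread per output element.
// int threads = 256;
// int blocks  = (m + n - 1 + threads - 1) / threads;
// directConvolution<<<blocks, threads>>>(d_a, m, d_b, n, d_c);

The total work of this naive kernel is Theta(mn), which loosely corresponds to the mn/dw term in the abstract's bound once that work is spread across d DMMs of width w.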
Year
2013
DOI
10.1109/IPDPSW.2013.17
Venue
IPDPS Workshops
Keywords
computational complexity, graphics processing units, shared memory systems, DMM, GPU, HMM, UMM, computing time, direct convolution, discrete memory machine, global memory access, graphics processing unit, hierarchical memory machine model, parallel computing models, shared memory access, unified memory machine, CUDA, convolution, memory machine models
Field
Binary logarithm, Shared memory, Convolution, Computer science, CUDA, Parallel computing, Thread (computing), Memory model, Hidden Markov model, Computational complexity theory
DocType
Conference
Citations
3
PageRank
0.44
References
0
Authors
1
Name
Koji Nakano
Order
1
Citations / PageRank
1165118.13