A case for core-assisted bottleneck acceleration in GPUs: enabling flexible data compression with assist warps - Citegraph

Paper Info

Title
A case for core-assisted bottleneck acceleration in GPUs: enabling flexible data compression with assist warps

Abstract
Modern Graphics Processing Units (GPUs) are well provisioned to support the concurrent execution of thousands of threads. Unfortunately, different bottlenecks during execution and heterogeneous application requirements create imbalances in utilization of resources in the cores. For example, when a GPU is bottlenecked by the available off-chip memory bandwidth, its computational resources are often overwhelmingly idle, waiting for data from memory to arrive. This paper introduces the Core-Assisted Bottleneck Acceleration (CABA) framework that employs idle on-chip resources to alleviate different bottlenecks in GPU execution. CABA provides flexible mechanisms to automatically generate "assist warps" that execute on GPU cores to perform specific tasks that can improve GPU performance and efficiency. CABA enables the use of idle computational units and pipelines to alleviate the memory bandwidth bottleneck, e.g., by using assist warps to perform data compression to transfer less data from memory. Conversely, the same framework can be employed to handle cases where the GPU is bottlenecked by the available computational units, in which case the memory pipelines are idle and can be used by CABA to speed up computation, e.g., by performing memoization using assist warps. We provide a comprehensive design and evaluation of CABA to perform effective and flexible data compression in the GPU memory hierarchy to alleviate the memory bandwidth bottleneck. Our extensive evaluations show that CABA, when used to implement data compression, provides an average performance improvement of 41.7% (as high as 2.6X) across a variety of memory-bandwidth-sensitive GPGPU applications.

Year	DOI	Venue
2015	10.1145/2749469.2750399	International Symposium on Computer Architecture
Keywords	Field	DocType
core-assisted bottleneck acceleration,GPUs,flexible data compression,graphics processing units,CPUs,concurrent execution,CABA,idle on-chip resources,GPU execution,memory bandwidth bottleneck,memory pipelines,GPU memory hierarchy,GPGPU applications	Bottleneck,Memory hierarchy,Memory bandwidth,Computer science,Instruction set,Parallel computing,Real-time computing,General-purpose computing on graphics processing units,Memoization,Data compression,Speedup	Conference
Volume	Issue	ISSN
43	3S	0163-5964
Citations	PageRank	References
43	0.75	66
Authors
9

Authors (9 rows)

Name	Order	Citations	PageRank
Nandita Vijaykumar	1	146	7.55
Gennady Pekhimenko	2	706	28.75
Adwait Jog	3	568	23.32
Abhishek Bhowmick	4	43	0.75
Rachata Ausavarungnirun	5	780	29.88
Chita R. Das	6	1046	45.21
Mahmut T. Kandemir	7	7371	568.54
Todd C. Mowry	8	3021	253.75
Onur Mutlu	9	9446	357.40
