Title
Autotuning GPU-Accelerated QAP Solvers for Power and Performance
Abstract
The Quadratic Assignment Problem (QAP) is an important combinatorial optimization problem with applications in many areas including operations research and chip design. QAP is known to be NP-hard and requires heuristic approaches for most real data sets. Although many algorithms have been proposed for solving QAPs, few have attempted to exploit the fine-grain data parallelism available on accelerator architectures and none have considered aspects of energy efficiency. In this paper, we present GPU-accelerated implementations of two QAP algorithms. We parameterize each algorithmic variant along several dimensions including thread granularity, occupancy, shared memory usage and register pressure. We develop an autotuning framework that explores this space of execution parameters and yields CUDA binaries that are efficient in terms of both performance and power consumption. We demonstrate the effectiveness of our tuning framework on an Nvidia Tesla K20c. On a series of experiments on the well-known QAPLIB data sets, our autotuned solutions run at least an order-of-magnitude faster than baseline implementations. The experimental results also provide key insight into the performance characteristics of accelerated QAP solvers. In particular, the results reveal that both algorithmic choice and the shape of the input data sets are key factors in finding an efficient implementation.
Year
DOI
Venue
2015
10.1109/HPCC-CSS-ICESS.2015.121
HPCC/CSS/ICESS
Keywords
Field
DocType
autotuning, energy efficiency, parallel performance
Kernel (linear algebra),Heuristic,Shared memory,CUDA,Quadratic assignment problem,Computer science,Instruction set,Parallel computing,Real-time computing,Data parallelism,Sparse matrix,Distributed computing
Conference
ISSN
Citations 
PageRank 
2576-3504
0
0.34
References 
Authors
15
3
Name
Order
Citations
PageRank
Abhilash Chaparala151.12
Clara Novoa2715.34
Apan Qasem312716.34