Abstract | ||
---|---|---|
The Hungarian algorithm solves the linear assignment problem in polynomial time. A GPU/CUDA implementation of this algorithm is proposed. GPUs are massive parallel machines. In this implementation, the alternating path search phase of the algorithm is distributed by several blocks in a way to minimize global device synchronization. This phase is very important and has a big contribution to the execution time. Other advanced features also implemented are: parallel graph traversal; the parallel detection of multiple alternating paths in a single iteration; a simplified and fast matrix compression that stores the zeros of the slack matrix, resulting in very fast graph traversal; highly optimized reductions for the initial slack matrix calculation and update. This results in a fast implementation for moderate size problems. |
Year | DOI | Venue |
---|---|---|
2019 | 10.1016/j.jpdc.2019.03.014 | Journal of Parallel and Distributed Computing |
Keywords | Field | DocType |
Hungarian algorithm,Linear assignment problem,GPU | Hungarian algorithm,Synchronization,Graph traversal,Computer science,Matrix (mathematics),CUDA,Parallel computing,Assignment problem,Execution time,Time complexity | Journal |
Volume | ISSN | Citations |
130 | 0743-7315 | 1 |
PageRank | References | Authors |
0.36 | 0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Paulo Alexandre Crisóstomo Lopes | 1 | 7 | 4.38 |
Satyendra Singh Yadav | 2 | 1 | 0.36 |
Aleksandar Ilic | 3 | 283 | 35.40 |
Sarat Kumar Patra | 4 | 27 | 10.08 |