Title
MPIActor - A Multicore-Architecture Adaptive and Thread-Based MPI Program Accelerator
Abstract
Adapting foundational MPI software to multicore systems is a key issue in developing efficient parallel software in the high-performance communication domain. To address this issue, we propose a novel technique called MPI Accelerator, or MPIActor for short, a transparent middleware that enhances conventional MPI libraries. The main idea is to optimize MPI routines for multicore systems by adopting a threaded MPI mechanism and multicore-architecture-aware collectives. With MPIActor, on the one hand, all MPI processes on each node are mapped to threads within a single process, which greatly reduces the overhead of intra-node point-to-point communication. On the other hand, collective routines are implemented through the cooperation of separate intra-node and inter-node collective subroutines, and the intra-node subroutines can be further optimized with multicore-architecture-aware collective algorithms. Based on this idea, a framework comprising an MPI_Reduce routine and a set of point-to-point communication routines has been implemented and evaluated on a 256-core Nehalem platform. Compared with MVAPICH2, the experimental results show that MPIActor significantly improves performance on both the OSU_LATENCY benchmark for point-to-point communication and the IMB Reduce benchmark for reduction collectives. In particular, performance on the OSU_LATENCY benchmark improves by up to 321%.
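The two-level structure of the collectives described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not MPIActor's implementation: it uses plain Python lists in place of MPI ranks, the names `intra_node_reduce`, `inter_node_reduce`, `hierarchical_reduce`, and the parameter `ranks_per_node` are hypothetical, and the reduction operator is assumed to be associative.

```python
from functools import reduce
from operator import add


def intra_node_reduce(values, op):
    """Combine the contributions of the ranks on one node.

    In a threaded MPI design this step would run over shared memory;
    here it is simulated with a plain fold over a list.
    """
    return reduce(op, values)


def inter_node_reduce(partials, op):
    """Combine one partial result per node.

    In a real system this step would go over the network between
    node leaders; here it is again a plain fold.
    """
    return reduce(op, partials)


def hierarchical_reduce(rank_values, ranks_per_node, op=add):
    """Two-level reduce: first inside each node, then across nodes.

    rank_values    -- one contribution per rank, in rank order
    ranks_per_node -- hypothetical parameter: ranks mapped to each node
    op             -- associative binary operator (assumption)
    """
    if ranks_per_node <= 0:
        raise ValueError("ranks_per_node must be positive")
    if not rank_values:
        raise ValueError("rank_values must not be empty")
    # Group consecutive ranks into nodes.
    nodes = [
        rank_values[i:i + ranks_per_node]
        for i in range(0, len(rank_values), ranks_per_node)
    ]
    # Step 1: each node reduces its own ranks locally.
    partials = [intra_node_reduce(node, op) for node in nodes]
    # Step 2: the per-node partial results are reduced across nodes.
    return inter_node_reduce(partials, op)


if __name__ == "__main__":
    # 16 simulated ranks spread over 2 nodes of 8 ranks each.
    print(hierarchical_reduce(list(range(1, 17)), ranks_per_node=8))
```

With this split, only one value per node takes part in the inter-node step, which is the property that lets the intra-node step be tuned separately for the multicore architecture.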
Year: 2010
DOI: 10.1109/HPCC.2010.89
Venue: HPCC
Keywords: mpi routine, thread-based mpi program accelerator, aware collective, multicore-architecture adaptive, mpi accelerator, conventional mpi library, threaded mpi mechanism, multicore architecture, improving mpi foundational software, multicore system, aware collective algorithm, osu_latency benchmark, algorithm design and analysis, instruction sets, point to point, middleware, multicore processing, multi threading, subroutines, benchmark testing, message passing, communication
Field: Middleware, Multithreading, Computer architecture, Computer science, Instruction set, Parallel computing, Thread (computing), Software, Multi-core processor, Message passing, Benchmark (computing), Distributed computing
DocType: Conference
Citations: 1
PageRank: 0.36
References: 13
Authors: 3
Name            Order   Citations   PageRank
Zhiqiang Liu    1       12          4.68
Kaijun Ren      2       132         23.89
Junqiang Song   3       185         26.86