Title
Tuning SPMD Applications in Order to Increase Performability
Abstract
When running parallel applications on HPC clusters, the primary objectives are usually near-linear speedup, efficient resource utilization, scalability, and successful completion. Application executions therefore face a multiobjective problem: improving performance while providing fault tolerance (FT) support. This combination is defined as performability. The performance of Single Program Multiple Data (SPMD) applications written with a message-passing library (MPI) may be seriously affected when a message logging approach is applied, because they are tightly coupled and communicate heavily. We propose a novel method for SPMD applications that obtains the maximum speedup under a defined efficiency threshold, taking into account the impact of a fault tolerance strategy when executing on multicore clusters. The method consists of four phases: characterization, tile distribution, mapping, and scheduling. Its aim is to manage the overhead added by FT techniques, which seriously affects MPI application performance; specifically, it manages the overhead of message logging by overlapping it with computation. The main objective is to determine the approximate number of computational cores and the ideal number of tiles that give a suitable balance between speedup, efficiency, and dependability. The results show that we can find the maximum speedup under a defined efficiency using an FT strategy with an error of 5.4% in the worst case. With our method, we can also determine the ideal problem size for a given number of computational cores (weak scalability) using FT with an error of around 5.8%. The results also show that our message logging approach can be tuned to introduce a constant overhead percentage as the problem size scales.
Year: 2013
DOI: 10.1109/TrustCom.2013.141
Venue: Trust, Security and Privacy in Computing and Communications
Keywords: linear speedup, novel method, ideal number, approximate number, ft strategy, applications execution, tuning spmd applications, maximum speedup, efficiency threshold, ft technique, computational core, fault tolerance, spmd, message passing, parallel processing, software fault tolerance, resource utilization, resource allocation, multicore processing, efficiency, scalability
Field: SPMD, Dependability, Computer science, Parallel computing, Software fault tolerance, Fault tolerance, Multi-core processor, Message passing, Scalability, Speedup
DocType: Conference
ISSN: 2324-898X
Citations: 0
PageRank: 0.34
References: 11
Authors: 4
Name, Order, Citations, PageRank
Hugo Meyer, 1, 14, 3.08
Ronal Muresano, 2, 11, 3.56
Dolores Rexachs, 3, 195, 43.20
Emilio Luque, 4, 1097, 176.18