Title
Performance Analysis of CFD Application Cart3D Using MPInside and Performance Monitor Unit Data on Nehalem and Westmere Based Supercomputers
Abstract
Cart3D is a computational fluid dynamics (CFD) application aimed at conceptual and preliminary design of aerospace vehicles with complex geometries. It is widely used by design engineers at NASA, the Department of Defense, and aerospace companies in the USA. We present a detailed performance analysis of Cart3D using two tools: SGI MPInside, and op_scope, which collects hardware counter data from the Intel Performance Monitoring Unit (PMU), on supercomputers based on the Nehalem micro-architecture. Using these tools, we performed dynamic profiling of Cart3D (compute time, communication time, and I/O time), along with dynamic profiling of MPI functions (MPI_Sendrecv, MPI_Bcast, MPI_Isend, MPI_Irecv, MPI_Allreduce, MPI_Barrier, etc.) with respect to the message size of each rank and the time consumed by each function. MPI communication is further analyzed by studying the performance of the MPI functions used in this application as a function of message size and number of cores. Using these tools, we also studied the efficiency of the processor to measure its effective utilization, the efficiency of the floating-point units, the percentage of vectorization, and the percentage of data coming from L2 cache, L3 cache, and main memory. This study was performed on two computing sub-systems based on quad-core Nehalem-EP and hex-core Westmere-EP processors that are part of Pleiades, an SGI Altix ICE system at NASA Ames Research Center.
Year
2011
DOI
10.1109/HPCC.2011.50
Venue
HPCC
Keywords
mpi_sendrecv,l3 cache,sgi mpinside,usa,aerospace vehicles,nasa ames research center,sgi altix ice,cfd application,cfd application cart3d,nehalem micro-architecture,communication time,nehalem based supercomputers,mpi_irecv,hardware counter data,computational fluid dynamics,message size,aerospace,main memory,mpi function,cart3d,mpi_barrier,l2 cache,quad-core nehalem-ep processors,mpi_bcast,computational fluid dynamics application,aircraft,simultaneous multi threading (smt),hyper-threading,benchmarking,westmere based supercomputers,performance evaluation,multiprocessing systems,mainframes,processor efficiency,mpi functions,intel performance monitoring unit,floating-point units,pleiades,complex geometries,aerospace companies,intel nehalem micro-architecture,i/o time,mpi_allreduce,performance monitor unit data,mpi communication,department of defense,op_scope,defence industry,pmu,message passing,hex-core westmere-ep processors,computer architecture,tools sgi mpinside,performance analysis,parallel machines,mpi_isend,compute time communication time io time,dynamic profiling,design engineers,hyper threading
Field
Aerospace,CPU cache,Computer science,Profiling (computer programming),Parallel computing,Vectorization (mathematics),Real-time computing,Monitor unit,Hyper-threading,Computational fluid dynamics,Message passing,Distributed computing
DocType
Conference
ISBN
978-0-7695-4538-7
Citations
0
PageRank
0.34
References
0
Authors
5
Name
Order
Citations
PageRank
Subhash Saini156147.57
Piyush Mehrotra2619139.52
Kenichi Taylor3181.50
Michael Aftosmis400.68
Rupak Biswas5922109.66