Title | ||
---|---|---|
The impact of hyper-threading on processor resource utilization in production applications |
Abstract | ||
---|---|---|
Intel provides Hyper-Threading (HT) in processors based on its Pentium and Nehalem micro-architectures, such as the Westmere-EP. HT enables two threads to execute on each core in order to hide latencies related to data access. These two threads can execute simultaneously, filling unused stages in the functional unit pipelines. To better understand HT-related issues, we collect Performance Monitoring Unit (PMU) data (instructions retired; unhalted core cycles; L2 and L3 cache hits and misses; vector and scalar floating-point operations; etc.). We then use the PMU data to calculate a new efficiency metric that quantifies processor resource utilization and compare that utilization between single-threading (ST) and HT modes. We also study the performance gain in terms of unhalted core cycles, how efficiently the code uses the processor's vector units, and the impact of HT mode on shared resources such as the L2 and L3 caches. Results from four full-scale, production-quality scientific applications from computational fluid dynamics (CFD) used by NASA scientists indicate that HT generally improves processor resource utilization efficiency, but that this improvement does not necessarily translate into overall application performance gain. |
Year | DOI | Venue |
---|---|---|
2011 | 10.1109/HiPC.2011.6152743 | HiPC |
Keywords | Field | DocType |
pmu data,l3 cache,processor resource utilization,overall application performance gain,code efficiency,unhalted core cycle,l3 cache hit,ht mode,production application,processor resource utilization efficiency,data access,computational fluid dynamics,hardware,resource allocation,instruction sets,bandwidth,benchmarking,functional unit,radiation detector,resource utilization,floating point,multi threading,information retrieval,radiation detectors | Multithreading,CPU cache,Instruction set,Computer science,Hyper-threading,Distributed computing,Parallel computing,Thread (computing),Resource allocation,Pentium,Data access,Operating system,Embedded system | Conference |
Citations | PageRank | References |
13 | 0.70 | 4 |
Authors | ||
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Subhash Saini | 1 | 561 | 47.57 |
Haoqiang Jin | 2 | 284 | 31.77 |
Robert Hood | 3 | 87 | 10.42 |
David Barker | 4 | 13 | 0.70 |
Piyush Mehrotra | 5 | 619 | 139.52 |
Rupak Biswas | 6 | 922 | 109.66 |