Title
Using Per-Loop CPU Clock Modulation for Energy Efficiency in OpenMP Applications
Abstract
As the HPC community moves into the exascale computing era, application energy is becoming as large a concern as performance. Optimizing for energy will be essential in the effort to overcome the limited power envelope. Existing efforts to optimize energy in applications employ Dynamic Frequency and Voltage Scaling (DVFS) to maximize energy savings in less compute-intensive regions or non-critical execution paths. However, we found that DVFS has high power-state switching overhead, preventing its use when a more fine-grained technique is necessary. In this work, we take advantage of the low transition overhead of CPU clock modulation and apply it to fine-grained OpenMP parallel loops. The energy behavior of OpenMP parallel regions is first characterized by changing the effective frequency using clock modulation. The clock modulation setting that achieves the best energy efficiency is then determined for each region. Finally, different CPU clock modulation settings are applied to the different loops within the same application. The resulting multi-frequency execution of OpenMP applications achieves a better energy-delay trade-off than any single frequency setting. In the best-case scenario, the multi-frequency approach achieved 8.6% energy savings with less than a 1.5% increase in execution time. Concurrency throttling (i.e., reducing the number of hardware threads used by an application) saves more energy and can be combined with CPU clock modulation. Using both, we see energy savings of 21% and an energy-delay product (EDP) improvement of 16%.
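The per-region selection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes energy-delay product (energy multiplied by execution time) as the selection metric, and the loop names, duty-cycle levels, and measurements are all hypothetical.

```python
# Hypothetical measurements per loop:
#   {loop_name: {duty_cycle_percent: (energy_joules, time_seconds)}}
# All values are invented for illustration; they are not results from the paper.
measurements = {
    "loop_a": {100: (50.0, 1.00), 87.5: (46.0, 1.02), 75: (45.0, 1.20)},
    "loop_b": {100: (80.0, 2.00), 87.5: (79.0, 2.25), 75: (78.0, 2.60)},
}


def edp(energy, time):
    """Energy-delay product: energy multiplied by execution time."""
    return energy * time


def best_setting_per_loop(data):
    """Return, for each loop, the duty-cycle setting with the lowest EDP."""
    return {
        loop: min(settings, key=lambda s: edp(*settings[s]))
        for loop, settings in data.items()
    }


if __name__ == "__main__":
    for loop, setting in best_setting_per_loop(measurements).items():
        print(f"{loop}: run at {setting}% duty cycle")
```

In this made-up data, the two loops end up with different settings, which is the situation where a per-loop choice can beat any single application-wide setting.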
Year: 2015
DOI: 10.1109/ICPP.2015.72
Venue: 2015 44th International Conference on Parallel Processing
Keywords: per-loop CPU clock modulation, HPC community, application energy, energy optimization, power envelope, dynamic frequency-and-voltage scaling, DVFS, energy saving, compute-intensive regions, noncritical execution paths, power state switching overhead, transition overhead, fine-grained OpenMP parallel loops, energy behavior, OpenMP parallel regions, energy efficiency, multifrequency execution, energy-delay, energy savings, execution time, concurrency throttling, hardware threads, energy-delay product improvement, EDP improvement
Field: Exascale computing, Efficient energy use, Concurrency, Computer science, Voltage, Parallel computing, Thread (computing), Modulation, CPU multiplier, Clock rate, Embedded system, Distributed computing
DocType: Conference
ISSN: 0190-3918
Citations: 8
PageRank: 0.53
References: 22
Authors: 4
Name | Order | Citations | PageRank
Wei Wang 0082 | 1 | 8 | 0.53
Allan Porterfield | 2 | 547 | 82.18
John Cavazos | 3 | 584 | 26.93
Sridutt Bhalachandra | 4 | 66 | 4.82