Title |
---|
ProfileMe: hardware support for instruction-level profiling on out-of-order processors |
Abstract |
---|
Profile data is valuable for identifying performance bottlenecks and guiding optimizations. Periodic sampling of a processor's performance monitoring hardware is an effective, unobtrusive way to obtain detailed profiles. Unfortunately, existing hardware simply counts events, such as cache misses and branch mispredictions, and cannot accurately attribute these events to instructions, especially on out-of-order machines. We propose an alternative approach, called ProfileMe, that samples instructions. As a sampled instruction moves through the processor pipeline, a detailed record of all interesting events and pipeline stage latencies is collected. ProfileMe also supports paired sampling, which captures information about the interactions between concurrent instructions, revealing information about useful concurrency and the utilization of various pipeline stages while an instruction is in flight. We describe an inexpensive hardware implementation of ProfileMe, outline a variety of software techniques to extract useful profile information from the hardware, and explain several ways in which this information can provide valuable feedback for programmers and optimizers. |
Year | DOI | Venue |
---|---|---|
1997 | 10.1109/MICRO.1997.645821 | MICRO |
Keywords | Field | DocType |
useful profile information,hardware support,instruction-level profiling,inexpensive hardware implementation,revealing information,various pipeline stage,detailed profile,concurrent instruction,processor pipeline,out-of-order processor,performance monitoring hardware,existing hardware,pipeline stage latency,concurrent computing,feedback,out of order,computer architecture,hardware,instruction sets,sampling methods,microprogramming,data mining,pipelines | Microcode,Concurrency,Profiling (computer programming),Cache,Instruction set,Computer science,Real-time computing,Software,Computer hardware,Out-of-order execution,Computer architecture,Parallel computing,Concurrent computing | Conference |
ISSN | ISBN | Citations |
1072-4451 | 0-8186-7977-8 | 140 |
PageRank | References | Authors |
19.96 | 10 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jeffrey Dean | 1 | 140 | 19.96 |
James E. Hicks | 2 | 220 | 30.51 |
Carl Waldspurger | 3 | 2003 | 336.72 |
William E. Weihl | 4 | 2614 | 903.11 |
George Chrysos | 5 | 234 | 23.78 |