T2S-Tensor: Productively Generating High-Performance Spatial Hardware for Dense Tensor Computations | 7 | 0.47 | 2019 |
A Synergetic Approach to Throughput Computing on x86-Based Multicore Desktops | 9 | 0.54 | 2011 |
Concurrent Collections | 26 | 1.20 | 2010 |
Why Intel is designing multi-core processors | 5 | 0.47 | 2006 |
Pin: building customized program analysis tools with dynamic instrumentation | 1782 | 71.80 | 2005 |
Ispike: A Post-link Optimizer for the Intel® Itanium® Architecture | 23 | 5.56 | 2004 |
Profile-guided post-link stride prefetching | 18 | 1.51 | 2002 |
Code layout optimizations for transaction processing workloads | 34 | 1.57 | 2001 |
Design and Analysis of Profile-Based Optimization in Compaq's Compilation Tools for Alpha | 11 | 0.68 | 2000 |
Optimizing Alpha executables on Windows NT with Spike | 29 | 7.23 | 1998 |
Hot cold optimization of large Windows/NT applications | 25 | 11.28 | 1996 |
Avoidance and suppression of compensation code in a trace scheduling compiler | 15 | 1.57 | 1994 |
The Multiflow trace scheduling compiler | 156 | 31.89 | 1993 |
Carrier arrays: an idiom-preserving extension to APL | 0 | 0.34 | 1981 |