CSMT: Simultaneous Multithreading for Clustered VLIW Processors | 1 | 0.36 | 2010 |
Thread Merging Schemes for Multithreaded Clustered VLIW Processors | 0 | 0.34 | 2009 |
Hybrid multithreading for VLIW processors | 2 | 0.40 | 2009 |
Power-Efficient VLIW Design Using Clustering And Widening | 1 | 0.44 | 2008 |
Merge Logic for Clustered Multithreaded VLIW Processors | 4 | 0.41 | 2007 |
Cluster-Level Simultaneous Multithreading For VLIW Processors | 4 | 0.42 | 2007 |
Silicon Compaction/Defragmentation for Partial Runtime Reconfiguration | 0 | 0.34 | 2007 |
Near-optimal padding for removing conflict misses | 8 | 0.63 | 2005 |
An accurate cost model for guiding data locality transformations | 3 | 0.41 | 2005 |
Software and hardware techniques to optimize register file utilization in VLIW architectures | 13 | 0.68 | 2004 |
Future ILP processors | 0 | 0.34 | 2004 |
Out-of-Order Commit Processors | 85 | 3.28 | 2004 |
A case for resource-conscious out-of-order processors: towards kilo-instruction in-flight processors | 4 | 0.60 | 2004 |
A fast and accurate framework to analyze and optimize cache memory behavior | 23 | 1.22 | 2004 |
Register Constrained Modulo Scheduling | 11 | 0.63 | 2004 |
High-performance and low-power VLIW cores for numerical computations | 1 | 0.36 | 2004 |
A case for resource-conscious out-of-order processors | 10 | 0.70 | 2003 |
Kilo-instruction Processors | 10 | 0.65 | 2003 |
Optimizing program locality through CMEs and GAs | 15 | 1.00 | 2003 |
Hierarchical Clustered Register File Organization for VLIW Processors | 10 | 0.70 | 2003 |
Near-optimal loop tiling by means of cache miss equations and genetic algorithms | 16 | 0.99 | 2002 |
Reduced code size modulo scheduling in the absence of hardware support | 6 | 0.58 | 2002 |
A comparative study of modulo scheduling techniques | 26 | 1.03 | 2002 |
Cost-Conscious Strategies to Increase Performance of Numerical Programs on Aggressive VLIW Architectures | 2 | 0.39 | 2001 |
MIRS: modulo scheduling with integrated register spilling | 9 | 0.55 | 2001 |
Lifetime-sensitive modulo scheduling in a production environment | 28 | 1.78 | 2001 |
Modulo scheduling with integrated register spilling for clustered VLIW architectures | 33 | 1.13 | 2001 |
Two-level hierarchical register file organization for VLIW processors | 47 | 3.58 | 2000 |
A Fast and Accurate Approach to Analyze Cache Memory Behavior (Research Note) | 8 | 0.73 | 2000 |
Improved spill code generation for software pipelined loops | 13 | 0.81 | 2000 |
Optimizing cache miss equations polyhedra | 3 | 0.53 | 2000 |
Impact on Performance of Fused Multiply-Add Units in Aggressive VLIW Architectures | 1 | 0.35 | 1999 |
Distributed Modulo Scheduling | 26 | 2.51 | 1999 |
Widening resources: a cost-effective technique for aggressive ILP architectures | 7 | 0.49 | 1998 |
Quantitative Evaluation of Register Pressure on Software Pipelined Loops | 12 | 0.87 | 1998 |
Resource widening versus replication: limits and performance-cost trade-off | 4 | 0.54 | 1998 |
Modulo Scheduling with Reduced Register Pressure | 9 | 0.89 | 1998 |
Partitioned schedules for clustered VLIW architectures | 10 | 0.93 | 1998 |
Allocating Lifetimes to Queues in Software Pipelined Architectures | 4 | 0.68 | 1997 |
Increasing memory bandwidth with wide buses: compiler, hardware and performance trade-offs | 5 | 0.48 | 1997 |
Heuristics for register-constrained software pipelining | 25 | 1.15 | 1996 |
Swing Modulo Scheduling: A Lifetime-Sensitive Approach | 66 | 3.11 | 1996 |
Using Sacks to Organize Registers in VLIW Machines | 9 | 1.30 | 1994 |