Name: STEVEN K. REINHARDT
Affiliation: Advanced Micro Devices, Inc., Bellevue, Wash
Papers: 57
Collaborators: 148
Citations: 3885
PageRank: 226.69
Referrers: 7192
Referees: 1794
References: 807
| Title | Citations | PageRank | Year |
|---|---|---|---|
| AWB-GCN: A Graph Convolutional Network Accelerator with Runtime Workload Rebalancing | 9 | 0.50 | 2020 |
| Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point | 0 | 0.34 | 2020 |
| Inside Project Brainwave's Cloud-Scale, Real-Time AI Processor | 0 | 0.34 | 2019 |
| ComP-net: command processor networking for efficient intra-kernel communications on GPUs | 0 | 0.34 | 2018 |
| A Configurable Cloud-Scale DNN Processor for Real-Time AI | 31 | 1.25 | 2018 |
| Generic System Calls for GPUs | 3 | 0.37 | 2018 |
| Design and Analysis of an APU for Exascale Computing | 11 | 0.56 | 2017 |
| If You Build It, Will They Come? | 3 | 0.36 | 2017 |
| GPU triggered networking for intra-kernel communications | 3 | 0.40 | 2017 |
| Programming GPGPU Graph Applications with Linear Algebra Building Blocks | 5 | 0.53 | 2017 |
| Gravel: fine-grain GPU-initiated network messages | 1 | 0.36 | 2017 |
| Extended task queuing: active messages for heterogeneous systems | 3 | 0.39 | 2016 |
| Graph Coloring on the GPU and Some Techniques to Improve Load Imbalance | 1 | 0.40 | 2015 |
| Achieving Exascale Capabilities through Heterogeneous Computing | 14 | 0.61 | 2015 |
| BelRed: Constructing GPGPU graph applications with software building blocks | 2 | 0.43 | 2014 |
| Heterogeneous-race-free memory models | 39 | 1.13 | 2014 |
| QuickRelease: A throughput-oriented approach to release consistency on GPUs | 28 | 0.94 | 2014 |
| Fine-grain task aggregation and coordination on GPUs | 14 | 0.63 | 2014 |
| Pannotia: Understanding irregular GPGPU graph applications | 64 | 1.68 | 2013 |
| Heterogeneous system coherence for integrated CPU-GPU systems | 47 | 1.40 | 2013 |
| The gem5 simulator | 853 | 24.92 | 2011 |
| Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication | 25 | 0.94 | 2011 |
| Server Designs for Warehouse-Computing Environments | 4 | 0.82 | 2009 |
| End-to-end performance forecasting: finding bottlenecks before they happen | 6 | 0.53 | 2009 |
| Full-System Critical Path Analysis | 7 | 0.56 | 2008 |
| Analysis of hardware prefetching across virtual page boundaries | 4 | 0.46 | 2007 |
| The M5 Simulator: Modeling Networked Systems | 458 | 26.59 | 2006 |
| Communist, utilitarian, and capitalist cache policies on CMPs: caches as a shared resource | 107 | 6.26 | 2006 |
| A unified compressed memory hierarchy | 50 | 2.47 | 2005 |
| Exploring the cache design space for large scale CMPs | 42 | 1.71 | 2005 |
| Performance Analysis of System Overheads in TCP/IP Workloads | 19 | 1.19 | 2005 |
| The soft error problem: an architectural perspective | 202 | 8.52 | 2005 |
| How to Fake 1000 Registers | 17 | 0.71 | 2005 |
| Reducing the soft-error rate of a high-performance microprocessor | 9 | 0.67 | 2004 |
| Cache Scrubbing in Microprocessors: Myth or Necessity? | 67 | 4.92 | 2004 |
| A compressed memory hierarchy using an indirect index cache | 17 | 0.87 | 2004 |
| The Impact of Resource Partitioning on SMT Processors | 67 | 3.00 | 2003 |
| Guided Region Prefetching: A Cooperative Hardware/Software Approach | 0 | 0.34 | 2003 |
| Measuring Architectural Vulnerability Factors | 17 | 1.09 | 2003 |
| A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor | 460 | 24.57 | 2003 |
| A scalable instruction queue design using dependence chains | 64 | 2.44 | 2002 |
| Detailed design and evaluation of redundant multi-threading alternatives | 230 | 17.21 | 2002 |
| Designing a modern memory hierarchy with hardware prefetching | 22 | 1.42 | 2001 |
| Reducing DRAM Latencies with an Integrated Memory Hierarchy Design | 103 | 8.24 | 2001 |
| Integrating hardware and software concepts in a microprocessor-based system design lab | 1 | 0.73 | 2000 |
| A fully associative software-managed cache design | 79 | 8.10 | 2000 |
| Transient fault detection via simultaneous multithreading | 317 | 18.15 | 2000 |
| Hardware Support for Flexible Distributed Shared Memory | 1 | 0.41 | 1998 |
| Retrospective: tempest and typhoon: user-level shared memory | 1 | 0.35 | 1998 |
| Decoupled hardware support for distributed shared memory | 41 | 2.53 | 1996 |