Steven K. Reinhardt

Author Info

Name	Affiliation	Papers
STEVEN K. REINHARDT	Advanced Micro Devices, Inc., Bellevue, Wash	57
Collaborators	Citations	PageRank
148	3885	226.69
Referers	Referees	References
7192	1794	807

Publications (57 rows)

Title	Citations	PageRank	Year
AWB-GCN: A Graph Convolutional Network Accelerator with Runtime Workload Rebalancing	9	0.50	2020
Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point.	0	0.34	2020
Inside Project Brainwave's Cloud-Scale, Real-Time AI Processor.	0	0.34	2019
ComP-net: command processor networking for efficient intra-kernel communications on GPUs	0	0.34	2018
A Configurable Cloud-Scale DNN Processor for Real-Time AI.	31	1.25	2018
Generic System Calls for GPUs.	3	0.37	2018
Design and Analysis of an APU for Exascale Computing	11	0.56	2017
If You Build It, Will They Come?	3	0.36	2017
GPU triggered networking for intra-kernel communications	3	0.40	2017
Programming GPGPU Graph Applications with Linear Algebra Building Blocks.	5	0.53	2017
Gravel: fine-grain GPU-initiated network messages	1	0.36	2017
Extended task queuing: active messages for heterogeneous systems.	3	0.39	2016
Graph Coloring on the GPU and Some Techniques to Improve Load Imbalance	1	0.40	2015
Achieving Exascale Capabilities through Heterogeneous Computing	14	0.61	2015
BelRed: Constructing GPGPU graph applications with software building blocks	2	0.43	2014
Heterogeneous-race-free memory models	39	1.13	2014
QuickRelease: A throughput-oriented approach to release consistency on GPUs	28	0.94	2014
Fine-grain task aggregation and coordination on GPUs	14	0.63	2014
Pannotia: Understanding irregular GPGPU graph applications	64	1.68	2013
Heterogeneous system coherence for integrated CPU-GPU systems	47	1.40	2013
The gem5 simulator	853	24.92	2011
Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication	25	0.94	2011
Server Designs for Warehouse-Computing Environments	4	0.82	2009
End-to-end performance forecasting: finding bottlenecks before they happen	6	0.53	2009
Full-System Critical Path Analysis	7	0.56	2008
Analysis of hardware prefetching across virtual page boundaries	4	0.46	2007
The M5 Simulator: Modeling Networked Systems	458	26.59	2006
Communist, utilitarian, and capitalist cache policies on CMPs: caches as a shared resource	107	6.26	2006
A unified compressed memory hierarchy	50	2.47	2005
Exploring the cache design space for large scale CMPs	42	1.71	2005
Performance Analysis of System Overheads in TCP/IP Workloads	19	1.19	2005
The soft error problem: an architectural perspective	202	8.52	2005
How to Fake 1000 Registers	17	0.71	2005
Reducing the soft-error rate of a high-performance microprocessor	9	0.67	2004
Cache Scrubbing in Microprocessors: Myth or Necessity?	67	4.92	2004
A compressed memory hierarchy using an indirect index cache	17	0.87	2004
The Impact of Resource Partitioning on SMT Processors	67	3.00	2003
Guided Region Prefetching: A Cooperative Hardware/Software Approach.	0	0.34	2003
Measuring Architectural Vulnerability Factors	17	1.09	2003
A Systematic Methodology to Compute the Architectural Vulnerability Factors for a High-Performance Microprocessor	460	24.57	2003
A scalable instruction queue design using dependence chains	64	2.44	2002
Detailed design and evaluation of redundant multi-threading alternatives	230	17.21	2002
Designing a modern memory hierarchy with hardware prefetching	22	1.42	2001
Reducing DRAM Latencies with an Integrated Memory Hierarchy Design	103	8.24	2001
Integrating hardware and software concepts in a microprocessor-based system design lab	1	0.73	2000
A fully associative software-managed cache design	79	8.10	2000
Transient fault detection via simultaneous multithreading	317	18.15	2000
Hardware Support for Flexible Distributed Shared Memory	1	0.41	1998
Retrospective: tempest and typhoon: user-level shared memory	1	0.35	1998
Decoupled hardware support for distributed shared memory	41	2.53	1996
