Opportunistic computing in GPU architectures | 6 | 0.40 | 2019 |
In-Package Domain-Specific ASICs for Intel® Stratix® 10 FPGAs: A Case Study of Accelerating Deep Learning Using TensorTile ASIC (Abstract Only). | 0 | 0.34 | 2018 |
Exploration of Low Numeric Precision Deep Learning Inference Using Intel® FPGAs (Abstract Only). | 0 | 0.34 | 2018 |
WRPN & Apprentice: Methods for Training and Inference using Low-Precision Numerics. | 1 | 0.37 | 2018 |
A Customizable Matrix Multiplication Framework for the Intel HARPv2 Xeon+FPGA Platform: A Deep Learning Case Study. | 8 | 0.58 | 2018 |
In-Package Domain-Specific ASICs for Intel® Stratix® 10 FPGAs: A Case Study of Accelerating Deep Learning Using TensorTile ASIC | 0 | 0.34 | 2018 |
Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy. | 20 | 0.66 | 2017 |
WRPN: Wide Reduced-Precision Networks. | 20 | 0.73 | 2017 |
WRPN: Training and Inference using Wide Reduced-Precision Networks. | 2 | 0.37 | 2017 |
Low Precision RNNs: Quantizing RNNs Without Losing Accuracy. | 1 | 0.36 | 2017 |
High performance binary neural networks on the Xeon+FPGA™ platform | 11 | 0.87 | 2017 |
Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities. | 52 | 0.88 | 2016 |
Hardware Accelerator For Analytics Of Sparse Data | 1 | 0.48 | 2016 |
Accelerating Binarized Neural Networks: Comparison of FPGA, CPU, GPU, and ASIC | 11 | 0.67 | 2016 |
From high-level deep neural models to FPGAs. | 35 | 0.96 | 2016 |
Accelerating recurrent neural networks in analytics servers: Comparison of FPGA, CPU, GPU, and ASIC | 21 | 1.52 | 2016 |
A sparse matrix vector multiply accelerator for support vector machine | 10 | 0.54 | 2015 |
Tangle: Route-oriented dynamic voltage minimization for variation-afflicted, energy-efficient on-chip networks | 15 | 0.59 | 2014 |
Runnemede: An architecture for Ubiquitous High-Performance Computing | 38 | 1.17 | 2013 |
Orchestrated scheduling and prefetching for GPGPUs | 87 | 1.89 | 2013 |
A heterogeneous multiple network-on-chip design: An application-aware approach | 48 | 1.20 | 2013 |
Application-aware prefetch prioritization in on-chip networks | 9 | 0.49 | 2012 |
Cache revive: Architecting volatile STT-RAM caches for enhanced performance in CMPs | 120 | 3.38 | 2012 |
PEPON: performance-aware hierarchical power budgeting for NoC based multicores | 18 | 0.69 | 2012 |
A case for heterogeneous on-chip interconnects for CMPs | 60 | 1.79 | 2011 |
METE: meeting end-to-end QoS in multicores through system-wide resource management | 38 | 1.02 | 2011 |
Architecting on-chip interconnects for stacked 3D STT-RAM caches in CMPs | 48 | 1.77 | 2011 |
An energy-efficient heterogeneous CMP based on hybrid TFET-CMOS cores | 29 | 2.04 | 2011 |
Exploiting Heterogeneity For Energy Efficiency In Chip Multiprocessors | 25 | 1.39 | 2011 |
RAFT: A router architecture with frequency tuning for on-chip networks | 19 | 0.71 | 2011 |
ACCESS: Smart scheduling for asymmetric cache CMPs | 17 | 0.67 | 2011 |
Towards characterizing cloud backend workloads: insights from Google compute clusters | 86 | 3.85 | 2010 |
Coordinated power management of voltage islands in CMPs | 2 | 0.40 | 2010 |
CPM in CMPs: Coordinated Power Management in Chip-Multiprocessors | 31 | 1.16 | 2010 |
A case for dynamic frequency tuning in on-chip networks | 61 | 1.75 | 2009 |
A case for integrated processor-cache partitioning in chip multiprocessors | 14 | 0.66 | 2009 |
Design And Evaluation Of A Hierarchical On-Chip Interconnect For Next-Generation CMPs | 106 | 3.22 | 2009 |
Detection Of Arcing In Low Voltage Distribution Systems | 0 | 0.34 | 2008 |
Performance And Power Optimization Through Data Compression In Network-On-Chip Architectures | 44 | 1.74 | 2008 |
MIRA: A Multi-layered On-Chip Interconnect Router Architecture | 102 | 3.92 | 2008 |