Recurrent Neural Networks With Column-Wise Matrix–Vector Multiplication on FPGAs | 0 | 0.34 | 2022 |
FPGA-Based AI Smart NICs for Scalable Distributed AI Training Systems | 0 | 0.34 | 2022 |
Architecture and Application Co-Design for Beyond-FPGA Reconfigurable Acceleration Devices | 0 | 0.34 | 2022 |
HPVM2FPGA: Enabling True Hardware-Agnostic FPGA Programming | 0 | 0.34 | 2022 |
Cache Compression with Efficient in-SRAM Data Comparison | 0 | 0.34 | 2021 |
DO-GPU: Domain Optimizable Soft GPUs | 0 | 0.34 | 2021 |
FlexScore: Quantifying Flexibility | 0 | 0.34 | 2021 |
Stratix 10 NX Architecture and Applications | 0 | 0.34 | 2021 |
Enhancing High-Level Synthesis Using a Meta-Programming Approach | 0 | 0.34 | 2021 |
Compute-Capable Block RAMs for Efficient Deep Learning Acceleration on FPGAs | 1 | 0.35 | 2021 |
End-to-End FPGA-based Object Detection Using Pipelined CNN and Non-Maximum Suppression | 0 | 0.34 | 2021 |
Optimizing Reconfigurable Recurrent Neural Networks | 1 | 0.39 | 2020 |
Scalable Multi-FPGA Acceleration For Large RNNs With Full Parallelism Levels | 0 | 0.34 | 2020 |
FPGA-based low-batch training accelerator for modern CNNs featuring high bandwidth memory | 0 | 0.34 | 2020 |
SLATE: Managing Heterogeneous Cloud Functions | 0 | 0.34 | 2020 |
Artisan: a Meta-Programming Approach For Codifying Optimisation Strategies | 1 | 0.36 | 2020 |
Specializing FGPU for Persistent Deep Learning | 1 | 0.34 | 2019 |
Evaluating and Enhancing Intel® Stratix® 10 FPGAs for Persistent Real-Time AI | 0 | 0.34 | 2019 |
Automatic Compiler Based FPGA Accelerator for CNN Training | 2 | 0.46 | 2019 |
Dark Wires and the Opportunities for Reconfigurable Logic | 0 | 0.34 | 2019 |
FPGA-based Computing in the Era of AI and Big Data | 0 | 0.34 | 2019 |
Enhanced Heterogeneous Cloud: Transparent Acceleration and Elasticity | 1 | 0.48 | 2019 |
Why Compete When You Can Work Together: FPGA-ASIC Integration for Persistent RNNs | 6 | 0.60 | 2019 |
Processor Assisted Worklist Scheduling for FPGA Accelerated Graph Processing on a Shared-Memory Platform | 3 | 0.37 | 2019 |
Scalable Low-Latency Persistent Neural Machine Translation on CPU Server with Multiple FPGAs | 0 | 0.34 | 2019 |
In-Package Domain-Specific ASICs for Intel® Stratix® 10 FPGAs: A Case Study of Accelerating Deep Learning Using TensorTile ASIC (Abstract Only) | 0 | 0.34 | 2018 |
Exploration of Low Numeric Precision Deep Learning Inference Using Intel® FPGAs (Abstract Only) | 0 | 0.34 | 2018 |
Evaluating The Highly-Pipelined Intel Stratix 10 FPGA Architecture Using Open-Source Benchmarks | 2 | 0.42 | 2018 |
A Customizable Matrix Multiplication Framework for the Intel HARPv2 Xeon+FPGA Platform: A Deep Learning Case Study | 8 | 0.58 | 2018 |
In-Package Domain-Specific ASICs for Intel® Stratix® 10 FPGAs: A Case Study of Accelerating Deep Learning Using TensorTile ASIC | 0 | 0.34 | 2018 |
WRPN: Wide Reduced-Precision Networks | 20 | 0.73 | 2017 |
Customizable FPGA OpenCL matrix multiply design template for deep neural networks | 2 | 0.40 | 2017 |
High performance binary neural networks on the Xeon+FPGA™ platform | 11 | 0.87 | 2017 |
Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks? | 60 | 2.55 | 2017 |
A Study of Pointer-Chasing Performance on Shared-Memory Processor-FPGA Systems | 12 | 0.60 | 2016 |
Hardware Accelerator For Analytics Of Sparse Data | 1 | 0.48 | 2016 |
Accelerating Binarized Neural Networks: Comparison of FPGA, CPU, GPU, and ASIC | 11 | 0.67 | 2016 |
Accelerating recurrent neural networks in analytics servers: Comparison of FPGA, CPU, GPU, and ASIC | 21 | 1.52 | 2016 |
Fast hierarchical implementation of sequential tree-reweighted belief propagation for probabilistic inference | 5 | 0.46 | 2015 |
A sparse matrix vector multiply accelerator for support vector machine | 10 | 0.54 | 2015 |
Programmable Automotive Headlights | 9 | 0.60 | 2014 |
GraphGen: An FPGA Framework for Vertex-Centric Graph Computation | 42 | 1.48 | 2014 |
MEMOCODE 2013 hardware/software co-design contest: Stereo matching | 3 | 0.46 | 2013 |
3D Point Cloud Reduction Using Mixed-Integer Quadratic Programming | 12 | 0.58 | 2013 |
Integrating formal verification and high-level processor pipeline synthesis | 0 | 0.34 | 2011 |
Automatic pipelining from transactional datapath specifications | 10 | 0.65 | 2010 |
Automatic multithreaded pipeline synthesis from transactional datapath specifications | 6 | 0.50 | 2010 |
ProtoFlex: Towards Scalable, Full-System Multiprocessor Simulations Using FPGAs | 54 | 1.99 | 2009 |
A complexity-effective architecture for accelerating full-system multiprocessor simulations using FPGAs | 28 | 1.30 | 2008 |
Active Cache Emulator | 1 | 0.37 | 2008 |