MLPerf Training Benchmark | 0 | 0.34 | 2020 |
MLPerf Inference Benchmark | 15 | 0.68 | 2020 |
MLPerf: An Industry Standard Benchmark Suite for Machine Learning Performance | 8 | 0.54 | 2020 |
Coloring Big Graphs with AlphaGoZero | 2 | 0.35 | 2019 |
Beyond human-level accuracy: computational challenges in deep learning | 3 | 0.37 | 2019 |
EPNAS: Efficient Progressive Neural Architecture Search | 0 | 0.34 | 2019 |
Fast Spectrogram Inversion using Multi-head Convolutional Neural Networks | 8 | 0.53 | 2019 |
Mixed Precision Training | 0 | 0.34 | 2018 |
Language Modeling At Scale | 0 | 0.34 | 2018 |
Deep Learning Scaling is Predictable, Empirically | 8 | 0.44 | 2017 |
Exploring Sparsity in Recurrent Neural Networks | 23 | 0.73 | 2017 |
Deep Voice: Real-time Neural Text-to-Speech | 34 | 1.66 | 2017 |
Deep Voice 2: Multi-Speaker Neural Text-to-Speech | 21 | 0.91 | 2017 |
Persistent RNNs: Stashing Recurrent Weights On-Chip | 12 | 0.73 | 2016 |
Deep Speech 2: End-to-End Speech Recognition in English and Mandarin | 251 | 8.33 | 2015 |
Red Fox: An Execution Environment for Relational Query Processing on GPUs | 26 | 0.85 | 2014 |
Deep Speech: Scaling up end-to-end speech recognition | 185 | 8.06 | 2014 |
Relational algorithms for multi-bulk-synchronous processors | 12 | 0.69 | 2013 |
Kernel Weaver: Automatically Fusing Database Primitives for Efficient GPU Computation | 41 | 1.30 | 2012 |
Dynamic compilation of data-parallel kernels for vector processors | 8 | 0.57 | 2012 |
Optimizing Data Warehousing Applications for GPUs Using Kernel Fusion/Fission | 28 | 1.06 | 2012 |
Simultaneous branch and warp interweaving for sustained GPU performance | 14 | 0.66 | 2012 |
Characterization and transformation of unstructured control flow in bulk synchronous GPU applications | 7 | 0.70 | 2012 |
A framework for dynamically instrumenting GPU compute applications within GPU Ocelot | 29 | 1.42 | 2011 |
SIMD re-convergence at thread frontiers | 41 | 1.27 | 2011 |
Speculative execution on multi-GPU systems | 11 | 0.60 | 2010 |
Ocelot: a dynamic optimization framework for bulk-synchronous applications in heterogeneous systems | 110 | 5.51 | 2010 |
Modeling GPU-CPU workloads and systems | 44 | 2.20 | 2010 |
A characterization and analysis of PTX kernels | 74 | 4.83 | 2009 |
Harmony: an execution model and runtime for heterogeneous many core systems | 102 | 4.71 | 2008 |