Abstract |
---|
We present Holistic SparseCNN, a sparse convolutional neural network design that simultaneously optimizes convolution layers (for classification speed) and fully connected layers (for model size) while maintaining accuracy. We apply convolutions directly to tensors, without a bandwidth-wasting lowering step. This is critical for sparse convolution, which is more prone to being bandwidth bound than its dense counterpart. Our cross-layer training method balances sparsity among multiple layers to optimize the trade-off between accuracy, speed, and model size, and it is guided by the characteristics of the underlying computing platforms. We demonstrate overall classification throughputs significantly higher than the best published numbers on Intel Xeon and Atom processors, which represent datacenter servers and resource-constrained mobile platforms, respectively. |

Year | Venue | DocType |
---|---|---|
2016 | CoRR | Journal |

Volume | Citations | PageRank |
---|---|---|
abs/1608.01409 | 2 | 0.37 |

References | Authors |
---|---|
8 | 6 |

Name | Order | Citations | PageRank |
---|---|---|---|
Jongsoo Park | 1 | 103 | 9.49 |
Sheng Li | 2 | 1598 | 53.64 |
Wei Wen | 3 | 353 | 18.09 |
Hai Li | 4 | 15 | 7.50 |
Yiran Chen | 5 | 3344 | 259.09 |
Pradeep K. Dubey | 6 | 3432 | 292.69 |