Title | ||
---|---|---|
14.4 A 21.5M-query-vectors/s 3.37nJ/vector reconfigurable k-nearest-neighbor accelerator with adaptive precision in 14nm tri-gate CMOS. |
Abstract | ||
---|---|---|
Energy-efficient k-nearest-neighbor (kNN) computations are key building blocks for computer vision, classification, and machine-learning workloads [1–3]. Determining distances to high-dimensional vectors within a large vector database results in high compute cost. Adaptive precision improves energy efficiency by eliminating a majority of vectors without costly full-precision computation, with as-needed precision refinement to guarantee kNN accuracy of closely matched vectors. A special-purpose on-die kNN accelerator with 128-dimensions by 128 parallel reference vectors, targeted across mobile SoCs to multi-core microprocessors, and reconfigurable for either Manhattan or Euclidean distance, is fabricated in 14nm tri-gate CMOS [6]. Partial distance compute circuits, 2b window-based sort, MSB-to-LSB-based selective distance refinement, robust ultra-low voltage circuits, and state tracking control to selectively resume next-nearest candidates enable nominal energy efficiency of 3.37nJ/query vector or 9.7TOPS/W (measured for 21.5M vectors/s, 16 cycles/vector at 750mV, 25°C) with a dense layout occupying 0.333mm² (Fig. 14.4.7) while achieving: i) scalable performance up to 26.4M vectors/s, 114mW measured at 850mV, ii) 2-cycle latency and 43pJ energy to find each subsequent nearest neighbor, iii) up to 5.2× higher throughput while maintaining full-precision kNN accuracy, iv) 16× search-space reduction for next-nearest neighbor, v) ultra-low voltage operation measured at 360mV, 1.1M vectors/s, 1.44mW, and vi) peak energy efficiency of 1.23nJ/vector at 390mV (near-threshold), 25°C. |
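
The adaptive-precision idea in the abstract (discard most reference vectors from coarse, MSB-only distances, then refine the survivors toward the LSBs) can be illustrated in software. The following is a minimal sketch only, not the accelerator's circuit or control scheme. It assumes unsigned 8-bit vector components, Manhattan distance, and a refinement step of 2 bits per pass; the bit width and step size are assumptions, and the names `manhattan_bounds` and `adaptive_knn` are hypothetical. Each pass computes a lower and an upper bound on every surviving candidate's true distance and drops any candidate whose lower bound exceeds the k-th smallest upper bound, so the final result matches a full-precision search.

```python
import numpy as np


def manhattan_bounds(query, refs, shift):
    """Bound each Manhattan distance using only the bits above `shift`.

    Truncating both operands to their upper bits leaves an error of at most
    (2**shift - 1) per dimension, which gives a per-dimension lower and
    upper bound on |q - r|.
    """
    q_hi = query.astype(np.int64) >> shift
    r_hi = refs.astype(np.int64) >> shift
    coarse = np.abs(r_hi - q_hi) << shift      # distance seen through the MSBs
    slack = (1 << shift) - 1                   # largest possible hidden LSB error
    lower = np.maximum(coarse - slack, 0).sum(axis=1)
    upper = (coarse + slack).sum(axis=1)
    return lower, upper


def adaptive_knn(query, refs, k, bits=8, step=2):
    """Exact k nearest neighbors via MSB-to-LSB refinement (software sketch).

    `bits` and `step` are illustrative assumptions, not values from the paper.
    Returns (indices, distances, candidates_evaluated_per_pass).
    """
    assert bits % step == 0 and 1 <= k <= len(refs)
    alive = np.arange(len(refs))               # indices of surviving candidates
    evaluated = []
    lower = None
    for shift in range(bits - step, -1, -step):
        lower, upper = manhattan_bounds(query, refs[alive], shift)
        evaluated.append(len(alive))
        # The k candidates with the smallest upper bounds are certainly within
        # kth_upper, so anything with a larger lower bound cannot be in the top k.
        kth_upper = np.partition(upper, k - 1)[k - 1]
        keep = lower <= kth_upper
        alive, lower = alive[keep], lower[keep]
    # The last pass uses shift == 0, so `lower` now holds exact distances.
    order = np.argsort(lower, kind="stable")[:k]
    return alive[order], lower[order], evaluated
```

Calling `adaptive_knn(query, refs, k)` with a `(128, 128)` array of `uint8` reference vectors mirrors the 128-dimension by 128-vector organization described above. The third return value reports how many candidates each pass had to touch. How much it shrinks depends on the data: well-separated neighbors are pruned early, while closely matched vectors survive to the full-precision pass.
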
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/ISSCC.2016.7418006 | ISSCC |
Keywords | Field | DocType |
CMOS integrated circuits,computer vision,learning (artificial intelligence),low-power electronics,microprocessor chips,system-on-chip,Euclidean distance,MSB-to-LSB based selective distance refinement,Manhattan distance,energy-efficient k-nearest-neighbor,k-nearest-neighbor accelerator,kNN,machine learning,mobile SoC,multicore microprocessors,partial distance compute circuits,size 14 nm,trigate CMOS,ultralow voltage circuits,voltage 360 mV | k-nearest neighbors algorithm,Computer science,Efficient energy use,Euclidean distance,Robustness (computer science),Electronic engineering,CMOS,Throughput,Electrical engineering,Scalability,Computation | Conference |
Citations | PageRank | References |
2 | 0.47 | 5 |
Authors | ||
8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Himanshu Kaul | 1 | 456 | 51.07 |
Mark A. Anders | 2 | 12 | 3.03 |
S. Mathew | 3 | 462 | 76.59 |
Gregory K. Chen | 4 | 298 | 32.96 |
Sudhir Satpathy | 5 | 269 | 19.69 |
S. K. Hsu | 6 | 521 | 52.06 |
Amit Agarwal | 7 | 65 | 5.39 |
Ram Krishnamurthy | 8 | 650 | 74.63 |