| Title |
| --- |
| A 5.99-to-691.1TOPS/W Tensor-Train In-Memory-Computing Processor Using Bit-Level-Sparsity-Based Optimization and Variable-Precision Quantization |
| Abstract |
| --- |
| Computing-in-memory (CIM) improves energy efficiency by enabling parallel multiply-and-accumulate (MAC) operations and reducing memory accesses [1–4]. However, today's typical neural networks (NNs) usually exceed on-chip memory capacity, so a CIM-based processor may encounter a memory bottleneck [5]. Tensor-train (TT) is a tensor decomposition method that decomposes a $d$-dimensional tensor into $d$ 4D tensor-cores (TCs: $G_k[r_{k-1}, n_k, m_k, r_k]$, $k = 1, \ldots, d$) [6]. $G_k$ can be viewed as a 2D $n_k \times m_k$ array, where each element is an $r_{k-1} \times r_k$ matrix. The TCs require $\sum_{k=1}^{d} r_{k-1} n_k m_k r_k$ parameters to represent the original tensor, which has $\prod_{k=1}^{d} n_k m_k$ parameters. Since $r_k$ is typically small, the kernels and weight matrices of convolutional, fully-connected, and recurrent layers can be compressed significantly by TT decomposition, enabling storage of an entire NN in a CIM-based processor. |
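The abstract's parameter counts are easy to sanity-check numerically. Below is a minimal NumPy sketch, not the processor's implementation: it builds random tensor-cores $G_k$ of shape $[r_{k-1}, n_k, m_k, r_k]$ for illustrative (assumed) ranks and mode sizes, compares the TT parameter count $\sum_{k=1}^{d} r_{k-1} n_k m_k r_k$ against the full count $\prod_{k=1}^{d} n_k m_k$, and contracts the cores back into the weight matrix they represent.

```python
import numpy as np

# Illustrative TT shapes for a weight matrix folded into a d = 3 tensor;
# the ranks r_k and mode sizes n_k, m_k below are assumptions, not the
# paper's actual network configuration.
ranks = [1, 4, 4, 1]   # r_0, ..., r_d (boundary ranks are 1)
n = [4, 4, 4]          # output mode sizes n_k
m = [4, 4, 4]          # input mode sizes m_k
d = len(n)

# Each tensor-core G_k has shape [r_{k-1}, n_k, m_k, r_k].
rng = np.random.default_rng(0)
cores = [rng.standard_normal((ranks[k], n[k], m[k], ranks[k + 1]))
         for k in range(d)]

# Parameter counts: sum_k r_{k-1} n_k m_k r_k (TT) vs. prod_k n_k m_k (full).
tt_params = sum(c.size for c in cores)                      # 64 + 256 + 64 = 384
full_params = int(np.prod([n[k] * m[k] for k in range(d)])) # 16^3 = 4096
print(f"TT parameters:   {tt_params}")
print(f"Full parameters: {full_params}")
print(f"Compression:     {full_params / tt_params:.1f}x")

# Reconstruct the full tensor by contracting adjacent rank indices, then
# reshape to the (prod n_k) x (prod m_k) weight matrix the TT represents.
full = cores[0]
for c in cores[1:]:
    full = np.tensordot(full, c, axes=([-1], [0]))
# Axes are now (r_0, n_1, m_1, ..., n_d, m_d, r_d); drop the trivial
# boundary ranks, then group all n modes before all m modes.
full = full.squeeze(axis=(0, -1))
perm = list(range(0, 2 * d, 2)) + list(range(1, 2 * d, 2))
W = full.transpose(perm).reshape(int(np.prod(n)), int(np.prod(m)))
print(f"Reconstructed weight matrix shape: {W.shape}")      # (64, 64)
```

With these assumed shapes the TT form needs 384 parameters instead of 4096, about a 10.7x reduction, which illustrates why small ranks $r_k$ make it feasible to keep an entire NN on-chip.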
| Year | DOI | Venue |
| --- | --- | --- |
| 2021 | 10.1109/ISSCC42613.2021.9365989 | 2021 IEEE International Solid-State Circuits Conference (ISSCC) |
| DocType | Volume | ISSN |
| --- | --- | --- |
| Conference | 64 | 0193-6530 |
| Citations | PageRank | References |
| --- | --- | --- |
| 1 | 0.36 | 0 |
| Authors |
| --- |
| 12 |
| Name | Order | Citations | PageRank |
| --- | --- | --- | --- |
| Ruiqi Guo | 1 | 13 | 3.36 |
| Zhiheng Yue | 2 | 1 | 1.03 |
| Xin Si | 3 | 49 | 6.86 |
| Te Hu | 4 | 2 | 2.07 |
| Hao Li | 5 | 261 | 85.92 |
| Limei Tang | 6 | 1 | 0.70 |
| Yabing Wang | 7 | 1 | 1.37 |
| Leibo Liu | 8 | 816 | 116.95 |
| Meng-Fan Chang | 9 | 459 | 45.63 |
| Qiang Li | 10 | 81 | 21.66 |
| Shaojun Wei | 11 | 555 | 102.32 |
| Shouyi Yin | 12 | 579 | 99.95 |