P<sup>2</sup>M-DeTrack: Processing-in-Pixel-in-Memory for Energy-efficient and Real-Time Multi-Object Detection and Tracking - Citegraph

Paper Info

Title
P<sup>2</sup>M-DeTrack: Processing-in-Pixel-in-Memory for Energy-efficient and Real-Time Multi-Object Detection and Tracking

Abstract
Today’s high-resolution, high-frame-rate cameras in autonomous vehicles generate a large volume of data that must be transferred to and processed by a downstream processor or machine learning (ML) accelerator to enable intelligent computing tasks such as multi-object detection and tracking. This massive data transfer incurs significant energy, latency, and bandwidth bottlenecks, which hinder real-time processing. To mitigate this problem, we propose an algorithm-hardware co-design framework called Processing-in-Pixel-in-Memory-based object Detection and Tracking (P<sup>2</sup>M-DeTrack). P<sup>2</sup>M-DeTrack is based on a custom Faster R-CNN-based model that is distributed partly inside the pixel array (front-end) and partly in a separate FPGA/ASIC (back-end). The proposed front-end in-pixel processing down-samples the input feature maps significantly with judiciously optimized strided convolution and pooling. Compared to a conventional baseline design that transfers frames of RGB pixels to the back-end, the resulting P<sup>2</sup>M-DeTrack designs reduce the data bandwidth between sensor and back-end by up to 24×. The designs also reduce the sensor and total energy per frame (obtained from in-house circuit simulations at the GlobalFoundries 22 nm technology node) by 5.7× and 1.14×, respectively. Lastly, they reduce the sensing and total frame latency by an estimated 1.7× and 3×, respectively. We evaluate our approach on the multi-object detection (tracking) task of the large-scale BDD100K dataset and observe only a 0.5% reduction in mean average precision (0.8% reduction in the identification F1 score) compared to the state-of-the-art.

Field	Value
Year	2022
DOI	10.1109/VLSI-SoC54400.2022.9939582
Venue	2022 IFIP/IEEE 30th International Conference on Very Large Scale Integration (VLSI-SoC)
Keywords	autonomous vehicles, detection, tracking, processing-in-pixel-in-memory, faster R-CNN
DocType	Conference
ISSN	2324-8432
ISBN	978-1-6654-9006-1
Citations	0
PageRank	0.34
References	12
Authors	14

Authors (14 rows)

Name	Order	Citations	PageRank
Gourav Datta	1	0	0.34
Souvik Kundu	2	9	4.94
Zihan Yin	3	0	0.34
Joe Mathai	4	0	0.34
Zeyu Liu	5	0	0.34
Zixu Wang	6	0	0.34
Mulin Tian	7	0	0.34
Shunlin Lu	8	0	0.34
Ravi Teja Lakkireddy	9	0	0.34
Andrew Schmidt	10	0	0.34
Wael Abd-Almageed	11	248	24.52
Ajey Jacob	12	0	0.34
Akhilesh Jaiswal	13	0	0.34
Peter Beerel	14	0	0.34
