Title
Parallel Pipelined Architecture and Algorithm for Matrix Transposition Using Registers
Abstract
In this brief, we present a new algorithm and architecture for continuous-flow matrix transposition using registers. The algorithm supports P-parallel matrix transposition. The hardware architecture reaches the theoretical minimums in terms of latency and memory. It is composed of a group of identical cascaded basic swap circuits, whose stages are determined by the corresponding algorithm, and can be controlled via a set of counters. Compared with the state-of-the-art architecture, the proposed architecture supports matrices whose rows and columns are integer multiples of P. Here P can be arbitrary, including but not limited to power-of-two integers. Moreover, our results provide additional insight into continuous-flow non-square matrix transposition.
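As a rough illustration of what "continuous-flow P-parallel transposition" means, the sketch below is a behavioral model only, not the register-and-swap-circuit architecture of the brief. It assumes (our assumption, not stated in the abstract) that P consecutive row-major elements arrive per cycle and P consecutive elements of the transposed matrix leave per cycle. The function name `transpose_stream` and its return values are hypothetical. Under that assumed ordering it finds the smallest fixed latency at which every output word is available, and counts how many elements are held between cycles.

```python
def transpose_stream(matrix, p):
    """Behavioral model of a P-parallel streaming transpose (illustrative only).

    Assumed ordering: each cycle, p consecutive row-major elements enter and
    p consecutive elements of the transposed matrix leave.
    Returns (transposed, latency_in_cycles, peak_elements_held_between_cycles).
    """
    rows, cols = len(matrix), len(matrix[0])
    if rows % p or cols % p:
        raise ValueError("rows and cols must both be integer multiples of p")
    n = rows * cols
    n_words = n // p
    flat = [x for row in matrix for x in row]

    # src[k] is the input-stream index of the k-th element of the output stream.
    src = [(k % rows) * cols + (k // rows) for k in range(n)]

    # Output word w can leave only after its latest-arriving element is in;
    # the fixed latency is the worst case over all output words.
    latency = max(max(src[w * p:(w + 1) * p]) // p - w for w in range(n_words))

    held = {}       # elements currently stored, keyed by input index
    out_flat = []
    peak = 0
    for t in range(n_words + latency):
        if t < n_words:                      # accept input word t
            for i in range(t * p, (t + 1) * p):
                held[i] = flat[i]
        w = t - latency
        if w >= 0:                           # emit output word w
            for k in range(w * p, (w + 1) * p):
                out_flat.append(held.pop(src[k]))
        peak = max(peak, len(held))          # storage carried into next cycle

    transposed = [out_flat[c * rows:(c + 1) * rows] for c in range(cols)]
    return transposed, latency, peak
```

Calling `transpose_stream(m, 3)` on a 3x6 matrix `m` exercises the non-power-of-two, non-square case mentioned in the abstract. The latency and storage figures this model reports depend on the assumed stream ordering and should not be read as the bounds derived in the brief.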
Year
2022
DOI
10.1109/TCSII.2021.3134710
Venue
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS
Keywords
Computer architecture, Arrays, Hardware, Signal processing algorithms, Registers, Parallel processing, Circuits and systems, Pipelined algorithm, hardware architecture, continuous-flow, matrix transposition, parallel computing
DocType
Journal
Volume
69
Issue
3
ISSN
1549-7747
Citations
0
PageRank
0.34
References
0
Authors
3
Name, Order, Citations, PageRank
Bo Zhang, 1, 41, 9.80
Zhen-guo Ma, 2, 14, 4.86
Wei Luo, 3, 0, 0.34