Title
Hardware Support for Accelerating Data Movement in Server Platform
Abstract
Data movement (memory copies) is a very common operation during network processing and application execution on servers. The performance of this operation is rather poor on today's microprocessors due to the following aspects: 1) Several long-latency memory accesses are involved because the source and/or the destination are typically in memory, 2) latency hiding techniques, such as out-of-order execution, hardware threading, and prefetching, are not very effective for bulk data movement, and 3) microprocessors move data at register (small) granularity. In this paper, we show this overhead of bulk data movement and propose the use of dedicated copy engines to minimize it. We present a detailed analysis of copy engine architectures along two dimensions: 1) on-die versus off-die and 2) synchronous versus asynchronous. These copy engine architectures are superior to traditional direct memory access (DMA) engines because they are tightly coupled to the core architecture and enable lower overhead communication and signaling. We describe the hardware support required to implement these copy engines and integrate them into server platforms. We perform a detailed case study to evaluate the performance of these copy engines. The evaluation is based on an execution-driven simulator, which was extended with detailed models of copy engines. Our simulation results show that copy engines are effective in reducing the bulk data movement overhead and, hence, hold significant promise for high-performance server platforms.
Year
2007
DOI
10.1109/TC.2007.1036
Venue
IEEE Trans. Computers
Keywords
copy engine, long-latency memory access, hardware acceleration, network operating systems, accelerating data movement, detailed model, detailed case study, data movement acceleration, storage management, bulk data movement overhead, copy engine architecture, servers, server platform, tcp/ip, transport protocols, data movement, memory copy, detailed analysis, direct memory access, execution-driven simulator, bulk data movement, hardware support, performance evaluation, dedicated copy engine, out of order, acceleration, out of order execution, two dimensions, hardware, hardware accelerator, registers, tcp ip, engines
Field
Computer science, Latency (engineering), Server, Real-time computing, Direct memory access, Granularity, Network processing, Computer hardware, Asynchronous communication, Parallel computing, Internet protocol suite, Hardware acceleration, Operating system, Embedded system
DocType
Journal
Volume
56
Issue
6
ISSN
0018-9340
Citations
12
PageRank
0.74
References
12
Authors (5)
Name | Order | Citations | PageRank
Li Zhao | 1 | 604 | 34.84
Laxmi N. Bhuyan | 2 | 2393 | 248.44
Ravishankar K. Iyer | 3 | 1119 | 75.72
Srihari Makineni | 4 | 600 | 37.89
Donald Newell | 5 | 86 | 6.92