Title
RDMA-Based Apache Storm for High-Performance Stream Data Processing
Abstract
Apache Storm is a scalable, fault-tolerant, distributed real-time stream-processing framework widely used in big data applications. For distributed data-sensitive applications, low-latency, high-throughput communication modules have a critical impact on overall system performance. Apache Storm currently uses Netty, an asynchronous server/client framework based on the TCP/IP protocol stack, as its communication component. The TCP/IP protocol stack has inherent performance flaws due to frequent memory copying and context switching. The Netty component not only limits the performance of Storm but also increases the CPU load in the IPoIB (IP over InfiniBand) communication mode. In this paper, we introduce two new implementations of the Apache Storm communication component based on RDMA technology. The performance evaluation on Mellanox QDR cards (40 Gbps) shows that our implementations achieve speedups of up to 5x over IPoIB and up to 10x over Gigabit Ethernet. Our implementations also significantly reduce the CPU load and increase the throughput of the system.
Year: 2021
DOI: 10.1007/s10766-021-00696-0
Venue: International Journal of Parallel Programming
Keywords: Apache Storm, RDMA, InfiniBand, Stream-processing framework, Cloud computing, Communication optimization
DocType: Journal
Volume: 49
Issue: 5
ISSN: 0885-7458
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
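Author 1: Ziyu Zhang (Citations/PageRank: 11210.19)
Author 2: Zitan Liu (Citations: 0, PageRank: 0.34)
Author 3: Qingcai Jiang (Citations: 0, PageRank: 0.34)
Author 4: Junshi Chen (Citations: 0, PageRank: 0.34)
Author 5: Hong An (Citations/PageRank: 5824.15)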