Accelerating Apache Spark with FPGAs. - Citegraph

Paper Info

Title
Accelerating Apache Spark with FPGAs.

Abstract
Apache Spark has become one of the most popular engines for big data processing. Spark provides a platform-independent, high-abstraction programming paradigm for large-scale data processing by leveraging the Java framework. Though Java provides software portability across various machines, it also limits the performance of distributed environments such as Spark. While it may be unrealistic to rewrite platforms like Spark in a faster language, a more viable way to mitigate this poor performance is to accelerate the computations while still working within the Java-based framework. This paper demonstrates the feasibility of incorporating Field-Programmable Gate Array (FPGA) acceleration into Spark. It presents the performance benefits and bottlenecks of our FPGA-accelerated Spark environment using a MapReduce implementation of the k-means clustering algorithm, showing that acceleration is possible even on a hardware platform that is not well optimized for performance. An important feature of our approach is that the use of FPGAs is completely transparent to the user through library functions, which are a common way for users to access functions provided by Spark. Power users can further develop other computations using high-level synthesis.
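
For readers unfamiliar with the workload named in the abstract, the following is a minimal sketch of k-means expressed as a map step and a reduce step. It is plain Python with no Spark or FPGA involvement, and it is not the authors' implementation; the function names (nearest, map_phase, reduce_phase, kmeans) are hypothetical and chosen only for illustration. The map step is the distance-heavy part that an accelerator would most plausibly target, though which part the paper offloads is an assumption here.

```python
def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points, centroids):
    """Map: emit (cluster index, (point, 1)) for every input point."""
    return [(nearest(p, centroids), (p, 1)) for p in points]


def reduce_phase(pairs, centroids):
    """Reduce: sum points and counts per cluster, then average into new centroids."""
    sums = {}
    for idx, (point, count) in pairs:
        acc, n = sums.get(idx, ([0.0] * len(point), 0))
        sums[idx] = ([a + x for a, x in zip(acc, point)], n + count)
    # A cluster that received no points keeps its previous centroid.
    return [
        tuple(s / sums[i][1] for s in sums[i][0]) if i in sums else tuple(centroids[i])
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of map/reduce rounds (hypothetical driver loop)."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids
```

Calling kmeans([(0, 0), (0, 2), (10, 10), (10, 12)], [(0, 0), (10, 10)]) groups the first two and last two points and returns the mean of each group as the new centroids.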

Year	2019
DOI	10.1002/cpe.4222
Venue	Concurrency and Computation: Practice and Experience
Keywords	Apache Spark, big data, FPGA, high-level synthesis, Java, MapReduce
Field	Spark (mathematics), Programming paradigm, Java collections framework, Computer science, Parallel computing, High-level synthesis, Field-programmable gate array, Software portability, Big data, Java, Distributed computing
DocType	Journal
Volume	31
Issue	SP2
ISSN	1532-0626
Citations	1
PageRank	0.38
References	8
Authors	2

Authors (2 rows)

Name	Order	Citations	PageRank
Ehsan Ghasemi	1	4	1.15
Paul Chow	2	868	119.97
