Title
Battling the CPU Bottleneck in Apache Parquet to Arrow Conversion Using FPGA
Abstract
In the domain of big data analytics, the bottleneck of converting storage-focused file formats to in-memory data structures has shifted from the bandwidth of storage to the performance of decoding and decompression software. Two widely used formats for big data storage and in-memory data are Apache Parquet and Apache Arrow, respectively. In order to improve the speed at which data can be loaded fr...
Year
2020
DOI
10.1109/ICFPT51103.2020.00048
Venue
2020 International Conference on Field-Programmable Technology (ICFPT)
Keywords
Bandwidth, Metadata, Big Data, Data structures, Throughput, Encoding, Decoding
DocType
Conference
ISBN
978-1-6654-2302-1
Citations
0
PageRank
0.34
References
0
Authors
6
Name | Order | Citations | PageRank
Johan Peltenburg | 1 | 2 | 3.15
Lars T.J. van Leeuwen | 2 | 0 | 0.34
Joost Hoozemans | 3 | 0 | 0.34
Jian Fang | 4 | 186 | 18.77
Zaid Al-Ars | 5 | 560 | 78.62
H. P. Hofstee | 6 | 507 | 54.92