PiCo: High-performance data analytics pipelines in modern C++.

Paper Info

Title
PiCo: High-performance data analytics pipelines in modern C++.

Abstract
In this paper, we present PiCo (Pipeline Composition), a new C++ API with a fluent interface. PiCo’s programming model aims to make data analytics applications easier to program while preserving or enhancing their performance. This is attained through three key design choices: (1) unifying batch and stream data access models, (2) decoupling processing from data layout, and (3) exploiting a stream-oriented, scalable, efficient C++11 runtime system. PiCo proposes a programming model based on pipelines and operators that are polymorphic with respect to data types, in the sense that the same algorithms and pipelines can be reused on different data models (e.g., streams, lists, and sets). Preliminary results show that PiCo, compared to Spark and Flink, can attain better execution times and can substantially improve memory utilization, for both batch and stream processing.
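
The abstract does not show any PiCo code, so the following is only a minimal sketch of what a fluent pipeline that is polymorphic over collection types could look like in C++. It is not PiCo's API: the names `Pipeline`, `map`, `filter`, `run`, and `even_squares` are hypothetical and chosen for illustration, and the sketch runs sequentially rather than on a parallel runtime.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <set>
#include <utility>
#include <vector>

// Hypothetical illustration only: this is NOT PiCo's API.
// A Pipeline<T> stores a chain of per-element stages over values of type T.
// T is assumed to be default-constructible and copyable.
template <typename T>
class Pipeline {
public:
    // Append a transformation stage; returns *this so calls can be chained.
    Pipeline& map(std::function<T(const T&)> f) {
        stages_.push_back([f](const T& x, T& out) { out = f(x); return true; });
        return *this;
    }

    // Append a filtering stage; elements that fail the predicate are dropped.
    Pipeline& filter(std::function<bool(const T&)> p) {
        stages_.push_back([p](const T& x, T& out) { out = x; return p(x); });
        return *this;
    }

    // Run the same pipeline over any iterable collection of T
    // (std::vector, std::list, std::set, ...), in iteration order.
    template <typename Collection>
    std::vector<T> run(const Collection& input) const {
        std::vector<T> result;
        for (const T& item : input) {
            T current = item;
            bool keep = true;
            for (const auto& stage : stages_) {
                T next;
                if (!stage(current, next)) { keep = false; break; }
                current = std::move(next);
            }
            if (keep) result.push_back(std::move(current));
        }
        return result;
    }

private:
    // Each stage writes its output and reports whether the element survives.
    std::vector<std::function<bool(const T&, T&)>> stages_;
};

// Hypothetical example pipeline: square each value, then keep the even squares.
inline Pipeline<int> even_squares() {
    Pipeline<int> p;
    p.map([](const int& x) { return x * x; })
     .filter([](const int& x) { return x % 2 == 0; });
    return p;
}
```

The pipeline is defined once, independently of how the data is stored, and `run` can then be applied to a `std::vector<int>`, a `std::list<int>`, or a `std::set<int>` without changing the pipeline itself. This mirrors, in a very reduced form, the idea of decoupling processing from data layout.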

Field	Value
Year	2018
DOI	10.1016/j.future.2018.05.030
Venue	Future Generation Computer Systems
Keywords	Big data, High performance data analytics, Domain specific language, C++, Stream computing, Fog computing, Edge computing
Field	Data modeling, Programming paradigm, Data analysis, Computer science, Stream, Data type, Stream processing, Big data, Runtime system, Distributed computing
DocType	Journal
Volume	87
ISSN	0167-739X
Citations	1
PageRank	0.36
References	5
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Claudia Misale	1	23	5.44
Maurizio Drocco	2	88	12.09
Guy Tremblay	3	79	9.49
Alberto R. Martinelli	4	1	0.36
Marco Aldinucci	5	638	59.87
