Title
Empowering big data analytics with polystore and strongly typed functional queries
Abstract
ABSTRACTPolystores are of primary importance to tackle the diversity and the volume of Big Data, as they propose to store data according to specific use cases. Nevertheless, analytics frameworks often lack a uniform interface allowing to fully access and take advantage of the various models offered by the polystore. It also should be ensured that the typing of the algebraic expressions built with data manipulation operators can be checked and that schema can be inferred before starting to execute the operators (type-safe). Tensors are good candidates for supporting a pivot data model. They are powerful abstract mathematical objects which can embed complex relationships between entities and that are used in major analytics frameworks. However, they are far away from data models, and lack high level operators to manipulate their content, resulting in bad coding habits and less maintainability, and sometimes poor performances. With TDM (Tensor Data Model), we propose to join the best of both worlds, to take advantage of modeling capabilities of tensors by adding schema and data manipulation operators to them. We developed an implementation in Scala using Spark, providing users with a type-safe and schema inference mechanism that guarantees the technical and functional correctness of composed expressions on tensors at compile time. We show that this extension does not induce overhead and allows to outperform Spark query optimizer using bind join.
Year
DOI
Venue
2020
10.1145/3410566.3410591
IDEAS
DocType
Citations 
PageRank 
Conference
1
0.35
References 
Authors
0
4
Name
Order
Citations
PageRank
Annabelle Gillet110.35
Eric Leclercq22815.63
Marinette Savonnet32416.76
Nadine Cullot412415.95