Title
Categories for (Big) Data models and optimization
Abstract
This paper proposes a theoretical foundation for Big Data. More precisely, it explains how “functors”, a concept from Category Theory, can model the various data structures commonly used to represent (large) data sets, and how “natural transformations” can formalize relations between these structures. Algorithms, such as querying a specific piece of information, depend mainly on the data structure considered; natural transformations can therefore serve to optimize these algorithms and obtain a result in a shorter time. The paper details four functors modeling tabular data, graph structures (e.g. triple stores), cached data and split data. Next, the paper explains how, by considering a functional programming language, the concepts can be implemented without effort to propose new tools (e.g. efficient information servers and query languages). As a complement to the mathematical models proposed, the paper also presents an optimized data server and a specific query language (based on “unification” to facilitate the search for information). Finally, the paper gives a comparative study and shows that this tool is more efficient than most of the standard tools available on the market: the functional server appears to be 10+ times faster than relational or document-oriented databases (MySQL and MongoDB), and 100+ times faster than a graph database (Neo4j).
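To make the abstract's central idea concrete, the following is a minimal sketch, not the paper's implementation. It is written in Python for illustration only, whereas the paper considers a functional programming language. All names (`fmap_table`, `fmap_triples`, `table_to_triples`, `query_table`, `query_triples`, `people`) are hypothetical. A "functor" is approximated here by a container shape plus a function that maps over its values, and a "natural transformation" by a conversion between shapes that never inspects the values, so converting and mapping can be done in either order.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shapes: a table is a list of rows, a graph is a list of
# (row id, column name, value) triples, as in a simple triple store.
Table = List[Dict[str, object]]
Triples = List[Tuple[int, str, object]]


def fmap_table(f: Callable, table: Table) -> Table:
    """Functor action on tables: apply f to every cell, keep the shape."""
    return [{col: f(val) for col, val in row.items()} for row in table]


def fmap_triples(f: Callable, triples: Triples) -> Triples:
    """Functor action on triples: apply f to every object, keep the shape."""
    return [(s, p, f(o)) for (s, p, o) in triples]


def table_to_triples(table: Table) -> Triples:
    """Candidate natural transformation: rearranges structure only."""
    return [(i, col, val) for i, row in enumerate(table) for col, val in row.items()]


def query_table(table: Table, col: str, value: object) -> List[int]:
    """Row ids whose column `col` equals `value`, computed on the table."""
    return [i for i, row in enumerate(table) if row.get(col) == value]


def query_triples(triples: Triples, col: str, value: object) -> List[int]:
    """The same query, computed on the triple representation."""
    return [s for (s, p, o) in triples if p == col and o == value]


# Made-up sample data, for illustration only.
people = [
    {"name": "ada", "city": "london"},
    {"name": "alan", "city": "london"},
]

# Naturality square: mapping then converting equals converting then mapping.
lhs = table_to_triples(fmap_table(str.upper, people))
rhs = fmap_triples(str.upper, table_to_triples(people))
assert lhs == rhs

# The same query gives the same answer on both representations.
assert query_table(people, "city", "london") == query_triples(
    table_to_triples(people), "city", "london"
)
```

The naturality check is what allows a query to be moved from one representation to another without changing its result, which is the sense in which the abstract says such transformations can serve to optimize algorithms. The sketch relies on Python dictionaries preserving insertion order (Python 3.7+) and says nothing about the performance figures reported in the paper.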
Year
2018
DOI
10.1186/s40537-018-0132-9
Venue
Journal of Big Data
Keywords
Data models, Category Theory, Functional programming, Performance
Field
Data modeling, Data structure, Data mining, Query language, Graph database, Functional programming, Computer science, Server, Theoretical computer science, Database server, Big data
DocType
Journal
Volume
5
Issue
1
ISSN
2196-1115
Citations
2
PageRank
0.43
References
13
Authors
3
Name | Order | Citations | PageRank
Laurent Thiry | 1 | 32 | 7.60
Heng Zhao | 2 | 32 | 5.34
Michel Hassenforder | 3 | 61 | 11.05