Title
Specializing parallel data structures for Datalog
Abstract
We see a resurgence of Datalog in a variety of applications, including program analysis, networking, data integration, cloud computing, and security. The large scale and complexity of these applications require efficient management of data in relations. Hence, Datalog implementations need new data structures for managing relations that (1) are parallel, (2) are highly specialized for Datalog evaluation, and (3) can accommodate the different workloads of different applications with respect to memory consumption and computational efficiency. In this article, we present a data structure framework for relations that is specialized for shared-memory parallel Datalog implementations such as the Soufflé Datalog compiler. The framework permits a portfolio of different data structures, chosen according to the workload. We also introduce two concrete parallel data structures for relations, designed for various workloads. Our benchmarks demonstrate a speed-up of up to 6x when using a portfolio of data structures compared with using a B-tree alone, showing the advantage of our data structure framework.
Year
2022
DOI
10.1002/cpe.5643
Venue
Concurrency and Computation: Practice and Experience
Keywords
B-tree, Datalog, parallel data structure, Trie
DocType
Journal
Volume
34
Issue
2
ISSN
1532-0626
Citations
0
PageRank
0.34
References
26
Authors
4
Name | Order | Citations | PageRank
Herbert Jordan | 1 | 70 | 11.83
Pavle Subotić | 2 | 0 | 0.34
David Zhao | 3 | 0 | 0.34
Bernhard Scholz | 4 | 104 | 10.59