Title
Modelling Machine Learning Algorithms on Relational Data with Datalog
Abstract
The standard process for data science tasks is to prepare features inside a database, export them as a denormalized data frame, and then apply machine learning algorithms. This process is suboptimal for two reasons. First, it requires denormalizing the database, which can turn a small data problem into a big data problem. Second, it assumes that the machine learning algorithm is disentangled from the relational model of the problem. This seems to be a serious limitation, since the relational model contains valuable domain expertise. In this paper we explore the use of convex optimization, and specifically linear programming, for modelling machine learning algorithms on relational data in a way that is integrated with data processing operators. We use SolverBlox, a framework that accepts Datalog code as input and feeds it into a linear programming solver. We demonstrate how common machine learning algorithms can be expressed, and we present use case scenarios in which combining data processing with the modelling of optimization problems inside a database offers significant advantages.
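The first shortcoming named in the abstract, that denormalization can inflate a small problem, can be illustrated with a rough sketch. The example below is not taken from the paper: the schema, the table sizes, and every name in it (`customers`, `products`, `purchases`, `N_ATTRS`) are hypothetical, chosen only to compare how many cells a normalized layout stores against the joined data frame a typical pipeline would export.

```python
# Illustrative toy schema; all names and sizes are assumptions, not from the paper.
N_CUSTOMERS, N_PRODUCTS, N_ATTRS = 100, 50, 20

# Dimension tables: key -> tuple of N_ATTRS descriptive attributes.
customers = {c: tuple(f"c{c}_a{k}" for k in range(N_ATTRS)) for c in range(N_CUSTOMERS)}
products = {p: tuple(f"p{p}_a{k}" for k in range(N_ATTRS)) for p in range(N_PRODUCTS)}

# Fact table: (customer_id, product_id, quantity) for half of all pairs.
purchases = [
    (c, p, 1 + (c * p) % 3)
    for c in range(N_CUSTOMERS)
    for p in range(N_PRODUCTS)
    if (c + p) % 2 == 0
]

# Cells stored in the normalized layout: each dimension row holds its key plus
# its attributes, and each fact row holds three values.
normalized_cells = (
    sum(1 + len(attrs) for attrs in customers.values())
    + sum(1 + len(attrs) for attrs in products.values())
    + 3 * len(purchases)
)

# Denormalized data frame: every fact row repeats all attributes of its
# customer and its product.
flat = [(c, *customers[c], p, *products[p], qty) for c, p, qty in purchases]
flat_cells = len(flat) * len(flat[0])

print(f"normalized cells:   {normalized_cells}")
print(f"denormalized cells: {flat_cells}")
print(f"blow-up factor:     {flat_cells / normalized_cells:.1f}x")
```

In this toy setting the joined frame holds roughly ten times as many cells as the normalized tables, and the gap widens as the dimension tables gain attributes or the fact table gains rows.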
Year | DOI | Venue
2018 | 10.1145/3209889.3209893 | DEEM@SIGMOD
Field | DocType | ISBN
Small data, Relational database, Computer science, Algorithm, Artificial intelligence, Solver, Denormalization, Relational model, Optimization problem, Big data, Datalog, Machine learning | Conference | 978-1-4503-5828-6
Citations | PageRank | References
0 | 0.34 | 11
Authors
4
Name | Order | Citations | PageRank
Nantia Makrynioti | 1 | 1 | 2.38
Nikolaos Vasiloglou | 2 | 144 | 8.27
Emir Pasalic | 3 | 192 | 55.42
Vasilis Vassalos | 4 | 1189 | 144.02