Title
Lavoisier: A DSL for increasing the level of abstraction of data selection and formatting in data mining
Abstract
The input data of a data mining algorithm must conform to a very specific tabular format. Data scientists arrange data into that format by writing long and complex scripts that perform many low-level operations, which is a time-consuming and error-prone process. To alleviate this situation, we present Lavoisier, a declarative language for data selection and formatting in a data mining context. Using Lavoisier, the size of data preparation scripts can be reduced by about 40% on average, and by up to 80% in some cases. Additionally, the accidental complexity present in state-of-the-art technologies is considerably mitigated.
Year
2020
DOI
10.1016/j.cola.2020.100987
Venue
Journal of Computer Languages
Keywords
Data selection, Data formatting, Domain-specific languages, Data mining
DocType
Journal
Volume
60
ISSN
2590-1184
Citations
0
PageRank
0.34
References
0
Authors
4
Name                   Order  Citations  PageRank
Alfonso de la Vega     1      3          3.76
Diego García-Saiz      2      0          0.34
Marta E. Zorrilla      3      51         16.05
Pablo Sánchez          4      50         12.01