Title |
---|
Lavoisier: A DSL for increasing the level of abstraction of data selection and formatting in data mining |
Abstract |
---|
The input data of a data mining algorithm must conform to a very specific tabular format. Data scientists arrange data into that format by writing long and complex scripts that perform many low-level operations, a process that can be time-consuming and error-prone. To alleviate this situation, we present Lavoisier, a declarative language for data selection and formatting in a data mining context. Using Lavoisier, script size for data preparation can be reduced by ∼40% on average, and by up to 80% in some cases. Additionally, the accidental complexity present in state-of-the-art technologies is considerably mitigated. |
Year | DOI | Venue |
---|---|---|
2020 | 10.1016/j.cola.2020.100987 | Journal of Computer Languages |
Keywords | DocType | Volume |
---|---|---|
Data selection, Data formatting, Domain-specific languages, Data mining | Journal | 60 |
ISSN | Citations | PageRank |
---|---|---|
2590-1184 | 0 | 0.34 |
References | Authors |
---|---|
0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Alfonso de la Vega | 1 | 3 | 3.76 |
Diego García-Saiz | 2 | 0 | 0.34 |
Marta E. Zorrilla | 3 | 51 | 16.05 |
Pablo Sánchez | 4 | 50 | 12.01 |