Abstract | ||
---|---|---|
The computer science community is paying increasing attention to data because of its crucial role in analysis and prediction. Researchers have proposed many data containers, such as files, databases, data warehouses, cloud systems, and, within the last decade, data lakes. Data lakes hold data in its native format, which makes them suitable for massive data prediction, particularly in real-time application development. Although data lakes are well adopted in industry, their acceptance by the research community is still in its infancy. This paper sheds light on existing work on analysis and prediction over data stored in data lakes. Our study reveals the data management steps that a decision process needs to follow and the requirements to be respected, namely curation, quality evaluation, privacy preservation, and prediction. The study aims to categorize and analyze the proposals related to each of these steps. |
Year | DOI | Venue |
---|---|---|
2021 | 10.1145/3472163.3472173 | IDEAS |
Keywords | DocType | ISSN |
Data management, Data lake, Systematic mapping | Conference | 1098-8068 |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Firas Zouari | 1 | 0 | 0.34 |
Nadia Kabachi | 2 | 0 | 0.34 |
Khouloud Boukadi | 3 | 145 | 27.98 |
Chirine Ghedira Guegan | 4 | 11 | 8.03 |