Title
Data Management in the Data Lake: A Systematic Mapping
Abstract
The computer science community is paying increasing attention to data because of its crucial role in analysis and prediction. Researchers have proposed many data containers, such as files, databases, data warehouses, cloud systems, and, within the last decade, data lakes. A data lake holds data in its native format, which makes it suitable for massive data prediction, particularly in real-time application development. Although data lakes are well adopted in industry, their acceptance by the research community is still in its infancy. This paper sheds light on existing work on analysis and prediction over data stored in data lakes. Our study identifies the data management steps that a decision process needs to follow and the requirements it must respect, namely curation, quality evaluation, privacy preservation, and prediction. The study categorizes and analyzes the proposals related to each of these steps.
Year
2021
DOI
10.1145/3472163.3472173
Venue
IDEAS
Keywords
Data management, Data lake, Systematic mapping
DocType
Conference
ISSN
1098-8068
Citations
0
PageRank
0.34
References
0
Authors
4
Name; Order; Citations; PageRank
Firas Zouari; 1; 0; 0.34
Nadia Kabachi; 2; 0; 0.34
Khouloud Boukadi; 3; 145; 27.98
Chirine Ghedira Guegan; 4; 118.03