Title
Dataset Retrieval System Based On Automation Of Data Preparation With Dataset Description Model
Abstract
Data preparation is the most labor-intensive task in statistical learning. Many data mining studies skip data preparation, assuming that qualified datasets are already available. This omission may hide useful patterns in the data, which can result in poor performance and incorrect learning. Automating data preparation can solve these problems. Automation raises a few issues that should be considered, such as flexible expression of requirements according to the purpose of the learning model, accessibility of data sources, and performance degradation due to automation. In this paper, we propose a dataset description model that can express the requirements for data processing, and a dataset retrieval system based on automated data preparation. The proposed system provides good-quality datasets for statistical learning applications using data preparation methods such as data acquisition, refinement, and organization. In the experiment, we demonstrate that the proposed system incurs no performance loss compared with existing manual systems. Moreover, the quality of the datasets is improved by using the proposed system.
Year: 2021
DOI: 10.1002/cpe.5288
Venue: CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE
Keywords: data preparation, dataset description, dataset retrieval
DocType: Journal
Volume: 33
Issue: 2
ISSN: 1532-0626
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name            Order  Citations  PageRank
Jonghyeok Mun   1      0          0.68
Sanghwan Lee    2      0          0.34
Jongsun Choi    3      6          2.59
Jae-Young Choi  4      783110.19
Kitae Bae       5      3          2.77