Abstract | ||
---|---|---|
Applying data mining and machine learning algorithms requires many steps to prepare data and to make use of modeling results. This study investigates two questions: (1) how time consuming are the pre- and post-processing steps? (2) how much research energy is spent on these steps? To answer these questions I surveyed practitioners about their experiences in applying modeling techniques and categorized data mining and machine learning research papers from 2009 according to the modeling step(s) they addressed. Survey results show that model building consumes only 14% of the time spent on a typical project; the remaining time is spent on pre- and post-processing steps. Both survey responses and the categorization of research papers show that data mining and machine learning researchers spend the majority of their energy on algorithms for constructing models and significantly less energy on other steps. These findings collectively suggest that there are research opportunities to simplify the steps that precede and follow model building. |
Year | DOI | Venue |
---|---|---|
2011 | 10.1145/2207243.2207253 | SIGKDD Explorations |
Keywords | DocType | Volume |
data mining, modeling step, research energy, post-processing step, different modeling step, model building, research paper, remaining time, modeling technique, research opportunity, time consuming, machine learning, categorical data | Journal | 13 |
Issue | Citations | PageRank |
2 | 6 | 0.51 |
References | Authors | |
3 | 1 | |
Name | Order | Citations | PageRank |
---|---|---|---|
M. Arthur Munson | 1 | 28 | 1.81 |