Title
Empirical Analysis of Tree-Based Models for PM2.5 Concentration Prediction
Abstract
Air pollution is an important issue that directly affects human health. In particular, particulate matter (PM), one of the major components of air pollution, is produced by automobiles and factories. As global interest in PM concentration increases, it is important to improve the prediction accuracy of PM2.5 concentration. In this paper, we use tree-based models such as random forests, XGBoost, and LightGBM to predict PM2.5 concentration. Despite the promising results of tree-based models, two challenges remain: how to handle missing values and how to use the output predicted by an existing mathematical model. To address them, we exclude all missing data so that training does not require sophisticated missing-value processing, and we show that prediction performance on the concatenation of the observation data and the output predicted by the Community Multiscale Air Quality (CMAQ) model is better than on the observation data alone. The experimental results show that the XGBoost model outperforms the other tree-based models, with a root mean square error of 13.4977 and a mean absolute error of 10.1392.
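The data handling and error metrics described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the row layout, and the toy values are all assumptions, and model training itself is omitted.

```python
import math


def concat_features(obs_rows, cmaq_rows):
    """Join each observation row with the CMAQ output row for the same time step.

    Assumes (hypothetically) that both lists are already aligned one-to-one.
    """
    if len(obs_rows) != len(cmaq_rows):
        raise ValueError("observation and CMAQ rows must align one-to-one")
    return [list(o) + list(c) for o, c in zip(obs_rows, cmaq_rows)]


def drop_missing(rows):
    """Keep only rows in which every value is present (neither None nor NaN)."""
    def present(v):
        return v is not None and not (isinstance(v, float) and math.isnan(v))
    return [row for row in rows if all(present(v) for v in row)]


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (made up for illustration): two observed features and one CMAQ output per row.
obs = [[12.0, 3.1], [None, 2.0], [20.0, 1.5]]
cmaq = [[15.0], [18.0], [22.0]]

combined = concat_features(obs, cmaq)
clean = drop_missing(combined)  # the row containing None is excluded

# Toy targets and predictions, only to exercise the two metrics.
print(rmse([10.0, 20.0], [13.0, 16.0]))
print(mae([10.0, 20.0], [13.0, 16.0]))
```

In this sketch the rows are concatenated before filtering, so a missing value in either source removes the whole row. The resulting feature matrix could then be passed to any tree-based regressor.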
Year
2019
DOI
10.1109/ICSPCS47537.2019.9008645
Venue
2019 13th International Conference on Signal Processing and Communication Systems (ICSPCS)
Keywords
Machine Learning, Predictive data, PM2.5 concentration prediction
DocType
Conference
ISBN
978-1-7281-2195-6
Citations
0
PageRank
0.34
References
2
Authors
7
Name | Order | Citations | PageRank
Juhyun Lee | 1 | 0 | 0.34
Yoojin Hong | 2 | 0 | 0.34
Younkwan Lee | 3 | 0 | 0.34
Hyun Soo Kim | 4 | 0 | 0.34
Chul Han Song | 5 | 0 | 0.34
Du Yong Kim | 6 | 139 | 17.96
Moongu Jeon | 7 | 456 | 72.81