Article Information

Title: Evaluation of Boosted Regression Tree for the Prediction of the Maximum 24-Hour Concentration of Particulate Matter
Authors: Wan Nur Shaziayani (Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, 13500 Permatang Pauh, Malaysia); Ahmad Zia Ul-Saufie; et al.
Journal: International Journal of Environmental Science and Development
Print ISSN: 2010-0264
Year of publication: 2021
Volume: 12
Issue: 4
Pages: 126-130
DOI: 10.18178/ijesd.2021.12.4.1329
Abstract: Air pollution poses a considerable danger to health and the environment. The objective of this study was to assess the characteristics of air quality and to predict PM10 concentrations using boosted regression trees (BRTs). Maximum daily PM10 concentration data from 2002 to 2016 were obtained from the air quality monitoring station in Kuching, Sarawak. Eighty percent of the monitoring records were used for training and twenty percent for validation of the models. The best iteration of the BRT model was selected by optimizing prediction performance, while the BRT model itself was built by combining multiple regression models. The two main parameters, the learning rate (lr) and tree complexity (tc), were fixed at 0.01 and 5, respectively, while the number of trees (nt) was determined using an independent test set (test), 5-fold cross-validation (CV), and out-of-bag (OOB) estimation. The BRT model produced using CV was a better guide than the one produced using OOB for testing the predicted PM10 concentration. The performance indicators showed that the model was adequate for next-day prediction (PA = 0.638, R² = 0.427, IA = 0.749, NAE = 0.267, and RMSE = 28.455).
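The abstract reports performance indicators without defining them. As a rough illustration only, the sketch below computes four of them using definitions that are common in the air quality forecasting literature: root mean square error (RMSE), normalized absolute error (NAE), Willmott's index of agreement (IA), and the coefficient of determination taken as the squared Pearson correlation. These definitions are assumptions and may differ from the ones used in the paper (for example, some authors divide by n - 1 rather than n in RMSE). Prediction accuracy (PA) is left out because its definition varies between papers. The function names and the sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error, dividing by n (assumed convention)."""
    n = len(obs)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / n)


def nae(obs, pred):
    """Normalized absolute error: total absolute error over total observed."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / sum(obs)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement (1 means perfect agreement)."""
    o_mean = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2
              for o, p in zip(obs, pred))
    return 1.0 - num / den


def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(obs)
    o_mean = sum(obs) / n
    p_mean = sum(pred) / n
    cov = sum((o - o_mean) * (p - p_mean) for o, p in zip(obs, pred))
    var_o = sum((o - o_mean) ** 2 for o in obs)
    var_p = sum((p - p_mean) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


# Hypothetical observed and predicted daily maximum PM10 values,
# made up for illustration; they are not data from the study.
observed = [40.0, 55.0, 70.0, 90.0]
predicted = [45.0, 50.0, 75.0, 80.0]

for name, fn in [("RMSE", rmse), ("NAE", nae),
                 ("IA", index_of_agreement), ("R2", r_squared)]:
    print(f"{name} = {fn(observed, predicted):.3f}")
```

Under these definitions, lower RMSE and NAE indicate smaller errors, while IA and the coefficient of determination approach 1 as predictions track the observations more closely.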
Keywords: Accuracy measures; air pollution; boosted regression trees; PM10; regression.