Article Information

  • Title: Deep Ensemble Machine Learning Framework for the Estimation of PM2.5 Concentrations
  • Authors: Wenhua Yu ; Shanshan Li ; Tingting Ye
  • Journal: Environmental Health Perspectives
  • Print ISSN: 0091-6765
  • Electronic ISSN: 1552-9924
  • Publication year: 2022
  • Volume: 130
  • Issue: 3
  • DOI: 10.1289/EHP9752
  • Language: English
  • Publisher: OCR Subscription Services Inc
  • Abstract: Background: Accurate estimation of historical PM2.5 (particulate matter with an aerodynamic diameter of less than 2.5 μm) is essential for environmental health risk assessment. Objectives: The aim of this study was to develop a multiple-level stacked ensemble machine learning framework for improving the estimation of daily ground-level PM2.5 concentrations. Methods: An innovative deep ensemble machine learning framework (DEML) was developed to estimate daily PM2.5 concentrations. The framework has a three-stage structure: At the first stage, four base models [gradient boosting machine (GBM), support vector machine (SVM), random forest (RF), and eXtreme gradient boosting (XGBoost)] were used to generate a new data set of PM2.5 concentrations for training the next-stage learners. At the second stage, three meta-models [RF, XGBoost, and Generalized Linear Model (GLM)] were used to estimate PM2.5 concentrations using a combination of the original data set and the predictions from the first-stage models. At the third stage, a nonnegative least squares (NNLS) algorithm was employed to obtain the optimal weights for PM2.5 estimation. We took the data from 133 monitoring stations in Italy as an example to implement the DEML to predict daily PM2.5 at each 1 km × 1 km grid cell from 2015 to 2019 across Italy. We evaluated the model performance by performing 10-fold cross-validation (CV) and compared it with five benchmark algorithms [GBM, SVM, RF, XGBoost, and Super Learner (SL)]. Results: The results revealed that the PM2.5 prediction performance of DEML [coefficient of determination (R²) = 0.87 and root mean square error (RMSE) = 5.38 μg/m³] was superior to that of every benchmark model (with R² of 0.51, 0.76, 0.83, 0.70, and 0.83 for the GBM, SVM, RF, XGBoost, and SL approaches, respectively). DEML displayed reliable performance in capturing the spatiotemporal variations of PM2.5 in Italy. Discussion: The proposed DEML framework achieved outstanding performance in PM2.5 estimation and could be used as a tool for more accurate environmental exposure assessment. https://doi.org/10.1289/EHP9752
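The third stage described in the abstract (nonnegative weights that combine the meta-model predictions) and the two reported metrics (R² and RMSE) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy observations, the three prediction columns, the helper names `rmse`, `r_squared`, and `nnls_weights`, and the choice of projected gradient descent as the solver. It is not the authors' implementation, and the abstract does not say which NNLS solver they used.

```python
import math

def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def r_squared(pred, obs):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for p, o in zip(pred, obs))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

def nnls_weights(columns, target, steps=20000):
    """Approximately minimise ||sum_j w_j * columns[j] - target||^2 with w_j >= 0.

    Projected gradient descent is used for brevity; this is an assumption of
    the sketch, not the solver used in the paper.
    """
    k = len(columns)
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    rhs = [sum(a * t for a, t in zip(ci, target)) for ci in columns]
    # trace(gram) bounds the largest eigenvalue, so 1/trace is a safe step size.
    step = 1.0 / sum(gram[j][j] for j in range(k))
    w = [0.0] * k
    for _ in range(steps):
        grad = [sum(gram[i][j] * w[j] for j in range(k)) - rhs[i] for i in range(k)]
        # Gradient step followed by projection onto the nonnegative orthant.
        w = [max(0.0, w[i] - step * grad[i]) for i in range(k)]
    return w

# Hypothetical observations and stage-two predictions (toy values, in ug/m3).
observed = [10.0, 20.0, 30.0, 40.0]
rf_pred = [12.0, 18.0, 32.0, 38.0]
xgb_pred = [8.0, 22.0, 28.0, 42.0]
glm_pred = [15.0, 25.0, 25.0, 35.0]

meta_preds = [rf_pred, xgb_pred, glm_pred]
weights = nnls_weights(meta_preds, observed)
blend = [sum(w * col[i] for w, col in zip(weights, meta_preds))
         for i in range(len(observed))]

for name, pred in [("RF", rf_pred), ("XGBoost", xgb_pred),
                   ("GLM", glm_pred), ("blend", blend)]:
    print(f"{name}: RMSE={rmse(pred, observed):.3f}, R2={r_squared(pred, observed):.3f}")
```

In this toy data the observations equal the mean of the RF and XGBoost columns by construction, so the fitted weights approach 0.5, 0.5, and 0, and the blended RMSE approaches zero. Real monitoring data would not behave this cleanly. Where SciPy is available, `scipy.optimize.nnls` would be a more direct way to solve the same problem.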