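Interpretability
Residual
Gradient boosting
Air quality index
Computer science
Ensemble forecasting
Ensemble learning
Boosting (machine learning)
Mean squared error
Machine learning
Data mining
Artificial intelligence
Random forest
Statistics
Meteorology
Mathematics
Geography
Algorithm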
Authors
Chenliang Tao, Man Jia, Guoqiang Wang, Yuqiang Zhang, Qingzhu Zhang, Xianfeng Wang, Qiao Wang, Wenxing Wang
Identifier
DOI:10.1016/j.jes.2023.02.026
Abstract
Nitrogen dioxide (NO2) poses a critical potential risk to environmental quality and public health. A reliable machine learning (ML) forecasting framework can provide valuable information to support government decision-making. Based on data from 1609 air quality monitors across China from 2014 to 2020, this study designed an ensemble ML model that integrates multiple types of spatial-temporal variables and three sub-models for time-sensitive prediction over a wide range. The ensemble combines the strengths of a Transformer, extreme gradient boosting (XGBoost), and a gated recurrent unit (GRU) network with a residual connection, resulting in a 4.1%±1.0% lower root mean square error than XGBoost on the test results. The ensemble model shows strong prediction performance, with coefficients of determination of 0.91, 0.86, and 0.77 for the 1-hr, 3-hr, and 24-hr averages, respectively, on the test results. In particular, the model achieves excellent performance with low spatial uncertainty in Central, East, and North China, the major site-dense zones. Interpretability analysis based on the Shapley value at different temporal resolutions shows that atmospheric chemical processes contribute more to hourly predictions than to daily-scale predictions, while the impact of meteorological conditions is more prominent for the latter. Compared with existing models for different spatiotemporal scales, the present model can be implemented at any air quality monitoring station across China to provide rapid and dependable NO2 forecasts, which will help in developing effective control policies.
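The evaluation metrics named in the abstract (root mean square error, coefficient of determination, and a relative RMSE reduction against a baseline) can be illustrated with a small sketch. The abstract does not say how the three sub-models are combined, so the unweighted average below is purely an assumption for illustration, and all function names are hypothetical rather than taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def average_ensemble(*predictions):
    """Unweighted mean of sub-model predictions.

    ASSUMPTION: the paper's actual combination rule is not given in the
    abstract; simple averaging is used here only as a placeholder.
    """
    return [sum(values) / len(values) for values in zip(*predictions)]


def relative_rmse_reduction(rmse_baseline, rmse_model):
    """Percentage by which rmse_model is lower than rmse_baseline."""
    return 100.0 * (rmse_baseline - rmse_model) / rmse_baseline


if __name__ == "__main__":
    # Made-up numbers for demonstration only; not data from the study.
    observed = [10.0, 20.0, 30.0, 40.0]
    sub_model_a = [12.0, 18.0, 33.0, 39.0]
    sub_model_b = [9.0, 22.0, 28.0, 42.0]
    sub_model_c = [11.0, 20.0, 29.0, 39.0]

    combined = average_ensemble(sub_model_a, sub_model_b, sub_model_c)
    print("ensemble RMSE:", rmse(observed, combined))
    print("ensemble R2:", r2(observed, combined))
    print(
        "RMSE reduction vs. sub-model A (%):",
        relative_rmse_reduction(rmse(observed, sub_model_a), rmse(observed, combined)),
    )
```

A reported figure such as "4.1% lower RMSE over XGBoost" corresponds to `relative_rmse_reduction` evaluated with the XGBoost RMSE as the baseline.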