Authors
Shijie Qian,Peng Tian,Zihan Tao,Xi Li,Muhammad Shahzad Nazir,Chu Zhang
Abstract
Accurate prediction of air quality is crucial for ensuring the scientific validity and effectiveness of air pollution control measures. This study proposes a combined deep learning (DL) model (XGBoost-GDA-TCN-IMRFO-GRU) for predicting hourly air quality index (AQI) data in four cities. The model integrates extreme gradient boosting (XGBoost) for feature selection, Gaussian data augmentation (GDA), an improved manta ray foraging optimization (IMRFO) algorithm, a temporal convolutional network (TCN), and a gated recurrent unit (GRU). XGBoost scores the pollutants that affect AQI, and the four highest-ranked pollutants (PM2.5, PM10, NO2, O3) are selected as input features. GDA improves the robustness of the DL models and mitigates the problems of insufficient training data and overfitting. The IMRFO algorithm, which incorporates two improvement strategies, is applied to enhance the GRU model. TCN extracts spatiotemporal features of AQI, while GRU builds a temporal model with efficient computation. Compared with eleven benchmark models, the proposed model achieves the best performance in terms of mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), and Nash-Sutcliffe efficiency (NSE). Specifically, the XGBoost-GDA-TCN-IMRFO-GRU model reduces RMSE, MAE, and MAPE by 33-60%, 39-68%, and 39-66%, respectively, compared to the TCN model. The XGBoost-GDA-TCN-IMRFO-GRU model can therefore provide reliable early warnings for air quality, which is of great significance for air pollution prevention and the sustainable development of society.
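To make the four evaluation metrics and the idea of Gaussian data augmentation concrete, the following is a minimal NumPy sketch. The metric functions follow the standard textbook definitions of MAE, RMSE, MAPE, and NSE. The `gaussian_augment` function is an assumption for illustration only: the abstract does not specify how GDA is implemented, so the noise scale `sigma`, the number of `copies`, and the per-feature scaling are hypothetical choices, not the authors' settings.

```python
import numpy as np


def _as_arrays(y_true, y_pred):
    """Convert inputs to float arrays of matching shape."""
    return np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)


def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = _as_arrays(y_true, y_pred)
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true, y_pred = _as_arrays(y_true, y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    y_true, y_pred = _as_arrays(y_true, y_pred)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def nse(y_true, y_pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean predictor."""
    y_true, y_pred = _as_arrays(y_true, y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def gaussian_augment(x, sigma=0.01, copies=1, seed=0):
    """Hypothetical GDA sketch: append noisy copies of the samples in x.

    Zero-mean Gaussian noise is scaled by sigma times each feature's
    standard deviation. These choices are assumptions, not the paper's.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    scale = sigma * x.std(axis=0)
    noisy = [x + rng.normal(0.0, 1.0, size=x.shape) * scale for _ in range(copies)]
    return np.concatenate([x] + noisy, axis=0)


if __name__ == "__main__":
    # Made-up AQI values, used only to exercise the functions.
    observed = [100.0, 200.0, 300.0, 400.0]
    predicted = [110.0, 190.0, 310.0, 390.0]
    print(mae(observed, predicted), rmse(observed, predicted))
    print(mape(observed, predicted), nse(observed, predicted))

    # Rows are samples; columns stand in for two pollutant features.
    features = np.array([[35.0, 60.0], [40.0, 72.0], [28.0, 55.0], [50.0, 90.0]])
    print(gaussian_augment(features, sigma=0.05, copies=2).shape)
```

Lower MAE, RMSE, and MAPE indicate a better fit, while NSE is better the closer it is to 1, which is why the abstract reports percentage reductions only for the first three metrics.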