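Environmental science
Particulates
Collinearity
Air pollution
Spatial variability
Random forest
Regression analysis
Linear regression
Land use
Meteorology
Statistics
Geography
Mathematics
Computer science
Machine learning
Ecology
Biology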
Authors
Xinyi Song,Ya Gao,Yubo Peng,Sen Huang,Chao Liu,Zhong‐Ren Peng
Identifier
DOI:10.1177/2399808320975031
Abstract
It is challenging to forecast high-resolution spatial-temporal patterns of intra-urban air pollution and to identify impacting factors at the regional scale. Studies have attempted to capture features of air pollutants such as fine particulate matter (PM2.5) and nitrogen dioxide (NO2) using land use regression models, but this method overlooks multi-collinearity among factors and non-linear relationships between factors and air pollutants, and it performs poorly on daily data. Machine learning, by contrast, is a feasible approach for establishing persuasive models of daily intra-urban air pollution variation. In this article, random forest is utilised to establish intra-urban PM2.5 and NO2 spatial-temporal variation models and is compared to the traditional land use regression method. Taking the city of Shanghai, China as the case area, 36 station-measured daily records of PM2.5 and NO2 concentrations over two and a half years were collected. More than 80 predictors associated with meteorological and geographical conditions, transportation, community population density, land use and points of interest were used to construct the land use regression and random forest models. Results from the two methods are compared and impacting factors are identified. Explained variance (R²) is used to quantify and compare model performance. The final land use regression model explains 49.3% and 42.2% of the spatial variation in ambient PM2.5 and NO2, respectively, whereas the random forest model explains 78.1% and 60.5% of the variance. Regression mappings for unsampled sites on a 1 km × 1 km grid are also implemented. Overall, the findings suggest that the random forest approach offers a robust improvement in predictive performance over the land use regression model when estimating daily spatial variations in ambient PM2.5 and NO2.
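The abstract compares the two models by explained variance (R²) but does not spell out the formula. The sketch below assumes the standard coefficient of determination, 1 − SS_res / SS_tot; the function name `r_squared` and all concentration values are hypothetical and invented purely for illustration, not taken from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: variation the model fails to explain.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variation of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical daily PM2.5 values (µg/m³); NOT data from the paper.
observed = [35.0, 42.0, 50.0, 61.0, 48.0]
model_a = [38.0, 40.0, 47.0, 55.0, 52.0]  # stand-in for a weaker model
model_b = [34.0, 43.0, 51.0, 59.0, 47.0]  # stand-in for a stronger model

print("model_a R^2:", round(r_squared(observed, model_a), 3))
print("model_b R^2:", round(r_squared(observed, model_b), 3))
```

Under this definition, a model that always predicts the observed mean scores 0 and a perfect model scores 1, so the reported 78.1% versus 49.3% for PM2.5 would mean the random forest leaves considerably less of the variance unexplained.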