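Mean squared error
Random forest
Ordinary least squares
Statistics
Covariate
Kriging
Predictive modelling
Lasso (programming language)
Environmental science
Self-organizing map
Mathematics
Computer science
Artificial intelligence
Cluster analysis
World Wide Web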
Authors
Yun Jiang, Fupeng Li, Yufeng Gong, Xiuyuan Yang, Guiting Mu
Abstract
Aims: Accurately predicting the spatial distribution of soil organic matter (SOM) is essential for environmental management and carbon storage estimation. However, the diversity of variable sources makes the spatial distribution of SOM difficult to study.
Methods: To address this issue, we propose combining multiple environmental variables with machine learning models, specifically the Light Gradient Boosting Machine (LightGBM) and random forest (RF), to predict the spatial distribution of SOM. A total of 128 soil samples were collected from the Caohai National Nature Reserve, and their SOM content was measured.
Results: The average SOM content was 36.75 g/kg. Compared with the traditional models, namely ordinary kriging (OK), ordinary least squares (OLS), and geographically weighted regression (GWR), the nonlinear machine learning models LightGBM and RF achieved higher cross-validated coefficients of determination (R²) of 0.62 and 0.60, respectively. RF also exhibited lower mean absolute error (MAE) and root mean square error (RMSE), indicating higher stability and generalization capability. The spatial distribution of SOM was consistent across the models, with higher SOM content in the southern regions and near Caohai Lake, and lower SOM content in the northern regions and farther from the lake. Shapley additive explanations (SHAP) highlighted agricultural land (AL), pH, and elevation (ELV) as the primary covariates influencing the spatial distribution of SOM.
Conclusions: This study provides valuable insights and support for environmental management and carbon storage estimation in the karst plateau region.
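The R², MAE, and RMSE values quoted in the abstract are standard regression metrics. As a minimal sketch, assuming paired lists of observed and predicted SOM values, they could be computed as follows. The function name `regression_metrics` and the sample numbers are hypothetical illustrations, not the authors' code or data.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, MAE, RMSE) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Made-up SOM values in g/kg, for illustration only (not data from the study).
observed = [30.0, 40.0, 35.0, 45.0]
predicted = [32.0, 38.0, 36.0, 44.0]
r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.2f} MAE={mae:.2f} RMSE={rmse:.2f}")
```

In a cross-validated setting such as the one the abstract describes, `predicted` would hold out-of-fold predictions, so each sample is scored by a model that did not see it during training.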