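Mean squared error
Partial least squares regression
Sampling (signal processing)
Mathematics
Statistics
Soil organic matter
Environmental science
Soil science
Computer science
Pattern recognition (psychology)
Artificial intelligence
Soil moisture
Computer vision
Filter (signal processing)
Authors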
Hongyi Li,Yuheng Li,Mingyong Yang,Songchao Chen,Zhou Shi
Abstract
Soil function degradation threatens the sustainable management of soil resources, and soil organic matter (SOM) is a vital factor. Powerful measuring tools will become very important, especially in areas where data are poor or absent. The China Soil Visible and Near Infrared (vis–NIR) Spectroscopy Library (CSSL) archive could help provide a solution for fast, less costly measurement of SOM. The aim of this article was to compare SOM prediction performance under three strategies: i) general global partial least squares regression (PLSR) using CSSL with and without spiking samples; ii) memory‐based learning (MBL) using CSSL with and without spiking samples; and iii) general PLSR using only spiking samples to predict SOM in the target area. When using spiked subsets, we also investigated the prediction performance of the extra‐weighted (several copies) subsets. A series of spiking subsets were randomly selected from the total spiking samples, which were selected by conditioned Latin hypercube sampling (cLHS) from the target sites. We calculated the mean squared Euclidean distance (msd) between the estimated density functions (pds) of the principal components (PCs) of the vis–NIR spectra from the validation dataset and from the spiking subsets, and statistically inferred the optimal sampling set size to be 30. Our study showed that global PLSR using CSSL spiked with the statistically optimal local samples can achieve higher prediction performance [with a mean root mean square error (RMSE) of 5.75]. MBL spiked with five extra‐weighted optimal spiking samples achieved the best accuracy, with an RMSE of 3.98, an R² of 0.70, a bias of 0.04, and an LCCC of 0.81. The msd is a simple and effective method to determine an adequate spiking set size using only vis–NIR data.
These accurate predictions demonstrated the usefulness of statistically representative spiking and MBL in applying large soil spectral libraries to SOM determination, an approach that is currently lacking for the large soil spectral libraries in use.
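The accuracy statistics reported above (RMSE, R², bias, and LCCC) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `accuracy_metrics` is hypothetical, R² is assumed to be the coefficient of determination (1 − SSE/SST) rather than a squared correlation, bias is assumed to be the mean of predicted minus observed values, and LCCC is assumed to be Lin's concordance correlation coefficient computed with population (1/n) variances.

```python
import math


def accuracy_metrics(observed, predicted):
    """Return RMSE, R2, bias and LCCC for paired observed/predicted values.

    Assumed definitions (not confirmed by the abstract):
      RMSE = sqrt(mean((pred - obs)^2))
      bias = mean(pred - obs)
      R2   = 1 - SSE / SST (coefficient of determination)
      LCCC = 2*cov / (var_obs + var_pred + (mean_obs - mean_pred)^2)
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    errors = [p - o for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)

    # Population (1/n) variances and covariance, as used in Lin's coefficient.
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    var_pred = sum((p - mean_pred) ** 2 for p in predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted)) / n

    if var_obs == 0:
        raise ValueError("observed values are constant; R2 and LCCC are undefined")

    return {
        "rmse": math.sqrt(sse / n),
        "bias": sum(errors) / n,
        "r2": 1.0 - sse / (n * var_obs),
        "lccc": 2.0 * cov / (var_obs + var_pred + (mean_obs - mean_pred) ** 2),
    }


# Hypothetical values, for illustration only (not data from the study).
example = accuracy_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

In this sketch a constant offset between predictions and observations lowers LCCC even when the two series are perfectly correlated, which is one reason a concordance measure is often reported alongside R².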