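Chemistry
Stacking
Data preprocessing
Preprocessor
Ensemble learning
Regression
Data mining
Pattern recognition (psychology)
Artificial intelligence
Statistics
Organic chemistry
Mathematics
Computer science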
Authors
Haowen Huang,Zile Fang,Yuelong Xu,Guosheng Lu,Can Feng,Min Zeng,Jiaju Tian,Yongfu Ping,Zhuolin Han,Zhigang Zhao
Source
Journal: Talanta
[Elsevier]
Date: 2024-05-11
Volume/Issue: 276: 126242-126242
Citations: 1
Identifier
DOI:10.1016/j.talanta.2024.126242
Abstract
Spectral preprocessing techniques can, to a certain extent, remove irrelevant information such as current noise and stray light from spectral data, thereby enhancing the performance of prediction models. However, current approaches mostly seek the single best preprocessing method or combination of methods, overlooking the complementary information among different preprocessing methods. As a result, they fail to make full use of the useful information in spectral data and restrict the performance of prediction models. To address these issues, this study proposed a spectral ensemble preprocessing method, named stacking preprocessing ridge regression (SPRR), based on the ridge regression (RR) model and the ensemble learning methods that have developed rapidly in recent years. Unlike conventional ensemble learning methods, SPRR applied multiple different preprocessing techniques to the original spectral data, generating multiple preprocessed datasets. Each dataset was then fed into its own RR base model for training. Finally, RR also served as the meta-model, integrating the outputs of the RR base models through stacking. This approach not only produced diversity among the base models but also achieved higher accuracy and lower computational complexity by using a single type of base model. On the apple spectral dataset collected by our team, correlation analysis showed significant complementary information among the data produced by different preprocessing techniques, which provided robust theoretical support for the proposed SPRR method.
In comparative experiments on six datasets (apple, meat, wheat, olive oil, tablet, and corn), with the currently popular averaging ensemble preprocessing method included as a baseline, SPRR yielded the best accuracy and reliability on all six datasets compared with both the single preprocessing methods and the averaging ensemble preprocessing method. Furthermore, with the same training and test datasets, the proposed SPRR method performed better than the four commonly used ensemble preprocessing methods.
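
The stacking scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The three preprocessing functions (raw spectra, standard normal variate, finite-difference first derivative), the function names `fit_sprr` and `predict_sprr`, the shared `alpha` value, and the use of out-of-fold predictions to train the meta-model are all assumptions chosen for illustration; the abstract does not specify which preprocessing methods, hyperparameters, or cross-validation scheme were used. The data are synthetic.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_predict


def raw(X):
    # No preprocessing: pass the spectra through unchanged.
    return X


def snv(X):
    # Standard normal variate: center and scale each spectrum (row).
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)


def first_derivative(X):
    # Finite-difference first derivative along the wavelength axis.
    return np.diff(X, axis=1)


# Hypothetical preprocessing set; the paper's actual choices are not given here.
PREPROCESSORS = {"raw": raw, "snv": snv, "d1": first_derivative}


def fit_sprr(X, y, alpha=1.0, n_splits=5, seed=0):
    """Train one ridge base model per preprocessing, then a ridge meta-model."""
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    base_models, meta_features = {}, []
    for name, prep in PREPROCESSORS.items():
        Xp = prep(X)
        # Out-of-fold predictions (an assumption) so the meta-model is not
        # trained on predictions for samples the base model has already seen.
        meta_features.append(cross_val_predict(Ridge(alpha=alpha), Xp, y, cv=cv))
        base_models[name] = Ridge(alpha=alpha).fit(Xp, y)
    meta_model = Ridge(alpha=alpha).fit(np.column_stack(meta_features), y)
    return base_models, meta_model


def predict_sprr(base_models, meta_model, X):
    """Stack the base-model predictions and pass them to the meta-model."""
    Z = np.column_stack(
        [model.predict(PREPROCESSORS[name](X)) for name, model in base_models.items()]
    )
    return meta_model.predict(Z)


# Synthetic "spectra" for demonstration only (60 samples, 50 wavelengths).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 50)).cumsum(axis=1)
y = X[:, 10] - 0.5 * X[:, 30] + rng.normal(scale=0.1, size=60)

base_models, meta_model = fit_sprr(X[:45], y[:45])
y_pred = predict_sprr(base_models, meta_model, X[45:])
```

Because every base model and the meta-model are ridge regressions, the only source of diversity in this sketch is the preprocessing applied to the input, which mirrors the design choice the abstract emphasizes. The learned `meta_model.coef_` gives one weight per preprocessing branch, whereas an averaging ensemble would fix those weights to be equal.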