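Stacking
Overfitting
Ensemble learning
Computer science
Genetic algorithm
Artificial intelligence
Random forest
Machine learning
Selection (genetic algorithm)
Support vector machine
Ensemble forecasting
Pattern recognition (psychology)
Algorithm
Artificial neural network
Chemistry
Organic chemistry
Authors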
X. Q. Hao,Zhengguang Chen,Shujuan Yi,Jinming Liu
Identifier
DOI:10.1016/j.chemolab.2023.105020
Abstract
Stacking ensemble learning is one of the most effective model-integration techniques and is increasingly applied to near-infrared spectroscopy combined with chemometric methods. The prediction accuracy of Stacking depends primarily on which models are selected. However, in many current studies the model combinations are chosen manually, which affects prediction accuracy, makes the algorithm harder to configure, and makes it difficult to find the optimal configuration efficiently and accurately. This study applies a genetic algorithm to find the optimal combination of base and meta learners in Stacking ensemble learning. The method is evaluated on a near-infrared spectral data set of corn seed germination rate. First, the best pretreatment method is selected for each of seven models, including Gaussian process regression (GPR), support vector regression (SVR), partial least squares (PLS), etc. The seven pretreated single learners serve as candidate base learners, and random forest (RF), SVR, PLS, and GPR serve as candidate meta learners; a genetic algorithm then selects the optimal model combination, yielding the GA-Stacking algorithm. The predictions of GA-Stacking are compared with those of several single models and of Stacking ensembles whose model combinations were selected manually. The results show that the GA-Stacking ensemble learning model achieves the best prediction performance, with R² of 0.9022 and RMSE of 0.1100. The experiment shows that the model combination selected by the genetic algorithm significantly improves the prediction performance of the Stacking ensemble learning model and reduces the risk of overfitting.
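The selection step described in the abstract can be pictured as a search over a small chromosome: one bit per candidate base learner plus an index for the meta learner. The following is a minimal sketch of that idea, not the authors' implementation. The encoding, the operators (one-point crossover, bit-flip mutation, tournament selection, elitism), and every function name are assumptions. The abstract names only GPR, SVR, and PLS among the seven base candidates, so `M4` to `M7` are placeholders. `toy_fitness` is a stand-in; in practice it would presumably be replaced by something like the cross-validated RMSE of the decoded Stacking model.

```python
import random

# Candidate pools. The abstract names only GPR, SVR and PLS among the seven
# base candidates; "M4".."M7" are placeholders, not the paper's models.
BASE = ["GPR", "SVR", "PLS", "M4", "M5", "M6", "M7"]
META = ["RF", "SVR", "PLS", "GPR"]


def repair(chrom, rng):
    """Ensure at least one base learner is switched on."""
    if not any(chrom[:-1]):
        chrom[rng.randrange(len(BASE))] = 1
    return chrom


def random_chromosome(rng):
    """One bit per base learner, followed by a meta-learner index."""
    bits = [rng.randint(0, 1) for _ in BASE]
    return repair(bits + [rng.randrange(len(META))], rng)


def decode(chrom):
    """Translate a chromosome into (base learner names, meta learner name)."""
    bases = [name for name, bit in zip(BASE, chrom[:-1]) if bit]
    return bases, META[chrom[-1]]


def crossover(a, b, rng):
    """One-point crossover of two parent chromosomes."""
    cut = rng.randrange(1, len(a))
    return repair(a[:cut] + b[cut:], rng)


def mutate(chrom, rng, rate=0.1):
    """Flip base-learner bits and resample the meta index with probability `rate`."""
    child = chrom[:]
    for i in range(len(BASE)):
        if rng.random() < rate:
            child[i] ^= 1
    if rng.random() < rate:
        child[-1] = rng.randrange(len(META))
    return repair(child, rng)


def ga_select(fitness, pop_size=20, generations=30, seed=0):
    """Minimise fitness(chromosome) using tournament selection and elitism."""
    rng = random.Random(seed)
    pop = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        nxt = pop[:2]  # elitism: carry the two best forward unchanged
        while len(nxt) < pop_size:
            p1 = min(rng.sample(pop, 3), key=fitness)
            p2 = min(rng.sample(pop, 3), key=fitness)
            nxt.append(mutate(crossover(p1, p2, rng), rng))
        pop = nxt
    return min(pop, key=fitness)


def toy_fitness(chrom):
    """Stand-in for a validation error: mismatch count against an arbitrary target."""
    target = [1, 0, 1, 0, 0, 1, 0, 2]
    return sum(x != y for x, y in zip(chrom, target))


best = ga_select(toy_fitness)
bases, meta = decode(best)
print(bases, meta)
```

Keeping the fitness function as a plain callable means the search loop does not depend on any particular modelling library; swapping in a real error estimate only requires decoding the chromosome, fitting the corresponding Stacking model, and returning its validation error.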