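Feature selection
Subspace topology
Partial least squares regression
Random forest
Feature (linguistics)
Sampling (signal processing)
Pattern recognition (psychology)
Selection (genetic algorithm)
Computer science
Stratified sampling
Artificial intelligence
Data mining
Regression
Mathematics
Statistics
Machine learning
Filter (signal processing)
Philosophy
Linguistics
Computer vision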
Authors
Fang Wang,Suxia Ma,Gaowei Yan
Identifier
DOI:10.1016/j.chemolab.2023.104926
Abstract
This paper proposes a partial least squares (PLS)-based random forest (RF) with hybrid feature subspace selection for regression problems. Because the average voting strategy of the basic RF may reduce accuracy, PLS is adopted to automatically assign a voting weight to each tree and aggregate the outputs of all trees. To improve feature subspace selection, stratified sampling is integrated with embedded feature selection. First, the variable importance (VI) of each input feature is obtained through embedded feature selection, and the features are divided into two disjoint sets according to VI. Then, during the construction of the trees, stratified sampling over these two sets is used to select each feature subspace. The effectiveness of PLS aggregation and of hybrid feature selection is validated separately on six regression datasets. The superiority of the proposed RF is demonstrated on historical operation datasets from two power plants through a comparison with five other models.
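The feature subspace step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two VI sets are formed or how many features are drawn from each, so the mean-VI cut-off, the proportional allocation, the function names `split_by_importance` and `stratified_subspace`, and the VI values below are all assumptions made for illustration. PLS aggregation of the tree outputs is not shown.

```python
import random


def split_by_importance(vi, threshold=None):
    """Split feature indices into two disjoint sets by variable importance.

    Assumption: features with VI at or above the mean are "strong",
    the rest are "weak". The abstract does not specify the cut-off.
    """
    if threshold is None:
        threshold = sum(vi) / len(vi)
    strong = [i for i, v in enumerate(vi) if v >= threshold]
    weak = [i for i, v in enumerate(vi) if v < threshold]
    return strong, weak


def stratified_subspace(strong, weak, size, rng):
    """Draw `size` feature indices, sampling each stratum separately.

    Assumption: the allocation is proportional to stratum size, with at
    least one feature taken from each non-empty stratum when size >= 2.
    """
    total = len(strong) + len(weak)
    if not 1 <= size <= total:
        raise ValueError("size must be between 1 and the number of features")
    n_strong = round(size * len(strong) / total)
    if strong:
        n_strong = max(1, n_strong)
    if weak and size >= 2:
        n_strong = min(n_strong, size - 1)
    n_weak = size - n_strong
    # Sample without replacement inside each stratum.
    return sorted(rng.sample(strong, n_strong) + rng.sample(weak, n_weak))


# Hypothetical VI scores for eight input features (illustrative values only).
vi_scores = [0.30, 0.05, 0.25, 0.02, 0.20, 0.08, 0.06, 0.04]
strong_set, weak_set = split_by_importance(vi_scores)

rng = random.Random(0)
# One feature subspace per tree; each tree would be trained on its own subspace.
subspaces = [stratified_subspace(strong_set, weak_set, 4, rng) for _ in range(3)]
print(strong_set, weak_set)
print(subspaces)
```

Sampling each set separately means every tree sees some high-importance features, which plain uniform sampling does not guarantee when informative features are few.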