特征选择
计算机科学
特征(语言学)
水准点(测量)
人工智能
分类器(UML)
进化计算
模式识别(心理学)
灵活性(工程)
进化算法
人口
数据挖掘
机器学习
数学
统计
地理
社会学
大地测量学
人口学
语言学
哲学
作者
Hainan Guo,Jie Ma,Ruiqi Wang,Yu Zhou
标识
DOI:10.1016/j.asoc.2023.110241
摘要
In recent years, wrapper-based feature selection (FS) using evolutionary algorithms has been widely studied due to its ability to search for and evaluate subsets of features based on populations. However, these methods often suffer from a high computational cost and a long computation time, mainly due to the process of evaluating the feature subsets according to the classification performance. In order to tackle this problem, this paper presents a feature library-assisted surrogate model (FL-SM), which aims to reduce the computational cost but maintain a good prediction accuracy. Unlike the existing surrogate models used in FS, the proposed method focuses on the feature level instead of the sample level: an FL is built by collecting the scores of all the features during the evolutionary search. Specifically, each solution (subset candidate) is pre-evaluated based on the FL using only simple operations to decide whether or not it deserves to be evaluated by the classifier, improving the efficiency of the FS algorithm. Meanwhile, because not evaluating a certain number of solutions may lead to inaccurate solution selection during the evolutionary search, dynamic individual selection criteria are proposed. In addition, an adaptive FL update operator is proposed to handle the dynamics of the evolved population; it ensures the real-time validity of the FL. Furthermore, we incorporate the proposed FL-SM into some state-of-the-art single- and multi-objective evolutionary FS methods. The experimental results on benchmark datasets show that with good flexibility and extendibility, FL-SM can effectively reduce the computational cost of wrapper-based FS and still obtain high-quality feature subsets. Among the five algorithms tested, the average computation time reduction was 34.87%; at the same time, there was no significant difference in the classification accuracy for 80% of the tests, and our method even improved the classification accuracy for 6% of the tests.
科研通智能强力驱动
Strongly Powered by AbleSci AI