Feature library-assisted surrogate model for evolutionary wrapper-based feature selection and classification

特征选择计算机科学特征（语言学）水准点（测量）人工智能分类器（UML）进化计算模式识别（心理学）灵活性（工程）进化算法人口数据挖掘机器学习数学统计地理社会学大地测量学人口学语言学哲学

作者

Hainan Guo,Junnan Ma,Ruiqi Wang,Yu Zhou

出处

期刊：Applied Soft Computing [Elsevier BV]
日期：2023-03-22 卷期号：139: 110241-110241 被引量：5

标识

DOI：10.1016/j.asoc.2023.110241

摘要

In recent years, wrapper-based feature selection (FS) using evolutionary algorithms has been widely studied due to its ability to search for and evaluate subsets of features based on populations. However, these methods often suffer from a high computational cost and a long computation time, mainly due to the process of evaluating the feature subsets according to the classification performance. In order to tackle this problem, this paper presents a feature library-assisted surrogate model (FL-SM), which aims to reduce the computational cost but maintain a good prediction accuracy. Unlike the existing surrogate models used in FS, the proposed method focuses on the feature level instead of the sample level: an FL is built by collecting the scores of all the features during the evolutionary search. Specifically, each solution (subset candidate) is pre-evaluated based on the FL using only simple operations to decide whether or not it deserves to be evaluated by the classifier, improving the efficiency of the FS algorithm. Meanwhile, because not evaluating a certain number of solutions may lead to inaccurate solution selection during the evolutionary search, dynamic individual selection criteria are proposed. In addition, an adaptive FL update operator is proposed to handle the dynamics of the evolved population; it ensures the real-time validity of the FL. Furthermore, we incorporate the proposed FL-SM into some state-of-the-art single- and multi-objective evolutionary FS methods. The experimental results on benchmark datasets show that with good flexibility and extendibility, FL-SM can effectively reduce the computational cost of wrapper-based FS and still obtain high-quality feature subsets. Among the five algorithms tested, the average computation time reduction was 34.87%; at the same time, there was no significant difference in the classification accuracy for 80% of the tests, and our method even improved the classification accuracy for 6% of the tests.

求助该文献

最长约 10秒，即可获得该文献文件

Feature library-assisted surrogate model for evolutionary wrapper-based feature selection and classification

今日热心研友