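Feature selection
Cluster analysis
Pattern recognition (psychology)
Feature (linguistics)
Curse of dimensionality
Artificial intelligence
Computer science
Dimensionality reduction
Population
Minimum redundancy feature selection
Data mining
Philosophy
Linguistics
Demography
Sociology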
Authors
Peng Wang, Bing Xue, Jing Liang, Mengjie Zhang
Identifier
DOI:10.1016/j.patcog.2023.109523
Abstract
Modern data collection technologies may produce thousands of features, or even more, in a single dataset. The high dimensionality of such data makes it difficult to identify discriminating features because of the curse of dimensionality. Thanks to their global search ability, many population-based feature selection approaches have been proposed. However, very few studies consider that a feature selection task can have multiple optimal feature subsets. To search for multiple optimal feature subsets, we propose a feature clustering-assisted feature selection method. The proposed method uses correlation measures to group features, and this correlation knowledge is embedded into the encoding method and the search process. A niching-based mutation operator is also used to explore the vicinity of a target individual, with the aim of finding different feature subsets with very similar or identical classification performance. In addition, a modification operator is proposed to increase population diversity and thereby improve feature selection performance. Experiments on 16 datasets show that the proposed algorithm outperforms other popular feature selection methods in terms of classification accuracy and feature subset size.
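The abstract does not say which correlation measure or clustering procedure the method uses, so the following is only an illustrative sketch of the general idea of grouping correlated features and then drawing alternative subsets from the groups. It assumes absolute Pearson correlation, a greedy seed-based grouping, and a threshold of 0.8. The function names `group_features_by_correlation` and `sample_subset` are hypothetical and do not come from the paper.

```python
import numpy as np


def group_features_by_correlation(X, threshold=0.8):
    """Greedily group the columns of X (assumed scheme, not the paper's).

    A feature joins a group when its absolute Pearson correlation with the
    group's seed feature is at least `threshold` (an assumed value).
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    unassigned = list(range(X.shape[1]))
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        members = [seed] + [j for j in unassigned if corr[seed, j] >= threshold]
        unassigned = [j for j in unassigned if j not in members]
        groups.append(members)
    return groups


def sample_subset(groups, rng):
    """Draw one feature index from each group.

    Different draws give different subsets built from interchangeable
    (highly correlated) features.
    """
    return sorted(int(rng.choice(g)) for g in groups)


# Synthetic data: columns 0 and 1 are near-duplicates, as are columns 2 and 3;
# column 4 is independent of the rest.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.01 * rng.normal(size=200),
    base[:, 1],
    base[:, 1] + 0.01 * rng.normal(size=200),
    base[:, 2],
])

groups = group_features_by_correlation(X, threshold=0.8)
subset = sample_subset(groups, rng)
```

On this synthetic data the near-duplicate columns fall into the same group, so repeated calls to `sample_subset` yield different feature subsets that carry almost the same information. This is a toy analogue of the "multiple optimal feature subsets" the abstract refers to, not a reproduction of the proposed algorithm.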