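Information gain
Feature selection
Computer science
Evolutionary algorithm
Feature (linguistics)
Selection (genetic algorithm)
Artificial intelligence
Algorithm
Pattern recognition (psychology)
Machine learning
Data mining
Philosophy
Linguistics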
Authors
Baohang Zhang,Ziqian Wang,Haotian Li,Zhenyu Lei,Jiujun Cheng,Shangce Gao
Identifier
DOI:10.1016/j.ins.2024.120901
Abstract
Feature selection (FS) has garnered significant attention because of its pivotal role in enhancing the efficiency and effectiveness of various machine learning and data mining algorithms. Multiobjective feature selection (MOFS) algorithms, in turn, strive to balance multiple optimization objectives during the FS process, typically minimizing the number of selected features while maximizing classification performance. Nonetheless, managing the complexity of feature combinations presents a formidable challenge, particularly in high-dimensional datasets. Evolutionary algorithms (EAs) are increasingly adopted in MOFS owing to their strong global search capabilities and robustness. Despite these strengths, EAs face difficulties in navigating expansive solution spaces and in balancing exploration and exploitation. To address these challenges, this study introduces a novel information gain-based EA for MOFS, designated as IGEA. This approach uses a clustering method to select a diverse parent population, thereby enhancing individual variability and maintaining a high-quality population. Notably, IGEA employs information gain as a metric to evaluate the contribution of features to classification tasks. This metric informs crucial operations such as crossover and mutation. Moreover, the study extensively examines the actual solutions derived from IGEA, focusing on feature correlation and redundancy. This analysis shows how IGEA handles these aspects to refine MOFS. Experimental results on 23 widely used classification datasets confirm IGEA's superiority over five other state-of-the-art algorithms, demonstrating its enhanced effectiveness and efficiency in complex MOFS scenarios.
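To make the role of information gain concrete, the following is a minimal Python sketch and not the authors' implementation. It computes the standard information gain IG(Y; X) = H(Y) - H(Y | X) of a discrete feature with respect to class labels. The abstract does not specify how IGEA's crossover and mutation use this metric, so the `ig_biased_mutation` function, its `rate` parameter, and its weighting rule are hypothetical illustrations of how such a score could bias a bit-flip mutation on a feature mask.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature column."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # Conditional entropy: label entropy within each feature value,
    # weighted by how often that value occurs.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def ig_biased_mutation(mask, gains, rate=0.1, rng=random):
    """Hypothetical operator (an assumption, not taken from the paper).

    Unselected features are switched on with probability rate * w and
    selected features are switched off with probability rate * (1 - w),
    where w is the feature's gain relative to the largest gain.
    """
    top = max(gains) or 1.0  # avoid division by zero if all gains are 0
    child = list(mask)
    for i, g in enumerate(gains):
        w = g / top
        if child[i] == 0 and rng.random() < rate * w:
            child[i] = 1
        elif child[i] == 1 and rng.random() < rate * (1 - w):
            child[i] = 0
    return child
```

For example, with labels `[0, 0, 1, 1]`, a feature `[0, 0, 1, 1]` that separates the classes perfectly has an information gain of 1 bit, while a feature `[0, 1, 0, 1]` that is independent of the labels has a gain of 0. Under the assumed mutation rule, the first feature would tend to be added to a candidate subset and the second would tend to be dropped.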