计算机科学
特征选择
冗余(工程)
算法
二进制数
人工智能
模式识别(心理学)
特征(语言学)
相互信息
排名(信息检索)
数据挖掘
数学
语言学
算术
操作系统
哲学
作者
Zohre Sadeghian,Ebrahim Akbari,Hossein Nematzadeh
标识
DOI:10.1016/j.engappai.2020.104079
摘要
Feature selection is the problem of finding the optimal subset of features for predicting class labels by removing irrelevant or redundant features. S-shaped Binary Butterfly Optimization Algorithm (S-bBOA) is a nature-inspired algorithm for solving the feature selection problems. The evidence shows that S-bBOA has a better performance in exploration, exploitation, convergence, and avoidance of getting stuck in local optimal compared to other optimization algorithms. However, S-bBOA does not consider redundancy and relevancy of features. This paper proposes Information Gain binary Butterfly Optimization Algorithm (IG-bBOA), to overcome the S-bBOA constraints firstly. IG-bBOA maximizes both the classification accuracy and the mean of the mutual information between features and class labels. In addition, IG-bBOA also tries to minimize the number of selected features and is used within a three-phase proposed method called Ensemble Information Theory based binary Butterfly Optimization Algorithm (EIT-bBOA). In the first phase, 80% of irrelevant and redundant features are removed using Minimal Redundancy-Maximal New Classification Information (MR-MNCI) feature selection. In the second phase, the best feature subset is selected using IG-bBOA. Finally, a similarity based ranking method is used to select the final features subset. The experimental results are conducted using six standard datasets from UCI repository. The findings confirm the efficiency of the proposed method in improving the classification accuracy and selecting the best optimal features subset with minimum number of feature in most cases.
科研通智能强力驱动
Strongly Powered by AbleSci AI