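Computer science, Feature selection, Metaheuristic, Initialization, Pattern recognition (psychology), Ranking (information retrieval), Data mining, Feature (linguistics), Artificial intelligence, Algorithm, Linguistics, Philosophy, Programming language
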
Authors

Salima Ouadfel, Mohamed Abd Elaziz

Identifier

DOI: 10.1016/j.eswa.2021.115882

Abstract

Feature selection (FS) is an important task in any classification process and aims to select the smallest number of features that still yields high classification accuracy. FS can be formulated as an NP-hard combinatorial problem, for which robust metaheuristics are used as efficient wrapper-based FS approaches. However, when applied to high-dimensional datasets with a large number of features and few samples, the effectiveness of such wrapper metaheuristics degrades and their computational cost increases. To tackle this problem, we propose in this paper a hybrid FS approach based on the ReliefF filter method and the novel Equilibrium Optimizer (EO) metaheuristic. The proposed method, called RBEO-LS, is composed of two phases. In the first phase, the ReliefF algorithm is used as a preprocessing step to assign weights to the features; these weights estimate each feature's relevance to the classification task. In the second phase, the binary EO (BEO) is used as a wrapper search approach. The features are ranked according to their weights, and this ranking is used to initialize the BEO population. We embed a local search strategy in the BEO to improve its performance: guided by the feature ranking and the Pearson correlation coefficient, it adds relevant features to the feature subset and removes redundant ones. The performance of the developed algorithm has been evaluated on sixteen UCI datasets and ten high-dimensional biological datasets. The UCI datasets contain a high number of samples and a small or medium number of features. The biological datasets contain a high number of features and few samples. The results demonstrate that the ReliefF algorithm and the local search strategy both improve the performance of the EO algorithm. The results also show the superiority of RBEO-LS over other state-of-the-art approaches.
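
The abstract gives no formulas for the ranking-guided initialization or for the correlation-guided local search, so the following Python sketch only illustrates the general idea and should not be read as the authors' implementation. It assumes that a vector of per-feature relevance weights (standing in for ReliefF output) is already available. The function names `init_population` and `drop_redundant`, the linear rank-to-probability mapping with `p_min` and `p_max`, and the correlation `threshold` of 0.9 are all hypothetical choices made for illustration.

```python
import numpy as np


def init_population(weights, pop_size, rng, p_min=0.1, p_max=0.9):
    """Sample binary feature masks biased toward highly ranked features.

    Assumption: selection probability falls linearly with rank, from
    p_max (best-ranked feature) to p_min (worst-ranked feature).
    """
    weights = np.asarray(weights, dtype=float)
    n = weights.size
    # ranks[i] is the rank of feature i; rank 0 is the most relevant.
    ranks = np.empty(n, dtype=int)
    ranks[np.argsort(-weights)] = np.arange(n)
    if n > 1:
        probs = p_max - (p_max - p_min) * ranks / (n - 1)
    else:
        probs = np.full(n, p_max)
    # Each row is one candidate solution; 1 means the feature is selected.
    return (rng.random((pop_size, n)) < probs).astype(int)


def drop_redundant(X, mask, weights, threshold=0.9):
    """Deselect the lower-weighted feature of each highly correlated pair.

    Assumption: two features count as redundant when the absolute value
    of their Pearson correlation is at least `threshold`.
    """
    weights = np.asarray(weights, dtype=float)
    mask = np.array(mask, dtype=int)  # work on a copy
    selected = np.flatnonzero(mask)
    if selected.size < 2:
        return mask
    # Pairwise absolute Pearson correlations between the selected columns.
    corr = np.abs(np.corrcoef(X[:, selected], rowvar=False))
    pos = {f: i for i, f in enumerate(selected)}
    # Visit selected features from most to least relevant.
    order = selected[np.argsort(-weights[selected])]
    kept = []
    for f in order:
        if all(corr[pos[f], pos[k]] < threshold for k in kept):
            kept.append(f)
        else:
            mask[f] = 0
    return mask
```

In a full wrapper, each mask would additionally be scored by training a classifier on the selected columns, and a local search would also try adding highly ranked features that are not yet selected; both steps are omitted here. Note that a constant column yields an undefined correlation, which this sketch treats as redundant.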
         
            
 
                 
                
                    