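Feature selection
Overfitting
Computer science
Artificial intelligence
Machine learning
Feature (linguistics)
Classifier (UML)
Genetic algorithm
Selection (genetic algorithm)
Random forest
Minimum redundancy feature selection
Outlier
Pattern recognition (psychology)
Artificial neural network
Philosophy
Linguistics
Authors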
Abu Bakar Siddique,Muhammad Abu Bakar,Raja Hashim Ali,Usama Arshad,Nisar Ali,Zain ul Abideen,Talha Ali Khan,Ali Zeeshan Ijaz,Muhammad Imad
Identifier
DOI:10.1109/icit59216.2023.10335842
Abstract
Feature selection is a critical factor affecting the performance of optimization algorithms. Without proper feature selection, optimization algorithms may suffer from slow convergence, overfitting, increased computational requirements, and longer execution times. On the other hand, omitting important features can lead to loss of relevant information, decreased accuracy, bias, and increased vulnerability to noise and outliers. This study investigates the use of genetic algorithms as a feature selection technique for a classification problem, specifically the mushroom classification problem. Random forest is employed as the machine learning classifier, and genetic algorithms are compared with correlation-based selection as the feature selection method. The results show that genetic algorithms achieve higher accuracy, precision, recall, and F1-score than correlation-based feature selection. However, genetic algorithms have limitations: they are applicable only to specific optimization problems, they require proper parameter setup, and they take longer to converge. Despite these drawbacks, genetic algorithms prove to be superior to other feature selection techniques, particularly correlation-based approaches. This study highlights the importance of selecting appropriate feature selection techniques for optimization algorithms to improve their performance and achieve better results. In addition, this study explores the performance of various machine learning approaches on the complete mushroom dataset with 22 features and shows that feature selection with genetic algorithms is the most accurate method.
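The abstract does not describe the genetic algorithm's encoding, operators, or parameters, so the following is only a minimal sketch of how genetic-algorithm feature selection is commonly set up, not the authors' implementation. Each individual is a tuple of 0/1 flags, one per feature, and the search uses tournament selection, single-point crossover, bit-flip mutation, and elitism. The function name `ga_select`, all parameter defaults, and the stand-in `toy_fitness` (with its made-up `INFORMATIVE` indices) are assumptions introduced for illustration.

```python
import random


def ga_select(n_features, fitness, pop_size=30, generations=40,
              crossover_rate=0.9, mutation_rate=0.05, seed=0):
    """Search for a feature mask (tuple of 0/1 flags) that maximizes `fitness`."""
    rng = random.Random(seed)

    def tournament(scored, k=3):
        # Return the fittest mask among k randomly drawn individuals.
        return max(rng.sample(scored, k), key=lambda pair: pair[0])[1]

    # Random initial population of feature masks.
    population = [tuple(rng.randint(0, 1) for _ in range(n_features))
                  for _ in range(pop_size)]
    best_score, best_mask = float("-inf"), None

    for _ in range(generations):
        scored = [(fitness(mask), mask) for mask in population]
        gen_best = max(scored, key=lambda pair: pair[0])
        if gen_best[0] > best_score:
            best_score, best_mask = gen_best

        next_population = [gen_best[1]]  # elitism: keep the best mask
        while len(next_population) < pop_size:
            a, b = tournament(scored), tournament(scored)
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # single-point crossover
                a = a[:cut] + b[cut:]
            # Bit-flip mutation: each flag flips with probability mutation_rate.
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in a)
            next_population.append(child)
        population = next_population

    return best_mask, best_score


# Hypothetical stand-in fitness: rewards four made-up "informative" features
# and penalizes subset size. It is NOT derived from the mushroom dataset.
INFORMATIVE = {0, 4, 7, 19}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


mask, score = ga_select(22, toy_fitness)
selected = [i for i, bit in enumerate(mask) if bit]
print("selected feature indices:", selected)
print("fitness:", round(score, 2))
```

In a setting like the one the abstract describes, `toy_fitness` would be replaced by a classifier-based score on the selected columns, for example the mean of scikit-learn's `cross_val_score` for a `RandomForestClassifier`. Because that evaluation is repeated for every individual in every generation, it also illustrates why the abstract lists parameter setup and convergence time as drawbacks.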