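Feature selection
Computer science
Categorical variable
Data mining
Feature (linguistics)
Filter (signal processing)
Artificial intelligence
Pattern recognition (psychology)
Selection (genetic algorithm)
Minimum redundancy feature selection
Principal component analysis
Dimensionality reduction
Machine learning
Philosophy
Linguistics
Computer vision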
Authors
Chih‐Wen Chen, Yi‐Hong Tsai, Fang‐Rong Chang, Wei‐Chao Lin
Abstract
Feature selection is a process aimed at filtering out unrepresentative features from a given dataset, usually allowing the later data mining and analysis steps to produce better results. However, different feature selection algorithms use different criteria to select representative features, making it difficult to find the best algorithm for different domain datasets. The limitations of single feature selection methods can be overcome by the application of ensemble methods, combining multiple feature selection results. In the literature, feature selection algorithms are classified as filter, wrapper, or embedded techniques. However, to the best of our knowledge, there has been no study focusing on combining these three types of techniques to produce ensemble feature selection. Therefore, the aim here is to answer the question as to which combination of different types of feature selection algorithms offers the best performance for different types of medical data including categorical, numerical, and mixed data types. The experimental results show that a combination of filter (i.e., principal component analysis) and wrapper (i.e., genetic algorithms) techniques by the union method is a better choice, providing relatively high classification accuracy and a reasonably good feature reduction rate.
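The union method named in the abstract can be illustrated with a minimal sketch. It assumes each selector has already returned a list of chosen feature indices; the function names, the example index lists, and the definition of the feature reduction rate (share of original features discarded) are hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Set


def union_ensemble(subsets: Iterable[Iterable[int]]) -> List[int]:
    """Keep a feature if at least one selector chose it."""
    combined: Set[int] = set()
    for subset in subsets:
        combined.update(subset)
    return sorted(combined)


def reduction_rate(n_total: int, n_selected: int) -> float:
    """Assumed definition: fraction of the original features that were dropped."""
    return 1.0 - n_selected / n_total


# Hypothetical selector outputs for a dataset with 10 features.
filter_selected = [0, 2, 5]      # stand-in for a filter technique's picks
wrapper_selected = [2, 3, 5, 7]  # stand-in for a wrapper technique's picks

union_features = union_ensemble([filter_selected, wrapper_selected])
union_rate = reduction_rate(10, len(union_features))
print(union_features, union_rate)
```

The union keeps more features than either selector alone, so it trades some feature reduction for a lower risk of discarding a feature that only one criterion considers representative.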