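Feature selection
Computer science
Machine learning
Artificial intelligence
Causality (physics)
Feature (linguistics)
Preprocessor
Class (philosophy)
Selection (genetic algorithm)
Data mining
Data preprocessing
Linguistics
Quantum mechanics
Physics
Philosophy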
Authors
Kui Yu, Xianjie Guo, Lin Liu, Jiuyong Li, Hao Wang, Zhaolong Ling, Xindong Wu
Abstract
Feature selection is a crucial preprocessing step in data analytics and machine learning. Classical feature selection algorithms select features based on the correlations between predictive features and the class variable and do not attempt to capture causal relationships between them. Knowledge of the causal relationships between features and the class variable has been shown to offer potential benefits for building interpretable and robust prediction models, since causal relationships imply the underlying mechanism of a system. Consequently, causality-based feature selection has attracted increasing attention, and many algorithms have been proposed. In this article, we present a comprehensive review of recent advances in causality-based feature selection. To facilitate the development of new algorithms in this research area and to make it easy to compare new methods with existing ones, we develop the first open-source package, called CausalFS, which includes most of the representative causality-based feature selection algorithms (available at https://github.com/kuiy/CausalFS). Using CausalFS, we conduct extensive experiments to compare the representative algorithms on both synthetic and real-world datasets. Finally, we discuss some challenging problems to be tackled in future research.