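Feature selection
Computer science
Cluster analysis
Redundancy (engineering)
Minimum redundancy feature selection
Feature (linguistics)
Streaming data
Artificial intelligence
Data mining
Benchmark (surveying)
Pattern recognition (psychology)
Feature vector
Selection (genetic algorithm)
Machine learning
Operating system
Philosophy
Geography
Linguistics
Geodesy
Identifier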
DOI:10.1109/icnsc58704.2023.10319011
Abstract
Streaming feature selection (SFS) handles features that enter the feature space as a stream, and it plays an important role in processing high-dimensional data. Online streaming feature selection (OSFS) is a real-time streaming feature selection method. Most existing OSFS methods rely on the relevance between features and labels and do not account for missing label information, which frequently causes feature selection to fail. Moreover, in many practical application scenarios such as intelligent medical treatment, label information is missing because it is costly to acquire, which poses a great challenge to feature selection. To solve these problems, a novel unsupervised online streaming feature selection algorithm with density peak clustering (UOSFS-DPC) is proposed, which reduces feature redundancy by clustering online streaming features. Online unsupervised relevance analysis and online unsupervised redundancy analysis are also adopted to select important feature subsets from the streaming features. Experimental results on six benchmark datasets and comparisons with eight representative SFS algorithms demonstrate that UOSFS-DPC outperforms its peers when label information is unknown.
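
The abstract does not give the details of UOSFS-DPC, so the following is only a minimal sketch of the generic density peak clustering idea applied to features rather than samples. It is not the authors' algorithm. Several choices are assumptions made for illustration: the function name `density_peak_feature_centers`, the feature-to-feature distance `1 - |Pearson correlation|`, the cutoff-based density, and the batch (non-streaming) setting. Each feature gets a local density `rho` (how many other features lie within the cutoff) and a separation `delta` (distance to the nearest feature of higher density); features with a large `rho * delta` are taken as cluster centers, and the remaining features in each cluster could then be treated as redundant.

```python
import numpy as np


def density_peak_feature_centers(X, n_centers=2, cutoff=0.5):
    """Pick representative features via a density-peak heuristic.

    X is an (n_samples, n_features) array with no constant columns.
    Returns a sorted list of feature indices chosen as cluster centers.
    """
    # Assumed distance between features: 1 - |Pearson correlation|.
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)
    n_features = dist.shape[0]

    # Local density: number of *other* features closer than the cutoff.
    rho = (dist < cutoff).sum(axis=1) - 1

    # Visit features from highest to lowest density; a stable sort makes
    # ties deterministic (earlier index counts as "denser").
    order = np.argsort(-rho, kind="stable")

    # delta: distance to the nearest feature that ranks higher in density.
    delta = np.empty(n_features)
    delta[order[0]] = dist[order[0]].max()  # convention for the densest one
    for rank in range(1, n_features):
        i = order[rank]
        delta[i] = dist[i, order[:rank]].min()

    # Centers are dense AND far from any denser feature.
    gamma = rho * delta
    centers = np.argsort(-gamma, kind="stable")[:n_centers]
    return sorted(int(c) for c in centers)


if __name__ == "__main__":
    # Synthetic data: two groups of three near-duplicate features.
    rng = np.random.default_rng(0)
    a = rng.normal(size=(500, 1))
    b = rng.normal(size=(500, 1))
    X = np.hstack([a + 0.05 * rng.normal(size=(500, 3)),
                   b + 0.05 * rng.normal(size=(500, 3))])
    print(density_peak_feature_centers(X, n_centers=2, cutoff=0.5))
```

In a streaming setting, features would arrive one at a time and the densities and centers would have to be updated incrementally; this sketch recomputes everything from a fixed matrix to keep the idea visible.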