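Feature selection
Overfitting
Curse of dimensionality
Computer science
Data mining
Redundancy (engineering)
Dimensionality reduction
Particle swarm optimization
Feature (linguistics)
Data redundancy
Metabolomics
Artificial intelligence
Clustering high-dimensional data
Pattern recognition (psychology)
Machine learning
Bioinformatics
Biology
Artificial neural network
Cluster analysis
Operating system
Linguistics
Philosophy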
Authors
Mengting Zhang,Jianqiang Du,Bin Nie,Jigen Luo,Ming Liu,Yang Yuan
Source
Journal: PeerJ
[PeerJ]
Date: 2024-05-31
Volume/Issue: 10: e2073-e2073
Identifier
DOI: 10.7717/peerj-cs.2073
Abstract
Metabolomics data are typical high-dimensional small-sample (HDSS) data: they combine high-dimensional features with a small sample size. Excessive dimensionality leads to the curse of dimensionality, and a very small sample size tends to cause overfitting, both of which hinder deeper mining of metabolomics data. Feature selection is a valuable technique for handling the challenges that HDSS data pose. For the feature selection problem of HDSS data in metabolomics, a hybrid method (MCMOPSO) that combines Max-Relevance and Min-Redundancy (mRMR) with multi-objective particle swarm optimization is proposed. Experimental results on metabolomics data and several University of California, Irvine (UCI) public datasets show that MCMOPSO selects small subsets of high-quality features by efficiently eliminating irrelevant and redundant features. MCMOPSO is therefore an effective approach for selecting features from high-dimensional metabolomics data with limited sample sizes.
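
To make the mRMR idea named in the abstract concrete, the following is a minimal sketch of a generic greedy mRMR filter on discrete features. It is not the authors' MCMOPSO and omits the multi-objective particle swarm stage entirely. It assumes the common "difference" form of the criterion (relevance minus mean redundancy, both measured by empirical mutual information). The function names `mutual_information` and `mrmr_select` and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    # Sum of p(x, y) * log(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def mrmr_select(columns, labels, k):
    """Greedily pick up to k feature indices by relevance minus mean redundancy."""
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    remaining = list(range(len(columns)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(columns[j], columns[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed for illustration): the label is the AND of features a and b.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
a_copy = list(a)                      # redundant duplicate of a
noise = [0, 1, 0, 1, 0, 1, 0, 1]      # independent of the label
labels = [x & y for x, y in zip(a, b)]

selected = mrmr_select([a, a_copy, b, noise], labels, k=2)
print(sorted(selected))  # the duplicate (index 1) and noise (index 3) are skipped
```

In this toy case the duplicate column is as relevant as `a` but is penalized for redundancy once `a` is chosen, so the complementary feature `b` is preferred. On real metabolomics data the continuous intensities would first need discretization or a continuous mutual-information estimator, which this sketch leaves out.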