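Computer science
Feature selection
Interval (graph theory)
Feature (linguistics)
Selection (genetic algorithm)
Interval data
Statistics
Pattern recognition (psychology)
Data mining
Mathematics
Artificial intelligence
Combinatorics
Linguistics
Philosophy
Measure (data warehouse)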
Authors
Xiaobo Qi, Jinyu Song, Hui Qi, Ying Shi
Source
Journal: IEEE Access
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume: 12, pp. 53752-53766
Citations: 2
Identifier
DOI:10.1109/access.2024.3387978
Abstract
Feature selection for interval-valued data (IVD) aims to identify representative features from a large feature set, which can reduce model complexity, shorten training time, and improve the generalization ability of the model. To address inter-feature correlations in IVD, we propose a feature selection method called the maximum information coefficient for interval-valued data (IVD_MIC). First, the method balances the relationship between the midpoint and the radius of IVD with an adjustment factor, constructing a unified representation frame (URF) for interval-valued data. Based on the URF, the method measures the degree of correlation between two features by calculating the maximum information coefficient, and obtains the maximum information coefficient matrix for IVD. Then, strongly correlated features are progressively removed from three perspectives (row, column, and both row and column), generating a series of corresponding candidate feature subsets. Finally, IVD_MIC is validated on the candidate feature subsets to obtain the final classification accuracy and the optimal feature subset. Experimental results on synthetic and real-world datasets with different classifiers demonstrate that the overall performance of IVD_MIC surpasses that of other methods. The average accuracy of IVD_MIC improves on the second-best method by 0.23%, 0.53%, and 0.45% on LIBSVM, CART Tree, and KNN, respectively.
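The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the URF is taken to be a convex combination of midpoint and radius weighted by a hypothetical parameter `alpha`; `approx_mic` is a crude equal-frequency-grid approximation of the maximum information coefficient (the full MIC also optimizes the grid boundaries); and `candidate_subsets` is a simple greedy stand-in for the paper's row, column, and row-and-column removal strategies, whose details the abstract does not give. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def unified_representation(column: Sequence[Interval], alpha: float = 0.5) -> List[float]:
    """Map each interval to one scalar.

    ASSUMPTION: alpha * midpoint + (1 - alpha) * radius; the abstract only says
    an adjustment factor balances midpoint and radius.
    """
    return [alpha * (lo + hi) / 2 + (1 - alpha) * (hi - lo) / 2 for lo, hi in column]


def _equal_freq_bins(values: Sequence[float], k: int) -> List[int]:
    """Assign each value to one of k equal-frequency bins by rank (ties split arbitrarily)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * k // n
    return bins


def _mutual_information(bx: Sequence[int], by: Sequence[int]) -> float:
    """Mutual information (in nats) of two discretized variables."""
    n = len(bx)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(c / n * math.log(c * n / (px[a] * py[b])) for (a, b), c in joint.items())


def approx_mic(x: Sequence[float], y: Sequence[float]) -> float:
    """Crude MIC approximation: best normalized MI over equal-frequency grids
    with nx * ny <= n ** 0.6 cells (grid boundaries are NOT optimized)."""
    budget = max(4, int(len(x) ** 0.6))
    best = 0.0
    for nx in range(2, budget // 2 + 1):
        bx = _equal_freq_bins(x, nx)
        for ny in range(2, budget // nx + 1):
            mi = _mutual_information(bx, _equal_freq_bins(y, ny))
            best = max(best, mi / math.log(min(nx, ny)))
    return best


def mic_matrix(features: Sequence[Sequence[Interval]], alpha: float = 0.5) -> List[List[float]]:
    """Symmetric matrix of approximate MIC scores between URF-mapped features."""
    scalars = [unified_representation(col, alpha) for col in features]
    p = len(scalars)
    m = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            m[i][j] = m[j][i] = approx_mic(scalars[i], scalars[j])
    return m


def candidate_subsets(matrix: Sequence[Sequence[float]]) -> List[List[int]]:
    """Greedy stand-in (ASSUMPTION): repeatedly drop the feature whose summed
    score against the remaining features is largest, recording each subset."""
    remaining = list(range(len(matrix)))
    subsets = [list(remaining)]
    while len(remaining) > 1:
        worst = max(remaining, key=lambda i: sum(matrix[i][j] for j in remaining if j != i))
        remaining.remove(worst)
        subsets.append(list(remaining))
    return subsets


if __name__ == "__main__":
    # Toy data: feature 1 is a rescaled copy of feature 0; feature 2 is scrambled.
    f0 = [(float(i), i + 1.0) for i in range(20)]
    f1 = [(2.0 * i, 2.0 * i + 2.0) for i in range(20)]
    f2 = [(float(7 * i % 20), 7 * i % 20 + 1.0) for i in range(20)]
    for subset in candidate_subsets(mic_matrix([f0, f1, f2])):
        print(subset)
```

In the method as described, each candidate subset would then be scored with a classifier (the abstract mentions LIBSVM, CART Tree, and KNN) and the most accurate subset kept; that validation step is omitted here. For real use, an established MIC implementation would be preferable to the grid approximation above.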