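Multicollinearity
Feature selection
Redundancy (engineering)
Computer science
Data mining
Ranking (information retrieval)
Relevance (law)
Pattern recognition (psychology)
Artificial intelligence
Feature (linguistics)
Selection (genetic algorithm)
Machine learning
Projection (relational algebra)
Regression analysis
Algorithm
Philosophy
Operating system
Linguistics
Political science
Law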
Authors
Azlyna Senawi, Hua‐Liang Wei, S.A. Billings
Identifier
DOI:10.1016/j.patcog.2017.01.026
Abstract
A substantial number of datasets stored for various applications are high dimensional, with redundant and irrelevant features. Processing and analysing data under such circumstances is time-consuming and makes it difficult to obtain efficient predictive models. There is a strong need to carry out analyses for high dimensional data in some lower dimensions, and one approach to achieve this is through feature selection. This paper presents a new relevancy-redundancy approach, called the maximum relevance–minimum multicollinearity (MRmMC) method, for feature selection and ranking, which can overcome some shortcomings of existing criteria. In the proposed method, relevant features are measured by correlation characteristics based on conditional variance, while redundancy elimination is achieved according to multiple correlation assessment using an orthogonal projection scheme. A series of experiments was conducted on eight datasets from the UCI Machine Learning Repository, and the results show that the proposed method performed reasonably well for feature subset selection.
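The abstract gives no formulas, so the following is only a rough sketch of a relevance-redundancy ranking in the same spirit, not the authors' MRmMC algorithm. It assumes that relevance is the correlation ratio (the share of a feature's variance explained by its class-conditional means), that redundancy is the squared multiple correlation of a candidate with the features already chosen, computed through a Gram-Schmidt orthogonal projection, and that the two are combined as a simple difference. The function names `correlation_ratio` and `rank_features`, and the difference score itself, are illustrative assumptions.

```python
import numpy as np


def correlation_ratio(x, y):
    """Eta squared: fraction of Var(x) explained by the class means of x given y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    total = np.sum((x - x.mean()) ** 2)
    if total == 0.0:
        return 0.0  # a constant feature carries no information
    between = sum(
        np.sum(y == c) * (x[y == c].mean() - x.mean()) ** 2
        for c in np.unique(y)
    )
    return float(between / total)


def rank_features(X, y, k):
    """Greedy ranking by (relevance - redundancy); an assumed, illustrative score."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)              # centre every column
    n_features = X.shape[1]
    norms2 = np.sum(Xc ** 2, axis=0)     # squared norm of each centred column
    relevance = np.array(
        [correlation_ratio(X[:, j], y) for j in range(n_features)]
    )
    # explained[j] accumulates the squared multiple correlation between
    # feature j and the span of the features selected so far.
    explained = np.zeros(n_features)
    basis = []                           # orthonormal basis of selected features
    remaining = list(range(n_features))
    selected = []
    for _ in range(min(k, n_features)):
        best = max(remaining, key=lambda j: relevance[j] - explained[j])
        selected.append(best)
        remaining.remove(best)
        # Gram-Schmidt: keep only the part of the new feature that is
        # orthogonal to the features already selected.
        q = Xc[:, best].copy()
        for b in basis:
            q -= (b @ q) * b
        norm = np.linalg.norm(q)
        if norm > 1e-10:
            q /= norm
            basis.append(q)
            for j in remaining:
                if norms2[j] > 0.0:
                    explained[j] += (q @ Xc[:, j]) ** 2 / norms2[j]
    return selected


if __name__ == "__main__":
    y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    x0 = np.array([0.1, 0.2, 0.0, 0.3, 1.1, 0.9, 1.0, 1.2])
    x1 = 2.0 * x0 + 1.0                  # exact linear copy of x0 (redundant)
    x2 = np.array([1.0, 0.0, 1.0, 0.0, 2.0, 1.0, 2.0, 1.0])
    print(rank_features(np.column_stack([x0, x1, x2]), y, k=3))
```

In this toy example `x1` is an exact linear copy of `x0`, so once either of them has been selected the other is fully explained by the projection and drops to the last position, behind the weaker but non-redundant `x2`.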