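Computer science
Genetic programming
Class (philosophy)
Selection (genetic algorithm)
Machine learning
Binary classification
Curse of dimensionality
Artificial intelligence
Tournament selection
Process (computing)
Genetic algorithm
Fitness function
Binary number
Clarity
Mathematical optimization
Data mining
Mathematics
Support vector machine
Biochemistry
Arithmetic
Operating system
Chemistry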
Authors
Wenbin Pei, Bing Xue, Lin Shang, Jun Zhang
Abstract
High-dimensional unbalanced classification is challenging because of the joint effects of high dimensionality and class imbalance. Genetic programming (GP) has potential for high-dimensional classification because of its built-in capability to select informative features. However, when data are not evenly distributed across classes, GP tends to evolve biased classifiers that achieve high accuracy on the majority class but low accuracy on the minority class. Unfortunately, the minority class is often at least as important as the majority class. It is therefore important to investigate how GP can be used effectively for high-dimensional unbalanced classification. In this article, to address the performance bias of GP, a new two-criterion fitness function is developed. It considers two criteria: an approximation of the area under the curve (AUC) and the classification clarity (i.e., how well a program can separate the two classes). The values obtained on the two criteria are combined in pairs rather than summed. Furthermore, this article designs a three-criterion tournament selection to identify and select good programs for the genetic operators to use when generating offspring during the evolutionary learning process. The experimental results show that the proposed method achieves better classification performance than the other compared methods.
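The abstract does not give formulas, so the following is only a minimal sketch of how a paired fitness and a multi-criterion tournament could look, not the authors' implementation. Several parts are assumptions: the AUC approximation uses the Wilcoxon-Mann-Whitney statistic, the `clarity` measure (gap between class means scaled by the output range) is hypothetical, the pair is compared lexicographically, and the third tournament criterion (smaller program size) is a guess, since the abstract does not name it.

```python
import random
from typing import List, Sequence, Tuple


def auc_wmw(min_out: Sequence[float], maj_out: Sequence[float]) -> float:
    """Wilcoxon-Mann-Whitney estimate of AUC (assumed approximation).

    Fraction of (minority, majority) output pairs ranked correctly;
    ties count as half a win.
    """
    wins = 0.0
    for p in min_out:
        for n in maj_out:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(min_out) * len(maj_out))


def clarity(min_out: Sequence[float], maj_out: Sequence[float]) -> float:
    """Hypothetical clarity measure in [0, 1].

    Gap between the class mean outputs, scaled by the overall output range.
    """
    all_out = list(min_out) + list(maj_out)
    span = max(all_out) - min(all_out)
    if span == 0.0:
        return 0.0  # constant program output: no separation at all
    gap = sum(min_out) / len(min_out) - sum(maj_out) / len(maj_out)
    return max(0.0, gap) / span


def fitness(min_out: Sequence[float], maj_out: Sequence[float]) -> Tuple[float, float]:
    """Keep the two criteria as a pair instead of summing them."""
    return (auc_wmw(min_out, maj_out), clarity(min_out, maj_out))


def tournament_select(
    fitnesses: List[Tuple[float, float]],
    sizes: List[int],
    k: int,
    rng: random.Random,
) -> int:
    """Return the index of the tournament winner among k sampled programs.

    Assumed ordering: higher AUC, then higher clarity, then smaller size.
    """
    contestants = rng.sample(range(len(fitnesses)), k)
    return max(
        contestants,
        key=lambda i: (fitnesses[i][0], fitnesses[i][1], -sizes[i]),
    )
```

Keeping the criteria as a tuple means a program with equal AUC but clearer class separation wins a comparison, which a weighted sum could obscure depending on the chosen weights.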