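Undersampling
Computer science
Classifier (UML)
Noise (video)
Artificial intelligence
Pattern recognition (psychology)
Ensemble learning
Similarity (geometry)
AdaBoost
Machine learning
Class (philosophy)
Multiclass classification
Data mining
Algorithm
Support vector machine
Image (mathematics)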
Authors
Yun Hou,Li Li,Bailin Li,Jiajia Liu
Source
Journal: Intelligent Data Analysis
[IOS Press]
Date: 2019-11-08
Volume (issue): pages: 23 (6): 1205-1217
Citations: 9
Abstract
Ensemble learning is an excellent method for imbalanced classification. However, existing ensemble methods often ignore noise in the dataset, which may reduce the accuracy of the classifier. In this paper, we propose a density-based undersampling algorithm (DBU) and integrate it with AdaBoost (DBUBoost) to improve classification performance. The major contribution of this paper is an undersampling strategy that deals with both noise and class imbalance. We first divide the examples from each class into three categories: useful examples, noise, and potentially useful examples. We then introduce a similarity coefficient to distinguish the examples in each category. Through a selection mechanism based on similarity coefficients, we retain the useful examples and remove the noisy examples. To demonstrate its effectiveness, we compare DBUBoost with four ensemble methods and three anti-noise methods. The experiments were conducted on 9 KEEL datasets and their noise-modified versions. Experimental results show that DBUBoost performs better than the other state-of-the-art methods.
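
The abstract does not define the similarity coefficient or the selection mechanism, so the following is only a rough sketch of the general idea, not the authors' DBU algorithm. It assumes a stand-in coefficient (the fraction of an example's k nearest neighbours that share its label), a hypothetical threshold `low` below which an example is treated as noise, and a simple rule that keeps the highest-scoring majority examples until the classes are balanced. The function names `similarity_coefficients` and `density_undersample` are illustrative.

```python
import numpy as np

def similarity_coefficients(X, y, k=5):
    """Assumed stand-in: share of each example's k nearest neighbours with the same label."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances; an example is never its own neighbour.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    return (y[neighbours] == y[:, None]).mean(axis=1)

def density_undersample(X, y, majority_label, k=5, low=0.2):
    """Drop suspected noise from every class, then trim the majority class.

    `low` is a hypothetical noise threshold, not a value from the paper.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    coeffs = similarity_coefficients(X, y, k)
    keep = coeffs > low  # examples at or below `low` are treated as noise
    majority = np.flatnonzero(keep & (y == majority_label))
    minority = np.flatnonzero(keep & (y != majority_label))
    # Assumed selection rule: prefer majority examples with the highest coefficient.
    ranked = majority[np.argsort(-coeffs[majority], kind="stable")]
    chosen = np.sort(np.concatenate([ranked[: len(minority)], minority]))
    return X[chosen], y[chosen]
```

In a boosting setup along the lines the abstract describes, a resampling step of this kind would presumably run before each weak learner is trained; how DBUBoost actually couples the two is not stated here.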