Journal: Intelligent Data Analysis [IOS Press]; Date: 2019-11-08; Volume/Issue: 23 (6): 1205-1217; Cited by: 9
Identifier
DOI:10.3233/ida-184354
Abstract
Ensemble learning is an effective approach to imbalanced classification. However, existing ensemble methods often ignore noise in the dataset, which may reduce the accuracy of the classifier. In this paper, we propose a density-based undersampling algorithm (DBU) and integrate it with AdaBoost (DBUBoost) to improve classification performance. The major contribution of this paper is an undersampling strategy that deals with both noise and class imbalance. We first divide the examples from each class into three categories: useful examples, noise, and potentially useful examples. We then introduce a similarity coefficient to distinguish the examples in each category. Through a selection mechanism based on the similarity coefficients, we retain the useful examples and remove the noisy ones. To demonstrate its effectiveness, we compare DBUBoost with four ensemble methods and three anti-noise methods. The experiments were conducted on 9 KEEL datasets and their noise-modified versions. The experimental results show that DBUBoost performs better than the other state-of-the-art methods.
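The three-way split described in the abstract can be illustrated with a small sketch. The abstract does not define the similarity coefficient or the selection thresholds, so everything specific below is an assumption made for illustration: the coefficient is taken to be the fraction of an example's k nearest neighbours that share its label, and the names `similarity`, `categorize`, `drop_noise`, `low`, and `high` are hypothetical. This is not the paper's DBU algorithm, only one plausible reading of "density-based" selection.

```python
import math

def similarity(i, X, y, k=3):
    """ASSUMED stand-in for the similarity coefficient: the fraction of the
    k nearest neighbours of X[i] that carry the same label as X[i]."""
    dists = sorted((math.dist(X[i], X[j]), j) for j in range(len(X)) if j != i)
    neighbours = [j for _, j in dists[:k]]
    return sum(y[j] == y[i] for j in neighbours) / len(neighbours)

def categorize(X, y, k=3, low=0.2, high=0.8):
    """Label every example as 'useful', 'noise', or 'potentially useful'.
    The thresholds low/high are hypothetical, not taken from the paper."""
    categories = []
    for i in range(len(X)):
        s = similarity(i, X, y, k)
        if s >= high:
            categories.append("useful")
        elif s <= low:
            categories.append("noise")
        else:
            categories.append("potentially useful")
    return categories

def drop_noise(X, y, categories):
    """Keep every example that was not flagged as noise."""
    kept = [i for i, c in enumerate(categories) if c != "noise"]
    return [X[i] for i in kept], [y[i] for i in kept]

# Toy data: two clusters, plus one class-1 point placed inside the class-0 cluster.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6), (0.5, 0.5)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]

categories = categorize(X, y)
X_clean, y_clean = drop_noise(X, y, categories)
```

In this toy set the mislabeled-looking point at (0.5, 0.5) has no same-class neighbours and is flagged as noise, while its presence lowers the score of the surrounding class-0 points enough to make them "potentially useful". How the paper treats that middle category, and how the filtered data is passed to AdaBoost, is not stated in the abstract, so the sketch simply keeps those examples.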