Granular Ball Sampling for Noisy Label Classification or Imbalanced Classification
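Computer science
Pattern recognition (psychology)
Multi-label classification
Ball (mathematics)
Artificial intelligence
Mathematics
Mathematical analysis
Authors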
Shuyin Xia,Shaoyuan Zheng,Guoyin Wang,Xinbo Gao,Binggui Wang
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems [Institute of Electrical and Electronics Engineers] | Date: 2021-08-30 | Volume (issue): 34 (4): 2144-2155 | Citations: 76
This article presents a general sampling method for classification problems, called granular-ball sampling (GBS), which builds on the idea of granular computing. GBS covers the data space with a set of adaptively generated hyperballs, and the points on the hyperballs constitute the sampled data. GBS is the first sampling method that not only reduces the data size but also improves the data quality in noisy label classification. In addition, because GBS can describe the class boundary exactly, it achieves almost the same classification accuracy as training on the original datasets and clearly higher accuracy than random sampling. For data-reduction classification tasks, GBS is therefore a general method that is not restricted to any specific classifier or dataset. GBS can also be used effectively as an undersampling method for imbalanced classification. Its time complexity is close to $O(N)$, so it can accelerate most classifiers. These advantages make GBS a powerful tool for improving classifier performance. All code has been released in the open source GBS library at http://www.cquptshuyinxia.com/GBS.html .
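The idea of covering the data with adaptively generated hyperballs and keeping only points on them can be illustrated with a minimal sketch. This is not the authors' released implementation. Everything below is an assumption made for illustration: the function names (`purity`, `split_ball`, `granular_ball_sample`), the purity threshold, the minimum ball size of twice the feature dimension, the use of 2-means to split a ball, the mean distance to the center as the radius, and the rule that keeps, for each ball, the data points nearest to where the ball meets each coordinate axis through its center.

```python
import numpy as np


def purity(y):
    """Fraction of points in a ball that carry the majority label."""
    _, counts = np.unique(y, return_counts=True)
    return counts.max() / len(y)


def split_ball(X, rng):
    """Split one ball in two with a few 2-means iterations.

    Returns a boolean mask: True for points assigned to the first child.
    """
    start = rng.choice(len(X), size=2, replace=False)
    centers = X[start].astype(float)
    for _ in range(10):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        mask = dist[:, 0] <= dist[:, 1]
        if mask.all() or (~mask).all():
            break  # degenerate split (e.g. duplicate points); caller handles it
        centers = np.stack([X[mask].mean(axis=0), X[~mask].mean(axis=0)])
    return mask


def granular_ball_sample(X, y, purity_threshold=0.9, min_size=None, seed=0):
    """Illustrative granular-ball-style sampling (assumed rules, see lead-in).

    Returns the sampled features, their labels, and their row indices in X.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, dim = X.shape
    if min_size is None:
        min_size = 2 * dim  # assumed stopping size, not taken from the abstract
    rng = np.random.default_rng(seed)

    stack = [np.arange(n)]  # each entry holds the row indices of one ball
    selected = set()
    while stack:
        idx = stack.pop()
        # Keep splitting while the ball is large enough and not yet pure.
        if len(idx) > max(min_size, 2) and purity(y[idx]) < purity_threshold:
            mask = split_ball(X[idx], rng)
            if mask.any() and not mask.all():
                stack.append(idx[mask])
                stack.append(idx[~mask])
                continue
        # Final ball: center is the mean, radius is the mean distance to it.
        center = X[idx].mean(axis=0)
        radius = np.linalg.norm(X[idx] - center, axis=1).mean()
        # Keep the data point nearest to each axis-aligned point on the ball.
        for d in range(dim):
            for sign in (-1.0, 1.0):
                target = center.copy()
                target[d] += sign * radius
                nearest = np.argmin(np.linalg.norm(X[idx] - target, axis=1))
                selected.add(int(idx[nearest]))

    keep = np.array(sorted(selected))
    return X[keep], y[keep], keep
```

Under these assumptions each final ball contributes at most twice as many points as there are features, so the sample shrinks where a class region is pure and stays dense near mixed regions, where balls keep splitting. A call such as `Xs, ys, keep = granular_ball_sample(X, y)` returns the reduced training set, which can then be passed to any classifier.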