An Ensemble Learning Approach with Gradient Resampling for Class-Imbalance Problems

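Resampling; Boosting (machine learning); Computer science; Machine learning; Artificial intelligence; Ensemble learning; Sampling (signal processing); Class (philosophy); Sample (material); Set (abstract data type); Filter (signal processing); Algorithm; Computer vision; Chromatography; Chemistry; Programming language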
Authors
Hongke Zhao, Chuang Zhao, Xi Zhang, Nanlin Liu, Hengshu Zhu, Qi Liu, Hui Xiong
Source
Journal: INFORMS Journal on Computing  Volume (Issue): 35 (4): 747-763  Citations: 9
Identifier
DOI: 10.1287/ijoc.2023.1274
Abstract

Imbalanced classification arises in many real-world applications and has been extensively studied. Most existing algorithms alleviate the imbalance by sampling or by guiding ensemble learners with penalties. Combining ensemble learning with class-level sampling strategies has achieved great progress. However, certain hard examples contribute little to model learning and can even degrade performance. From the perspective of identifying the classification difficulty of samples, one important motivation is to design algorithms that treat different samples differently through progressive learning. Unfortunately, how best to configure the sampling and learning strategies under ensemble principles at the sample level remains a research gap. In this paper, we take a sample-level view rather than the class-level view of existing studies. We design an ensemble approach pipelined with sample-level gradient resampling, namely balanced cascade with filters (BCWF). As a preliminary exploration, we first design a hard-example mining algorithm to explore the gradient distribution of the classification difficulty of samples and to identify the hard examples. Specifically, BCWF uses an under-sampling strategy in a boosting manner to train T predictive classifiers and reidentify hard examples. Moreover, we design two types of filters in BCWF: the first variant is assembled with a hard filter (BCWF_h), whereas the second is assembled with a soft filter (BCWF_s). In each round of boosting, BCWF_h strictly removes a gradient/set of the hardest examples from both classes, whereas BCWF_s removes a larger number of harder and easy examples simultaneously for final balanced-class retention. The T trained classifiers can then be combined with either of two ensemble voting strategies: average probability or majority vote.
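The training loop with a hard filter and the two voting strategies described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' released implementation: the difficulty score |y - p|, the toy base learner `fit_midpoint`, the `drop_frac` parameter, and all function names are hypothetical choices made for this example. Only the hard-filter variant is sketched, because the abstract does not give enough detail to reconstruct the soft filter.

```python
import math
import random
from typing import Callable, List, Sequence

# Hypothetical model type: any callable mapping one feature value to P(y = 1).
Model = Callable[[float], float]


def fit_midpoint(xs: Sequence[float], ys: Sequence[int]) -> Model:
    """Toy 1-D base learner (a stand-in, not from the paper): a sigmoid
    centred halfway between the two class means."""
    pos = [x for x, t in zip(xs, ys) if t == 1]
    neg = [x for x, t in zip(xs, ys) if t == 0]
    mean1, mean0 = sum(pos) / len(pos), sum(neg) / len(neg)
    mid = (mean0 + mean1) / 2
    sign = 1.0 if mean1 >= mean0 else -1.0
    return lambda x: 1.0 / (1.0 + math.exp(-sign * (x - mid)))


def hard_filter(indices: Sequence[int], scores: Sequence[float],
                drop_frac: float) -> List[int]:
    """Drop the drop_frac share of `indices` with the highest difficulty."""
    ranked = sorted(indices, key=lambda i: scores[i])  # easiest first
    n_drop = int(len(ranked) * drop_frac)
    return ranked[: len(ranked) - n_drop]


def cascade_with_hard_filter(X, y, fit, rounds, drop_frac, seed=0) -> List[Model]:
    """Train up to `rounds` models on balanced under-samples, filtering the
    hardest examples out of both class pools after every round."""
    rng = random.Random(seed)
    pos = [i for i, t in enumerate(y) if t == 1]
    neg = [i for i, t in enumerate(y) if t == 0]
    models: List[Model] = []
    for _ in range(rounds):
        k = min(len(pos), len(neg))
        if k == 0:  # one pool is exhausted, so stop early
            break
        batch = rng.sample(pos, k) + rng.sample(neg, k)  # balanced under-sample
        models.append(fit([X[i] for i in batch], [y[i] for i in batch]))
        # Assumed difficulty score: |y - p|, with p the current ensemble average.
        scores = [abs(y[i] - sum(m(X[i]) for m in models) / len(models))
                  for i in range(len(y))]
        pos = hard_filter(pos, scores, drop_frac)
        neg = hard_filter(neg, scores, drop_frac)
    return models


def average_probability_vote(probas, threshold=0.5):
    """probas[t][i] is model t's P(y = 1) for sample i; average, then threshold."""
    n_models = len(probas)
    return [int(sum(col) / n_models >= threshold) for col in zip(*probas)]


def majority_vote(probas, threshold=0.5):
    """Threshold each model's probability first, then take the majority label."""
    n_models = len(probas)
    return [int(2 * sum(p >= threshold for p in col) > n_models)
            for col in zip(*probas)]


# Toy imbalanced data: the last negative is atypical and lies beyond the positives.
X = [0.0, 0.2, 0.4, 0.5, 0.6, 0.8, 3.0, 3.2, 5.0]
y = [0, 0, 0, 0, 0, 0, 1, 1, 0]
models = cascade_with_hard_filter(X, y, fit_midpoint, rounds=3, drop_frac=0.2)
probas = [[m(x) for x in (0.1, 3.1)] for m in models]
print(average_probability_vote(probas), majority_vote(probas))

# The two strategies can disagree when one model is confident and the rest are not.
toy = [[0.9, 0.4], [0.45, 0.4], [0.45, 0.9]]
print(average_probability_vote(toy))  # [1, 1]
print(majority_vote(toy))             # [0, 0]
```

In the last example, a single confident classifier pulls the averaged probability above the threshold, while majority voting counts it as only one vote out of three. Which behaviour is preferable depends on how well calibrated the individual classifiers are.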
To evaluate the proposed approach, we conduct intensive experiments on 10 benchmark data sets and apply our algorithms to default user detection on a real-world peer-to-peer lending data set. The experimental results demonstrate the effectiveness and the managerial implications of our approach when compared with 11 competitive algorithms.

History: Accepted by Ram Ramesh, Area Editor for Data Science & Machine Learning.

Funding: This work was supported by the National Natural Science Foundation of China [Grants 72101176, 71722005, and 72241432], the National Key R&D Program of China [Grant 2020YFA0908600], and the Natural Science Foundation of Tianjin City [Grant 18JCJQJC45900].

Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2023.1274 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2021.0104 ) at ( http://dx.doi.org/10.5281/zenodo.6360996 ).
