The use of generative adversarial networks to alleviate class imbalance in tabular data: a survey

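Computer science; Popularity; Classifier (UML); Machine learning; Class (philosophy); Adversarial system; Artificial intelligence; Generative grammar; Generative adversarial network; Baseline (sea); Variety (cybernetics); Data science; Data mining; Deep learning; Oceanography; Geology; Social psychology; Psychology
Authors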
Rick Sauber-Cole,Taghi M. Khoshgoftaar
Source
Journal: Journal of Big Data [Springer Nature]
Volume (issue): 9 (1) Citations: 33
Identifier
DOI:10.1186/s40537-022-00648-6
Abstract

The existence of class imbalance in a dataset can greatly bias the classifier towards majority classification. This discrepancy can pose a serious problem for deep learning models, which require copious and diverse amounts of data to learn patterns and output classifications. Traditionally, data-level and algorithm-level techniques have been instrumental in mitigating the adverse effect of class imbalance. With the recent development and proliferation of Generative Adversarial Networks (GANs), researchers across a variety of disciplines have adapted the architecture of GANs and implemented them on imbalanced datasets to generate instances of the underrepresented class(es). Though the bulk of research has been centered on the application of this methodology in computer vision tasks, GANs are likewise being appropriated for use in tabular data, or data consisting of rows and columns with traditional structured data types. In this survey paper, we assess the methodology and efficacy of these modifications on tabular datasets, across domains such as network traffic classification and financial transactions, over the past seven years. We examine what methodologies and experimental factors have resulted in the greatest machine learning efficacy, as well as the research works and frameworks that have proven most influential in the development of the application of GANs in tabular data settings. Specifically, we note the prevalence of the CGAN architecture, the optimality of novel methods with CNN learners and minority-class sensitive measures such as F1 score, the popularity of SMOTE as a baseline technique, and the improved performance in the year-over-year use of GANs in imbalanced tabular datasets.
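The abstract's point about minority-class sensitive measures can be illustrated with a minimal sketch. It is not taken from the survey: the label counts, the two prediction lists, and the helper names `accuracy` and `minority_f1` are all hypothetical, chosen only to show how plain accuracy can look strong on an imbalanced dataset while F1 on the minority class exposes a classifier that never predicts that class.

```python
# Hypothetical illustration (not from the survey): accuracy versus
# minority-class F1 on an imbalanced binary label set.

def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def minority_f1(y_true, y_pred, positive=1):
    """F1 score computed for the `positive` (minority) class only."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Assumed toy data: 95 majority-class (0) and 5 minority-class (1) labels.
y_true = [0] * 95 + [1] * 5

# A classifier biased entirely towards the majority class.
majority_only = [0] * 100

# A classifier that finds 4 of the 5 minority instances at the cost of
# 2 false positives among the majority instances.
balanced = [1] * 2 + [0] * 93 + [1] * 4 + [0] * 1

print("majority-only:", accuracy(y_true, majority_only), minority_f1(y_true, majority_only))
print("balanced:     ", accuracy(y_true, balanced), minority_f1(y_true, balanced))
```

In this toy setup the majority-only classifier reaches 95% accuracy with a minority-class F1 of 0, which is the kind of gap that motivates reporting F1 alongside or instead of accuracy when evaluating oversampling methods such as SMOTE or GAN-based generators.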