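Oversampling
Computer science
Machine learning
Artificial intelligence
Class (philosophy)
Generative grammar
Generative adversarial network
Adversarial system
Join
Task (project management)
Data mining
Algorithm
Pattern recognition (psychology)
Deep learning
Economics
Management
Programming language
Bandwidth (computing)
Computer network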
Authors
Georgios Douzas, Fernando Bação
Identifier
DOI:10.1016/j.eswa.2017.09.030
Abstract
Learning from imbalanced datasets is a frequent but challenging task for standard classification algorithms. Although there are different strategies to address this problem, methods that generate artificial data for the minority class constitute a more general approach compared to algorithmic modifications. Standard oversampling methods are variations of the SMOTE algorithm, which generates synthetic samples along the line segment that joins minority class samples. Therefore, these approaches are based on local information, rather than on the overall minority class distribution. In contrast to these algorithms, this paper uses the conditional version of Generative Adversarial Networks (cGAN) to approximate the true data distribution and generate data for the minority class of various imbalanced datasets. The performance of cGAN is compared against multiple standard oversampling algorithms. We present empirical results that show a significant improvement in the quality of the generated data when cGAN is used as an oversampling algorithm.
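The SMOTE-style interpolation mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation and it does not cover the cGAN approach; the function name `smote_like_samples` and the parameters `k` and `seed` are hypothetical, chosen only for illustration. Each synthetic point is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbors, which is why the method relies on local information only.

```python
import numpy as np

def smote_like_samples(minority, n_new, k=3, seed=0):
    """Illustrative SMOTE-style oversampling (hypothetical helper).

    Each synthetic point lies on the segment joining a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    Requires at least two minority samples.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)  # cannot have more neighbors than other samples

    # Pairwise Euclidean distances; a sample is never its own neighbor.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbors = np.argsort(dists, axis=1)[:, :k]

    # Pick a base sample and one of its neighbors for every new point.
    base = rng.integers(0, n, size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]

    # Interpolate with a random gap in [0, 1) along the joining segment.
    gap = rng.random((n_new, 1))
    return minority[base] + gap * (minority[picked] - minority[base])

# Usage: oversample a toy minority class of four 2-D points.
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_like_samples(toy_minority, n_new=10)
```

Because every generated point is a convex combination of two existing minority samples, the synthetic data never leaves the region already spanned by the minority class.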