Classification of Imbalanced Data Using SMOTE and AutoEncoder Based Deep Convolutional Neural Network

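Keywords: Artificial intelligence, Computer science, Autoencoder, Oversampling, Convolutional neural network, Deep learning, Pattern recognition (psychology), Machine learning, Preprocessor, Data preprocessing, Data set, Classifier (UML), Data mining, Bandwidth (computing), Computer network
Authors
Suja A. Alex, J. Jesu Vedha Nayahi
Source
Journal: International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems [World Scientific]
Volume (issue): 31 (03): 437-469 Citations: 9
Identifier
DOI:10.1142/s0218488523500228
Abstract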

Imbalanced data classification is a challenging problem in many domains, including intelligent medical diagnosis and fraudulent transaction analysis. The performance of conventional classifiers degrades when the class distribution of the training data set is imbalanced. Recently, machine learning and deep learning techniques have been used for imbalanced data classification. Data preprocessing approaches are also suitable for handling the class imbalance problem, and data augmentation is one preprocessing technique used to handle skewed class distributions. The Synthetic Minority Oversampling Technique (SMOTE) is a promising class balancing approach, but it generates noise while creating synthetic samples. In this paper, an AutoEncoder is used as a noise reduction technique to reduce the noise generated by SMOTE, and a deep one-dimensional Convolutional Neural Network is then used for classification. The performance of the proposed method is evaluated and compared with existing approaches using Precision, Recall, Accuracy, Area Under the Curve and Geometric Mean (G-mean). Ten data sets are used in the experiments, with imbalance ratios ranging from 1.17 to 577.87 and sizes ranging from 303 to 284,807 instances: Heart-Disease, Mammography, Pima Indian diabetes, Adult, Oil-Spill, Phoneme, Creditcard, BankNoteAuthentication, Balance scale weight & distance database and Yeast. The proposed method achieves accuracies of 96.1%, 96.5%, 87.7%, 87.3%, 95%, 92.4%, 98.4%, 86.1%, 94% and 95.9% on these data sets, respectively. The results suggest that this method outperforms other deep learning and machine learning methods with respect to G-mean and other performance metrics.
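As a rough illustration of three ideas the abstract relies on (the imbalance ratio, SMOTE-style oversampling, and the G-mean metric), the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the choice of k, the Euclidean distance, the binary-label form of G-mean, and the definition of imbalance ratio as majority count divided by minority count are all assumptions made for illustration. The AutoEncoder denoising step and the one-dimensional CNN classifier are not reproduced here.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """Assumed definition: largest class count divided by smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the core idea of
    SMOTE). `minority` is a list of equal-length numeric tuples with at
    least two entries."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def g_mean(y_true, y_pred, positive=1):
    """Binary G-mean: sqrt(sensitivity * specificity).
    Assumes both classes occur in y_true (otherwise a division by zero)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)
```

Because every synthetic point lies on a line segment between two real minority samples, a segment that crosses a region dominated by the majority class can yield a misleading sample, which is one plausible reading of the "noise" the abstract attributes to SMOTE. G-mean is useful alongside accuracy because a classifier that predicts only the majority class has zero sensitivity and therefore a G-mean of zero, however high its accuracy.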