Imbalanced data enhancement method based on improved DCGAN and its application

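Keywords: Discriminator; Artificial intelligence; Computer science; Pattern recognition (psychology); Overfitting; Theory (learning stability); Convolution (computer science); Convolutional neural network; Normalization (sociology); Sample (material); Artificial neural network; Mathematics; Machine learning; Telecommunications; Detector; Sociology; Chromatography; Chemistry; Anthropology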

Authors

Lijun Zhang,Lixiang Duan,Xiaocui Hong,Xiangyu Liu,Xinyun Zhang

Source

Journal: Journal of Intelligent and Fuzzy Systems [IOS Press]
Date: 2021-09-15; Volume (Issue): 41 (2), pp. 3485-3498; Citations: 9

Abstract

Machinery operates normally most of the time, so far fewer samples are collected in a fault state (minority samples) than in a normal state, resulting in sample imbalance. Common machine learning algorithms such as deep neural networks require a significant amount of training data to avoid overfitting. When the input samples are imbalanced, these models often fail to detect minority samples, which leads to missed diagnoses of equipment faults. Although the Deep Convolution Generative Adversarial Network (DCGAN) is an effective method for enhancing minority samples, it does not fundamentally address the instability of Generative Adversarial Network (GAN) training. This study proposes an improved DCGAN model with better training stability and sample balance, aimed at higher classification accuracy on minority samples. First, spectral normalization is applied to each convolutional layer to improve the stability of the DCGAN discriminator. Then, the improved DCGAN model is trained until the Nash equilibrium is reached, at which point it generates new samples that differ from the original samples but follow a similar distribution. Four indices, namely Inception Score (IS), Fréchet Inception Distance (FID), Peak Signal to Noise Ratio (PSNR), and Structural Similarity (SSIM), are used to quantitatively evaluate the generated images. Finally, the Balance Degree of Samples (BDS) index is proposed, and the new samples are added to the original samples in proportion to improve sample balance. This yields several groups of datasets with different balance degrees, which are classified using Convolutional Neural Network (CNN) models. In experiments on a reciprocating compressor, the variance of lost data is found to be less than 1% of the original value, indicating that the improved model is more stable than the unmodified model in generating diverse, high-quality sample images. The classification accuracy exceeds 95% and tends to remain stable when the balance degree of samples is greater than 80%.
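
The abstract does not describe how spectral normalization is implemented, so the following is only a minimal sketch of the general technique, not the authors' code. It assumes that a convolutional kernel has already been reshaped into a 2D weight matrix (one row per output channel), estimates the largest singular value by power iteration, and divides the weights by it. All function names here (`spectral_norm`, `spectrally_normalize`, and the helpers) are hypothetical and chosen for illustration.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def transpose(W):
    """Return the transpose of W as a list of rows."""
    return [list(col) for col in zip(*W)]


def l2_normalize(v, eps=1e-12):
    """Scale v to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    Wt = transpose(W)
    # Random starting vector with one entry per row of W.
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in W])
    for _ in range(n_iters):
        v = l2_normalize(matvec(Wt, u))
        u = l2_normalize(matvec(W, v))
    # sigma is approximated by u^T W v.
    return sum(a * b for a, b in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50):
    """Divide W by its estimated spectral norm so the result has norm ~1."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]


# Toy weight matrix (an assumption for illustration): singular values 3 and 1.
W = [[3.0, 0.0], [0.0, 1.0]]
sigma = spectral_norm(W)
W_sn = spectrally_normalize(W)
```

Bounding the largest singular value of each layer limits how sharply the discriminator's output can change with its input, which is the usual motivation for using this normalization to stabilize GAN training. In practice, deep learning frameworks offer built-in versions (for example `torch.nn.utils.spectral_norm` in PyTorch) that run a small number of power-iteration steps per training update rather than iterating to convergence as this sketch does.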
