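Transformer
Convolutional neural network
Computer science
Computation
Artificial intelligence
Feature extraction
Embedding
Upsampling
Transfer learning
Multi-task learning
Pattern recognition (psychology)
Deep learning
Engineering
Algorithm
Voltage
Task (project management)
Electrical engineering
Image (mathematics)
Systems engineering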
Authors
Wei Zhu, Hui Zhang, Chao Zhang, Xiaoyang Zhu, Zhen Guan, Jiale Jia
Identifier
DOI:10.1016/j.aei.2023.102061
Abstract
Detecting steel-surface defects is a crucial phase in steel manufacturing; however, completing the detection task accurately is challenging. The Swin Transformer, a self-attention-based model, has shown strong performance in computer vision. To improve the adaptability of the Swin Transformer to steel-surface defect detection, this study proposes a new network architecture called the LSwin Transformer. First, for the downsampling process, we propose a convolutional embedding module and an attention patch merging module, which together strengthen the connections between feature map channels, reduce the resolution, and increase image information retention. Second, we propose an effective window shift strategy and a convenient computation approach that give a defect spanning several patches more opportunities for interaction across those patches. Finally, to combine the feature extraction capability of convolutional neural networks with the global dependency building capability of the Swin Transformer, we propose a depth multilayer perceptron module. Extensive experiments were conducted on a steel-surface defect dataset. The results showed that our model outperformed competing methods in detection performance, with a mean average precision of 81.2%. In the ablation study, we verified the effectiveness of each module and initialized the model parameters through transfer learning to accelerate convergence. The proposed LSwin Transformer therefore has significant potential for detecting steel-surface defects.
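The abstract does not spell out the proposed window shift strategy, so the following is only a minimal sketch of the baseline idea it builds on: the cyclic shifted-window partition of the original Swin Transformer, not the LSwin variant. The function names `cyclic_shift` and `window_partition`, the 4x4 grid, and the window size of 2 are all illustrative assumptions.

```python
def cyclic_shift(grid, shift):
    """Roll a 2D grid up and left by `shift` cells, wrapping around the edges."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, window):
    """Split a 2D grid into non-overlapping window x window blocks.

    Blocks are returned in row-major order, each flattened to a list.
    """
    h, w = len(grid), len(grid[0])
    if h % window or w % window:
        raise ValueError("grid size must be divisible by the window size")
    windows = []
    for r0 in range(0, h, window):
        for c0 in range(0, w, window):
            windows.append([grid[r][c]
                            for r in range(r0, r0 + window)
                            for c in range(c0, c0 + window)])
    return windows


# Hypothetical 4x4 grid of patch indices 0..15 (an assumption for illustration).
grid = [[r * 4 + c for c in range(4)] for r in range(4)]

regular = window_partition(grid, 2)                    # plain partition
shifted = window_partition(cyclic_shift(grid, 1), 2)   # shifted partition
```

In this toy grid, patches 5, 6, 9, and 10 fall into four different windows under the regular partition but share one window after the shift, which is how a defect straddling a window boundary can get attended to jointly. The baseline Swin Transformer also masks attention between patches that are only adjacent because of the wrap-around; that masking is omitted here.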