Jie Wang, Guoqiang Li, GuanWen Qiu, Gang Ma, Jinwen Xi, Nana Yu
Source
Journal: IEEE Transactions on Intelligent Transportation Systems [Institute of Electrical and Electronics Engineers] | Date: 2024-04-24 | Volume/Issue: 25 (7): 8042-8052 | Citations: 1
Identifier
DOI: 10.1109/tits.2024.3387949
Abstract
Visual-based methods for rail surface defect inspection (RSDI) effectively overcome the limitations of manual inspection, as they can intuitively display the locations and segmented areas of sensitive defects. The RGB-D RSDI task, which leverages the complementarity between RGB and depth (D) image information to enhance detection performance, has attracted widespread attention and achieved significant development. However, existing methods primarily depend on fully supervised training strategies that require a substantial number of manually annotated pixel-level labels to supervise model training. Such extensive manual annotation is exceedingly time-consuming and labor-intensive, particularly given the irregular shapes and textures of surface defects on rails, which further compounds the labeling burden. Therefore, in this paper, we aim to introduce the semi-supervised learning paradigm into this task. For the semi-supervised RGB-D RSDI task, a task-specific semi-supervised network and an effective cross-modal fusion module are crucial to ensuring detection performance under the constraint of limited labeled samples. Thus, we propose a Depth-assisted Semi-Supervised RGB-D RSDI network (DSSNet) to simultaneously alleviate the annotation burden and achieve satisfactory detection performance. Specifically, adhering to the consistency training paradigm, we construct a semi-supervised RGB-D RSDI architecture for this task by optimizing structures, perturbation mechanisms, loss settings, etc. Furthermore, we propose a Depth-assisted Multi-scale Cross-modal Fusion Module (DMCFM) that conducts multi-scale exploration and cross-modal complementary fusion with the assistance of depth. Comprehensive experiments demonstrate that, compared with 14 recent state-of-the-art fully supervised methods, the proposed DSSNet achieves highly competitive results while reducing the annotation burden by 80%.
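The abstract does not provide implementation details, so the following is only a minimal sketch of the generic consistency-training objective it refers to, not the authors' DSSNet or DMCFM. The placeholder backbone (SegNetRGBD), the Gaussian-noise perturbation, the MSE consistency term, and the weight lam are all illustrative assumptions.

```python
# Minimal sketch of semi-supervised consistency training for RGB-D defect
# segmentation. All module names and hyperparameters are illustrative
# assumptions; SegNetRGBD is a stand-in, not the paper's DSSNet.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SegNetRGBD(nn.Module):
    """Placeholder RGB-D segmentation net: naive early fusion by concatenation."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(16, num_classes, 1)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        x = torch.cat([rgb, depth], dim=1)  # RGB (3 ch) + depth (1 ch)
        return self.head(self.encoder(x))


def semi_supervised_step(model, labeled, unlabeled, lam: float = 1.0):
    """One step: supervised cross-entropy on labeled pairs plus a consistency
    term that penalizes disagreement between clean and perturbed predictions
    on unlabeled pairs."""
    rgb_l, depth_l, mask_l = labeled
    rgb_u, depth_u = unlabeled

    # Supervised loss on the small labeled subset (e.g., ~20% of the data).
    logits_l = model(rgb_l, depth_l)
    loss_sup = F.cross_entropy(logits_l, mask_l)

    # Consistency loss: the clean prediction serves as a pseudo-target for the
    # prediction under an input perturbation (Gaussian noise here, as a
    # stand-in for the paper's unspecified perturbation mechanisms).
    with torch.no_grad():
        target_u = model(rgb_u, depth_u).softmax(dim=1)
    logits_u = model(rgb_u + 0.1 * torch.randn_like(rgb_u), depth_u)
    loss_cons = F.mse_loss(logits_u.softmax(dim=1), target_u)

    return loss_sup + lam * loss_cons


if __name__ == "__main__":
    model = SegNetRGBD(num_classes=2)
    labeled = (torch.rand(2, 3, 64, 64), torch.rand(2, 1, 64, 64),
               torch.randint(0, 2, (2, 64, 64)))
    unlabeled = (torch.rand(4, 3, 64, 64), torch.rand(4, 1, 64, 64))
    loss = semi_supervised_step(model, labeled, unlabeled)
    loss.backward()
    print(f"total loss: {loss.item():.4f}")
```

In this generic setup, only the supervised term needs pixel-level labels; the consistency term exploits the unlabeled majority of the data, which is how such methods can trade roughly 80% of the annotation effort for a modest consistency-regularization cost.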