Authors
Zhong Qu,Le-yuan Gao,Shengye Wang,Haonan Yin,Tuming Yi
Identifier
DOI:10.1016/j.imavis.2022.104518
Abstract
SSD and YOLOv5 are representative one-stage object detection algorithms. This paper proposes an improved one-stage object detector based on YOLOv5, named the Multi-scale Feature Cross-layer Fusion Network (M-FCFN). First, we extract shallow and deep features from the PANet structure, fuse them across layers, and obtain an output feature scale different from the standard 80 × 80, 40 × 40, and 20 × 20 scales. Then, following the single-shot multi-box detector (SSD), we reduce the dimension of the cross-layer-fused features and use them as another prediction output. Two entirely new feature scales are therefore added as outputs. Features at different scales are necessary for detecting objects of different sizes, so the added scales increase the probability of detecting an object and significantly improve detection accuracy. Finally, to improve the Autoanchor mechanism of YOLOv5, we propose an EIOU-based k-means anchor calculation. We compare our method against the S, M, L, and X model structures of YOLOv5. Missed and false detections of large objects are reduced, yielding better detection results. Experimental results show that our method achieves 89.1% and 67.8% mAP@0.5 on the PASCAL VOC and MS COCO datasets, respectively. Compared with YOLOv5_S, our method improves mAP@[0.5:0.95] by 4.4% on PASCAL VOC and 1.4% on MS COCO. Compared with the four YOLOv5 models, our method detects large objects more accurately. Notably, its large-scale mAP@[0.5:0.95] on MS COCO is 5.4% higher than that of YOLOv5_S.
• We propose the Multi-scale Feature Cross-layer Fusion Network (M-FCFN).
• Two entirely new feature scales are added as outputs.
• We propose an EIOU-based k-means Autoanchor calculation.
• Missed and false detections of large objects are reduced.
• Our method's large-scale mAP@[0.5:0.95] is 5.4% higher than YOLOv5_S's.
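To make the cross-layer fusion idea concrete, below is a minimal PyTorch sketch: a shallow (high-resolution) and a deep (low-resolution) PANet feature map are resampled to a shared intermediate resolution and fused into one extra output scale. This is an illustration under assumptions, not the paper's implementation: the module name, channel counts, nearest-neighbour resampling, the 1×1/3×3 convolution choices, and the 10 × 10 target scale are all hypothetical.

```python
# Hypothetical sketch of cross-layer fusion producing an extra feature scale.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossLayerFusion(nn.Module):
    """Fuse a shallow and a deep feature map into one new scale.

    `out_size` is the new spatial resolution (something other than the
    stock 80x80 / 40x40 / 20x20 heads); the exact value is an assumption.
    """

    def __init__(self, shallow_ch: int, deep_ch: int, out_ch: int, out_size: int):
        super().__init__()
        self.out_size = out_size
        # 1x1 convolutions align the channel counts before fusion.
        self.reduce_shallow = nn.Conv2d(shallow_ch, out_ch, kernel_size=1)
        self.reduce_deep = nn.Conv2d(deep_ch, out_ch, kernel_size=1)
        # 3x3 convolution smooths the concatenated features into one head.
        self.fuse = nn.Conv2d(2 * out_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        size = (self.out_size, self.out_size)
        # Resample both maps to the shared intermediate resolution.
        s = F.interpolate(self.reduce_shallow(shallow), size=size, mode="nearest")
        d = F.interpolate(self.reduce_deep(deep), size=size, mode="nearest")
        return self.fuse(torch.cat([s, d], dim=1))

# Example: fuse an 80x80 shallow map with a 20x20 deep map into a 10x10 head.
shallow = torch.randn(1, 128, 80, 80)
deep = torch.randn(1, 512, 20, 20)
extra_head = CrossLayerFusion(128, 512, 256, out_size=10)(shallow, deep)
print(extra_head.shape)  # torch.Size([1, 256, 10, 10])
```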
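The EIOU k-means anchor calculation can likewise be sketched. The sketch below assumes the standard anchor-clustering setup, where boxes are compared by width and height only with both centered at the origin, so the EIOU center-distance term vanishes and the distance reduces to 1 − IoU plus the width and height difference penalties of EIOU. Function names, the k-means initialization, and the median cluster update are assumptions, not details from the paper.

```python
# Hypothetical EIOU-based k-means over (width, height) box shapes.
import numpy as np

def eiou_distance(boxes: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """1 - EIOU between every (w, h) box and every (w, h) anchor.

    With boxes centered at the origin, the EIOU center-distance term is
    zero, leaving IoU minus the width/height difference penalties.
    """
    w1, h1 = boxes[:, None, 0], boxes[:, None, 1]      # shape (N, 1)
    w2, h2 = anchors[None, :, 0], anchors[None, :, 1]  # shape (1, K)
    inter = np.minimum(w1, w2) * np.minimum(h1, h2)
    iou = inter / (w1 * h1 + w2 * h2 - inter)
    cw, ch = np.maximum(w1, w2), np.maximum(h1, h2)    # enclosing box sides
    eiou = iou - (w1 - w2) ** 2 / cw ** 2 - (h1 - h2) ** 2 / ch ** 2
    return 1.0 - eiou

def eiou_kmeans(boxes: np.ndarray, k: int = 9, iters: int = 100, seed: int = 0):
    """k-means over (w, h) pairs using 1 - EIOU as the distance."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    assign = np.full(len(boxes), -1)
    for _ in range(iters):
        new_assign = eiou_distance(boxes, anchors).argmin(axis=1)
        if (new_assign == assign).all():
            break  # assignments stable: converged
        assign = new_assign
        for j in range(k):
            if (assign == j).any():
                anchors[j] = np.median(boxes[assign == j], axis=0)
    return anchors[np.argsort(anchors.prod(axis=1))]  # sorted by area

# Example with random box shapes standing in for a dataset's labels:
boxes = np.abs(np.random.default_rng(1).normal(100, 40, size=(500, 2))) + 1
print(eiou_kmeans(boxes, k=9))
```

Relative to the plain IoU k-means used by YOLOv5's Autoanchor, the extra EIOU penalty terms separate boxes whose IoU is similar but whose aspect ratios differ, which is one plausible reason such a metric would produce anchors better matched to large objects.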