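Kernel (algebra)
FLOPS
Convolution (computer science)
Computer science
Feature extraction
Feature (linguistics)
Artificial intelligence
Block (permutation group theory)
Pattern recognition (psychology)
Scale (ratio)
Pixel
Object (grammar)
Algorithm
Artificial neural network
Mathematics
Parallel computing
Philosophy
Physics
Quantum mechanics
Linguistics
Geometry
Combinatorics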
Authors
Haijian Zhao,Haijiang Zhu
Identifier
DOI:10.23919/ccc58697.2023.10240650
Abstract
Most existing object detection systems adopt 3×3 convolution kernels for feature extraction, so the receptive field of the feature extraction network is always 3×3. As a result, the network features are not rich enough, and features whose pixel size is not 3×3 are not learned accurately. To solve this problem, convolutions with different kernel sizes are introduced for feature extraction. However, a large convolution kernel may lead to a rapid increase in parameters and FLOPs. In this paper, we propose an object detection network based on depth-wise convolution and multi-scale feature fusion (YOLODM-Net). Specifically, we construct a feature extraction module named the multi-scale feature fusion (MSFF) block, which uses depth-wise convolutions of different kernel sizes to extract features and mixes them to enrich the learned content. In addition, we propose a multi-scale spatial attention module based on the Efficient Channel Attention (ECA) module, in which multi-scale information is added to make the extracted features more fine-grained. The proposed method was evaluated on the VOC2007 dataset and compared with previous methods. The mAP of the model is better than that of YOLOv7, YOLOX, and others, and the parameters and FLOPs are also improved.
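The trade-off described in the abstract (larger kernels inflate parameters and FLOPs, while depth-wise convolution keeps them low) can be illustrated with a small arithmetic sketch. This is not the authors' code: the helper names `conv_params` and `conv_macs`, the channel count of 64, and the kernel sizes 3, 5, 7 are assumptions chosen for illustration, and the cost is counted in multiply-accumulate operations, without bias terms.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a 2D convolution with a k x k kernel (no bias)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel sees only c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def conv_macs(c_in, c_out, k, h_out, w_out, groups=1):
    """Multiply-accumulate count for one forward pass over an h_out x w_out output map."""
    return conv_params(c_in, c_out, k, groups) * h_out * w_out


if __name__ == "__main__":
    channels = 64  # hypothetical channel width, not taken from the paper
    for k in (3, 5, 7):  # hypothetical kernel sizes
        standard = conv_params(channels, channels, k)
        # Depth-wise convolution: one k x k filter per channel (groups == channels).
        depthwise = conv_params(channels, channels, k, groups=channels)
        print(f"k={k}: standard={standard}, depth-wise={depthwise}")
```

Under these assumptions a standard 7×7 layer needs 49/9 times the weights of a 3×3 layer, while the depth-wise variant at any kernel size needs 1/64 of the weights of its standard counterpart, which is why mixing several depth-wise kernel sizes can stay affordable.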