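Computer science
Pascal (unit)
Upsampling
Artificial intelligence
Object detection
Data mining
Pattern recognition (psychology)
Image (mathematics)
Programming language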
Authors
Shengye Wang, Zhong Qu, Cui‐Jin Li, Le-yuan Gao
Identifier
DOI:10.1016/j.engappai.2022.105504
Abstract
Improving the accuracy and speed of small-object and multi-object detection is an active research problem in traffic environments. Despite substantial advances in object detection algorithms based on deep neural networks, the inaccuracy and low efficiency of small-object and multi-object detection remain challenging. In this paper, we propose a bidirectional attention network called BANet, which includes multichannel attention (MCA) blocks, an alpha-effective intersection-over-union (α-EIoU) loss, and a multiple attention fusion (MAF) module. Each MCA block combines low-layer, medium-layer, and high-layer features to provide rich base information for feature fusion in the neck module. We introduce MAF to alleviate the loss of spatial location information and the poor semantic performance that result from the continuous downsampling in the path aggregation feature pyramid network (PAFPNet). Finally, α-EIoU serves as our regression loss, which measures the difference between the predicted box and the ground truth (gt) box. Our study further demonstrates that these strategies yield significant improvements in performance over some existing YOLO detectors. Compared with YOLOX, BANet achieves a 0.39%–0.52% improvement in mAP on the PASCAL VOC 2007 (VOC 07) dataset and a 0.55%–2.93% improvement in mAP on the PASCAL VOC 2012 (VOC 12) dataset. Additionally, a 0.3%–1.01% improvement in mAP is achieved on the MS COCO 2017 (COCO 17) dataset, indicating that BANet has a significant effect on multi-object detection. Experiments with approximately the same number of parameters as YOLOX show that our strategy not only improves speed by 7.5 frames per second (FPS) but also reduces the average forward time by 0.97 ms.
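The abstract does not give the formula for the α-EIoU regression loss. As a rough illustration only, the sketch below assumes one common formulation: the EIoU loss (an IoU term plus normalized centre-distance, width, and height penalties) with each term raised to a power α, in the style of α-IoU losses. The function name `alpha_eiou_loss`, the default `alpha=3.0`, and the `(x1, y1, x2, y2)` box layout are assumptions made for this example and may differ from what BANet actually uses.

```python
def alpha_eiou_loss(pred, gt, alpha=3.0, eps=1e-9):
    """Hypothetical alpha-EIoU loss for two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    Assumed form: 1 - IoU^a + center^a + width^a + height^a,
    where each penalty is normalized by the enclosing box.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection-over-union of the predicted and ground-truth boxes.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter + eps
    iou = inter / union

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)

    # Squared centre distance, normalized by the squared enclosing diagonal.
    dx = (px1 + px2 - gx1 - gx2) / 2.0
    dy = (py1 + py2 - gy1 - gy2) / 2.0
    center = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Squared width and height differences, normalized by the enclosing box.
    width = (pw - gw) ** 2 / (cw * cw + eps)
    height = (ph - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou ** alpha + center ** alpha + width ** alpha + height ** alpha


if __name__ == "__main__":
    print(alpha_eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Under this assumed form, setting `alpha=1.0` recovers the plain EIoU loss, and larger values of α change how strongly each term is weighted as the predicted box approaches the ground truth box.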