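Pyramid (geometry)
Feature (linguistics)
Artificial intelligence
Computer science
Object (grammar)
Object detection
Pattern recognition (psychology)
Computer vision
Mathematics
Linguistics
Geometry
Philosophy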
Authors
Honggui Han,Qiyu Zhang,Fangyu Li,Yongping Du
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume/issue and pages: 1-15
Identifier
DOI:10.1109/tnnls.2024.3387282
Abstract
Feature pyramids are widely adopted in visual detection models to capture multiscale features of objects. In practical object detection tasks, however, feature pyramids are prone to interference from complex backgrounds, which results in suboptimal capture of discriminative multiscale foreground semantic features. This article proposes a foreground capture feature pyramid network (FCFPN) for multiscale object detection to address inadequate feature learning in complex backgrounds. FCFPN consists of a foreground dual attention (FDA) module and a pathway aggregation (PA) structure. Specifically, the FDA mechanism activates top-down foreground channel responses and lateral spatial foreground location features, so that channel and spatial foreground features are adequately captured. The PA module then adaptively learns the fusion weights of multiscale features at different levels of the feature pyramid, which enhances the complementarity of semantic information between different levels of the foreground feature maps. Because the fusion weights are learned adaptively for the different pyramid levels, the detection model retains the gained information of feature sizes and suppresses the conflicting information. Evaluations on public datasets and a self-built complex-background dataset show that the detection average precision (AP) and the feature learning performance of the proposed method are superior to those of other FPNs, which demonstrates the effectiveness of the proposed FCFPN.
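The abstract does not give the PA module's equations, so the following is only a minimal sketch of the general idea of level-wise weighted fusion, not the authors' implementation. It assumes one scalar weight per pyramid level, softmax normalization of those weights, nearest-neighbor upsampling to the finest resolution, and square single-channel maps stored as nested lists. The names `softmax`, `upsample_nearest`, and `fuse_levels` are hypothetical, and the logits are fixed by hand here, whereas a real detector would learn them during training.

```python
import math


def softmax(logits):
    """Normalize raw scores into positive weights that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def upsample_nearest(fmap, size):
    """Resize a square 2D map (list of lists) to size x size by nearest neighbor."""
    src = len(fmap)
    return [
        [fmap[r * src // size][c * src // size] for c in range(size)]
        for r in range(size)
    ]


def fuse_levels(levels, logits):
    """Weighted sum of pyramid levels after resizing all to the finest level.

    Assumption: one scalar weight per level; the paper may use a different
    weighting scheme (for example per-channel or per-location weights).
    """
    size = max(len(f) for f in levels)
    weights = softmax(logits)
    resized = [upsample_nearest(f, size) for f in levels]
    return [
        [sum(w * f[r][c] for w, f in zip(weights, resized)) for c in range(size)]
        for r in range(size)
    ]


# Toy pyramid: a 4x4, a 2x2, and a 1x1 map with constant values.
p3 = [[1.0] * 4 for _ in range(4)]
p4 = [[2.0, 2.0], [2.0, 2.0]]
p5 = [[4.0]]

# Equal logits give equal weights, so every fused cell is the plain mean.
fused = fuse_levels([p3, p4, p5], logits=[0.0, 0.0, 0.0])
```

Raising one level's logit shifts the fused map toward that level, which is one simple way a model could emphasize informative levels and down-weight conflicting ones.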