特征(语言学)
计算机科学
棱锥(几何)
人工智能
判别式
目标检测
特征提取
模式识别(心理学)
特征学习
计算机视觉
代表(政治)
数学
哲学
语言学
几何学
政治
政治学
法学
作者
Qin Yu,Dong Zhang,Liyan Zhang,Jinhui Tang
出处
期刊:IEEE transactions on image processing
[Institute of Electrical and Electronics Engineers]
日期:2023-01-01
卷期号:32: 4341-4354
被引量:31
标识
DOI:10.1109/tip.2023.3297408
摘要
The visual feature pyramid has shown its superiority in both effectiveness and efficiency in a variety of applications. However, current methods overly focus on inter-layer feature interactions while disregarding the importance of intra-layer feature regulation. Despite some attempts to learn a compact intra-layer feature representation with the use of attention mechanisms or vision transformers, they overlook the crucial corner regions that are essential for dense prediction tasks. To address this problem, we propose a Centralized Feature Pyramid (CFP) network for object detection, which is based on a globally explicit centralized feature regulation. Specifically, we first propose a spatial explicit visual center scheme, where a lightweight MLP is used to capture the globally long-range dependencies, and a parallel learnable visual center mechanism is used to capture the local corner regions of the input images. Based on this, we then propose a globally centralized regulation for the commonly-used feature pyramid in a top-down fashion, where the explicit visual center information obtained from the deepest intra-layer feature is used to regulate frontal shallow features. Compared to the existing feature pyramids, CFP not only has the ability to capture the global long-range dependencies but also efficiently obtain an all-round yet discriminative feature representation. Experimental results on the challenging MS-COCO validate that our proposed CFP can achieve consistent performance gains on the state-of-the-art YOLOv5 and YOLOX object detection baselines.
科研通智能强力驱动
Strongly Powered by AbleSci AI