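Feature (linguistics)
Artificial intelligence
Computer science
Weighting
Pyramid (geometry)
Pattern recognition (psychology)
Encoder
Global map
Saliency map
Computer vision
Fusion
Object (grammar)
Object detection
Pixel
Scale (ratio)
Image (mathematics)
Mathematics
Cartography
Geography
Radiology
Philosophy
Operating system
Robot
Medicine
Linguistics
Geometry
Authors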
Xing Xi, Yuanqing Wu, Canming Xia, Shenghuang He
Identifier
DOI:10.1016/j.imavis.2022.104466
Abstract
Scale features play a crucial role in object detectors, and existing methods typically rely on a feature pyramid built from multiple feature maps. This paper focuses on a single feature map and proposes an encoder, SFMF (single feature map fusion), that performs multi-scale feature fusion on that one map. A key technique underlying SFMF is a fine-grained weighting method that quickly discards unneeded pixel channels during fusion. YOLOF (You Only Look One-level Feature) with SFMF achieves 38.5 mAP with ResNet50 and 40.3 mAP with ResNet101, improving on the baseline by 0.8 and 0.5 mAP, respectively. Meta-ACON is used in the backbone to learn automatically whether or not to activate each neuron. With both Meta-ACON and SFMF, YOLOF achieves 39.1 and 40.4 mAP, surpassing the baseline by 1.4 and 0.6 mAP on COCO val-dev. In addition, YOLOF with SFMF achieves 54.8 mAP on an aircraft detection dataset, an absolute improvement of 4.9 mAP, at a slight cost in inference efficiency (1 FPS).
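The abstract reports each result as a score together with its gain over the baseline, so the baseline scores themselves are only implied. As a small arithmetic check, the sketch below recovers them by subtraction. The dictionary layout and the helper name `implied_baseline` are illustrative choices, not part of the paper; only the numbers come from the abstract.

```python
# Reported YOLOF results from the abstract: (mAP, gain over baseline).
# The keys are labels chosen for this sketch, not names used by the authors.
reported = {
    ("COCO", "ResNet50", "SFMF"): (38.5, 0.8),
    ("COCO", "ResNet101", "SFMF"): (40.3, 0.5),
    ("COCO", "ResNet50", "SFMF + Meta-ACON"): (39.1, 1.4),
    ("COCO", "ResNet101", "SFMF + Meta-ACON"): (40.4, 0.6),
    ("aircraft", "unspecified backbone", "SFMF"): (54.8, 4.9),
}


def implied_baseline(score, gain):
    """Return the baseline mAP implied by a reported score and its gain."""
    # Round to one decimal place, matching the precision of the abstract.
    return round(score - gain, 1)


baselines = {key: implied_baseline(*value) for key, value in reported.items()}

for key, value in baselines.items():
    print(key, "implied baseline mAP:", value)
```

Under this reading, both ResNet50 rows imply the same baseline, as do both ResNet101 rows, which suggests the two sets of gains are measured against the same YOLOF baselines.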