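Computer science
Artificial intelligence
Computer vision
Object detection
Pyramid (geometry)
Pedestrian
Backbone network
Pedestrian detection
Residual
Process (computing)
Feature (linguistics)
Pattern recognition (psychology)
Telecommunications
Algorithm
Physics
Philosophy
Engineering
Optics
Operating system
Linguistics
Transportation engineering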
Authors
Wanghao Mo, Wendong Zhang, Hongyang Wei, Ruyi Cao, Yan Ke, Yiwen Luo
Identifier
DOI:10.1016/j.engappai.2022.105705
Abstract
Gigapixel photography has advanced considerably in recent years and is increasingly used in remote sensing, video surveillance, and related applications. A gigapixel image covers a field of view on the order of a square kilometer, contains thousands of targets, and exhibits scale variation of up to 100 times. Target pose, scale, and occlusion vary so widely that most existing target detection algorithms cannot process such images directly. To address these problems, we propose PVDet (Towards Pedestrian and Vehicle Detection on Gigapixel-level images), a new multi-target pedestrian and vehicle detector for gigapixel-level images. First, DPRNet (Deformable deeP Residual Network) is designed as the backbone network to enlarge the effective receptive field and improve the feature representation of pose-varying and occluded targets. Then, PAFPN (Path Aggregation Feature Pyramid Network) is adopted to process the multi-scale features extracted by the backbone, strengthening multi-scale target modeling and the localization of small targets. Finally, the DyHead module is introduced to enhance the detection head's scale, spatial, and task awareness, further improving pedestrian and vehicle classification and localization. On the PANDA dataset, the proposed method improves pedestrian and vehicle detection in gigapixel-level images by 10.4 AP over the baseline and outperforms the other state-of-the-art detection methods compared. Experiments on the PASCAL VOC 2012 dataset further demonstrate the generalization capability and effectiveness of the proposed method.
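The path-aggregation idea behind the PAFPN stage can be illustrated with a toy sketch. This is not the authors' implementation: it assumes 1-D lists of numbers in place of multi-channel feature maps, element-wise addition in place of learned convolutional fusion, and hypothetical names (`upsample2x`, `downsample2x`, `pafpn`) chosen only for this example. It shows the data flow alone: a top-down pass followed by a bottom-up pass across pyramid levels.

```python
from typing import List

Feature = List[float]  # toy 1-D "feature map"; real maps are multi-channel 2-D tensors


def upsample2x(x: Feature) -> Feature:
    """Nearest-neighbour upsampling: repeat every element twice."""
    return [v for v in x for _ in range(2)]


def downsample2x(x: Feature) -> Feature:
    """Stride-2 subsampling, standing in for a stride-2 convolution."""
    return x[::2]


def add(a: Feature, b: Feature) -> Feature:
    """Element-wise fusion of two maps with the same resolution."""
    assert len(a) == len(b), "feature maps must share a resolution"
    return [p + q for p, q in zip(a, b)]


def pafpn(levels: List[Feature]) -> List[Feature]:
    """levels[0] is the finest map; each later level has half the length."""
    # Top-down pass: push coarse, semantic information into finer levels.
    top_down = [levels[-1]]
    for feat in reversed(levels[:-1]):
        top_down.insert(0, add(feat, upsample2x(top_down[0])))
    # Bottom-up pass: push fine localisation cues back into coarser levels.
    outputs = [top_down[0]]
    for feat in top_down[1:]:
        outputs.append(add(feat, downsample2x(outputs[-1])))
    return outputs


# Three hypothetical backbone outputs at decreasing resolution.
c3, c4, c5 = [1.0] * 8, [2.0] * 4, [3.0] * 2
p3, p4, p5 = pafpn([c3, c4, c5])
```

Each output level keeps the resolution of its input level while mixing in information from every other level, which is the property the abstract relies on for multi-scale modeling and small-target localization.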