Computer science
Artificial intelligence
Lidar
Computer vision
Point cloud
Object detection
Feature (linguistics)
Leverage (statistics)
Pattern recognition (psychology)
Remote sensing
Linguistics
Geology
Philosophy
Authors
Chen Mu,Pengfei Liu,Huaici Zhao
Identifiers
DOI:10.1016/j.engappai.2022.105815
Abstract
Recent progress in autonomous driving seeks to leverage the strong complementarity of LiDAR point clouds and RGB images to realize highly efficient 3D object detection. However, some works simply decorate the raw point clouds or point-cloud features with camera cues in a hard-fusion manner, which cannot fully exploit the relevance between the two modalities. In this paper, we propose a dual-feature interaction module that adopts a soft-fusion strategy to guide LiDAR-camera feature fusion by interacting the LiDAR and camera features with a Transformer. Compared with the hard-fusion method, this soft-fusion method decorates the LiDAR feature with a reliable image feature. Additionally, we design an uncertainty-based 3D Intersection over Union (IoU) metric for the training process. This strategy models the unreliability of 3D IoU scores to alleviate the adverse effects caused by the coupling of 3D box properties. Experiments conducted on the KITTI dataset show significant improvements over prior art in both the 3D object detection and bird's-eye-view tasks. In particular, for 3D object detection, our approach obtains gains of 0.68 and 0.45 in AP3D on the moderate and hard difficulty levels, respectively.
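The abstract's soft-fusion idea, interacting LiDAR and camera features through a Transformer so that each LiDAR feature is decorated with a learned, attention-weighted image feature rather than a hard-copied one, can be sketched with cross-attention. The module below is a minimal illustration, not the authors' implementation: the class name, dimensions, and the concatenate-then-project fusion step are all assumptions for demonstration.

```python
import torch
import torch.nn as nn

class DualFeatureInteractionSketch(nn.Module):
    """Hypothetical sketch of Transformer-based soft fusion:
    LiDAR features act as queries and attend to camera features,
    then the attended image feature 'decorates' the LiDAR feature.
    (Names and fusion details are illustrative assumptions.)"""

    def __init__(self, dim: int = 64, num_heads: int = 4):
        super().__init__()
        # Cross-attention: queries = LiDAR features, keys/values = camera features.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Combine each LiDAR feature with its attended image feature.
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, lidar_feat: torch.Tensor, cam_feat: torch.Tensor) -> torch.Tensor:
        # lidar_feat: (B, N_points, dim); cam_feat: (B, N_pixels, dim)
        attended, _ = self.cross_attn(lidar_feat, cam_feat, cam_feat)
        return self.fuse(torch.cat([lidar_feat, attended], dim=-1))

# Usage: 100 point features query 300 image-patch features.
module = DualFeatureInteractionSketch(dim=64)
lidar = torch.randn(2, 100, 64)
camera = torch.randn(2, 300, 64)
out = module(lidar, camera)
print(out.shape)  # torch.Size([2, 100, 64])
```

The attention weights act as the "soft" part: unreliable image regions receive low weight, so the LiDAR feature is decorated only with image evidence the model deems relevant, in contrast to hard fusion that copies a fixed pixel value per point.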