激光雷达
计算机科学
目标检测
人工智能
计算机视觉
探测器
传感器融合
融合
图像传感器
对象(语法)
模式识别(心理学)
遥感
语言学
电信
地质学
哲学
作者
Su Pang,Daniel Morris,Hayder Radha
标识
DOI:10.1109/wacv51458.2022.00380
摘要
When compared to single modality approaches, fusion-based object detection methods often require more complex models to integrate heterogeneous sensor data, and use more GPU memory and computational resources. This is particularly true for camera-LiDAR based multimodal fusion, which may require three separate deep-learning networks and/or processing pipelines that are designated for the visual data, LiDAR data, and for some form of a fusion framework. In this paper, we propose Fast Camera-LiDAR Object Candidates (Fast-CLOCs) fusion network that can run high-accuracy fusion-based 3D object detection in near real-time. Fast-CLOCs operates on the output candidates before Non-Maximum Suppression (NMS) of any 3D detector, and adds a lightweight 3D detector-cued 2D image detector (3D-Q-2D) to extract visual features from the image domain to improve 3D detections significantly. The 3D detection candidates are shared with the proposed 3D-Q-2D image detector as proposals to reduce the network complexity drastically. The superior experimental results of our Fast-CLOCs on the challenging KITTI and nuScenes datasets illustrate that our Fast-CLOCs outperforms state-of-the-art fusion-based 3D object detection approaches. We will release the code upon publication.
科研通智能强力驱动
Strongly Powered by AbleSci AI