Point cloud
Computer science
Artificial intelligence
LiDAR
Object detection
Computer vision
Feature
Metric
Sensor fusion
Modality (human–computer interaction)
Pattern recognition
Remote sensing
Engineering
Authors
Chen Mu, Pengfei Liu, Huaici Zhao
DOI
10.1016/j.knosys.2023.110952
Abstract
As the two major data modalities in autonomous driving, LiDAR point clouds and RGB images provide rich geometric cues and semantic features, respectively. Compared with using a single modality, fusing the two can supply complementary information for the 3D object detection task. However, several prevalent cross-modal methods (Vora et al., 2020; Huang et al., 2020; Sindagi et al., 2019) cannot effectively extract favorable information and adopt only a unilateral fusion mechanism. In this paper, we propose a novel fusion strategy named Bilateral Content Awareness Fusion (BCAF) to address these issues. Specifically, BCAF adopts a two-stream structure consisting of a LiDAR Content Awareness (LCA) branch and an Image Content Awareness (ICA) branch, along with a Soft Fusion (SF) module. First, the LCA and ICA branches enhance instance-relevant cues. Then, from the two awareness features they produce, aggregation features are generated to select favorable image features and LiDAR features. Finally, the SF module fuses the bilateral favorable features and outputs the cross-modal feature. We evaluate our method on the KITTI dataset for both 3D object detection and bird's-eye-view detection. Compared with the previous state-of-the-art method, our approach achieves significant improvements; in particular, on the Car category it obtains gains of 0.5 and 0.62 mean Average Precision (mAP) for the 3D object detection and bird's-eye-view tasks, respectively.
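The abstract does not give the exact equations of BCAF, but the described flow (each branch scores content relevance, the scores select "favorable" features from the other modality, and the SF module softly combines the two gated streams) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the awareness maps are taken to be sigmoid-activated linear projections, the cross-modal gating direction (each modality's awareness weighting the other's features) is a guess, and all function names, shapes, and weights are hypothetical, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def content_awareness(features, w, b):
    # Hypothetical awareness head: a per-location score in [0, 1]
    # from a linear projection over the channel axis (a 1x1 conv
    # in the original network, presumably).
    return sigmoid(features @ w + b)  # shape (N, 1)

def bcaf_fuse(lidar_feat, image_feat, w_l, b_l, w_i, b_i):
    # Sketch of Bilateral Content Awareness Fusion (assumption):
    # each branch scores its own modality, the score gates the
    # OTHER modality's features, and Soft Fusion sums the two
    # gated streams into one cross-modal feature.
    a_l = content_awareness(lidar_feat, w_l, b_l)  # LCA awareness
    a_i = content_awareness(image_feat, w_i, b_i)  # ICA awareness
    favorable_image = a_l * image_feat  # LiDAR-guided image features
    favorable_lidar = a_i * lidar_feat  # image-guided LiDAR features
    return favorable_lidar + favorable_image       # soft fusion

rng = np.random.default_rng(0)
n, c = 4, 8  # toy sizes: 4 locations, 8 channels
lidar = rng.standard_normal((n, c))
image = rng.standard_normal((n, c))
w_l, w_i = rng.standard_normal((c, 1)), rng.standard_normal((c, 1))
fused = bcaf_fuse(lidar, image, w_l, 0.0, w_i, 0.0)
print(fused.shape)  # (4, 8)
```

Because the awareness scores lie in [0, 1], the fusion is a soft (weighted) selection rather than a hard mask, which matches the "Soft Fusion" naming; the bilateral aspect is that both modalities gate each other, in contrast to the unilateral schemes the paper criticizes.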