Keywords: artificial intelligence; computer vision; computer science; point cloud; pixel; robustness; object detection; encoder; image fusion; multi-modal fusion; feature; pattern recognition
Authors
Guotao Xie,Chen Zhi-yuan,Ming Gao,Manjiang Hu,Xiaohui Qin
Source
Journal: IEEE Transactions on Intelligent Transportation Systems
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-16
Volume/Issue: 25 (6): 5598-5611
Citations: 3
Identifier
DOI: 10.1109/tits.2023.3347078
Abstract
Multi-modal fusion can leverage both LiDAR and camera data to boost the robustness and performance of 3D object detection. However, it remains challenging to comprehensively exploit image information and to perform accurate interaction fusion of diverse features. In this paper, we propose a novel multi-modal framework, Point-Pixel Fusion for Multi-Modal 3D Object Detection (PPF-Det). PPF-Det consists of three submodules that address the above problems: Multi Pixel Perception (MPP), Shared Combined Point Feature Encoder (SCPFE), and Point-Voxel-Wise Triple Attention Fusion (PVW-TAF). First, MPP makes full use of image semantic information to mitigate the resolution mismatch between the point cloud and the image. In addition, SCPFE extracts point-cloud features and point-pixel features simultaneously as a preliminary step, reducing time consumption in 3D space. Lastly, PVW-TAF is a fine alignment fusion strategy that generates multi-level voxel-fused features based on an attention mechanism. Extensive experiments on the KITTI benchmark, conducted on September 24, 2023, demonstrate that our method achieves excellent performance.
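To illustrate the general idea of attention-weighted point-pixel fusion described in the abstract, here is a minimal sketch. It is not the authors' PVW-TAF implementation; the function name, the toy scoring rule (mean activation per modality), and the per-point softmax weighting are all assumptions made purely for illustration.

```python
import numpy as np

def attention_fuse(point_feats, pixel_feats):
    """Hypothetical sketch: fuse per-point LiDAR features with image (pixel)
    features sampled at the points' projections, using per-point softmax
    attention weights over the two modalities. Shapes: both inputs (N, C)."""
    # Toy modality scores: mean activation of each feature vector as a logit.
    logits = np.stack([point_feats.mean(axis=1),
                       pixel_feats.mean(axis=1)], axis=1)       # (N, 2)
    # Softmax over the two modalities, per point.
    w = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    # Weighted combination of the two feature streams.
    fused = w[:, :1] * point_feats + w[:, 1:] * pixel_feats     # (N, C)
    return fused
```

In a real detector, the logits would come from learned layers and the fusion would run at multiple voxel levels, but the weighted-sum structure is the same.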