Yuhao Jin, Xiaohui Zhu, Yong Yue, Eng Gee Lim, Wei Wang
Source
Journal: IEEE Sensors Journal [Institute of Electrical and Electronics Engineers] · Date: 2024-01-30 · Volume/Issue: 24 (7): 11080-11090 · Cited by: 1
Identifier
DOI:10.1109/jsen.2024.3357775
Abstract
Because millimeter-wave (MMW) radar can directly acquire the spatial positions and velocities of objects, and because it performs robustly in adverse weather conditions, it has been widely employed in autonomous driving. However, radar lacks specific semantic information. To address this limitation, we leverage the complementary strengths of camera and radar through feature-level fusion and propose a fully Transformer-based model for object detection in autonomous driving. Specifically, we introduce a novel radar representation method and propose two camera-radar fusion architectures based on Swin Transformer. We name our proposed model CR-DINO and conduct training and testing on the nuScenes dataset. In several ablation experiments, our best result was an mAP of 38.0%, surpassing other state-of-the-art camera-radar fusion object detection models.
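For readers unfamiliar with feature-level camera-radar fusion, the sketch below illustrates the general idea in PyTorch: radar detections are assumed to be rasterized into image-plane channels (e.g., depth and radial velocity), each modality is encoded separately, and the resulting feature maps are fused before a detection head. This is a minimal generic illustration under assumed shapes and module names (`SimpleFeatureFusion`, concatenation-based fusion); it is not the CR-DINO architecture, which the abstract states uses Swin Transformer backbones and Transformer-based fusion.

```python
import torch
import torch.nn as nn

class SimpleFeatureFusion(nn.Module):
    """Generic feature-level camera-radar fusion (illustrative only)."""
    def __init__(self, cam_channels=3, radar_channels=2, feat_dim=64):
        super().__init__()
        # Separate per-modality encoders (stand-ins for real backbones).
        self.cam_encoder = nn.Sequential(
            nn.Conv2d(cam_channels, feat_dim, 3, stride=2, padding=1),
            nn.ReLU(),
        )
        self.radar_encoder = nn.Sequential(
            nn.Conv2d(radar_channels, feat_dim, 3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Fuse by channel concatenation followed by a 1x1 projection.
        self.fuse = nn.Conv2d(2 * feat_dim, feat_dim, 1)

    def forward(self, image, radar_map):
        cam_feat = self.cam_encoder(image)          # (B, C, H/2, W/2)
        radar_feat = self.radar_encoder(radar_map)  # (B, C, H/2, W/2)
        fused = torch.cat([cam_feat, radar_feat], dim=1)
        return self.fuse(fused)  # fused features for a detection head

# Assumed input: radar points rasterized to 2 image-plane channels
# (e.g., depth and radial velocity); shapes are for illustration only.
model = SimpleFeatureFusion()
image = torch.randn(1, 3, 224, 224)
radar_map = torch.randn(1, 2, 224, 224)
features = model(image, radar_map)
print(features.shape)  # torch.Size([1, 64, 112, 112])
```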