Topics
Computer science
Pattern
Artificial intelligence
Non-line-of-sight propagation
Sensor fusion
Inference
Deep learning
Domain (mathematical analysis)
Computer vision
Machine learning
Wireless
Social science
Telecommunications
Sociology
Mathematical analysis
Mathematics
Authors
Debashri Roy, Yuanyuan Li, Tong Jian, Peng Tian, Kaushik Chowdhury, Stratis Ioannidis
Identifier
DOI: 10.1109/TMM.2022.3145663
Abstract
With the recent surge in autonomous driving vehicles, the need for accurate vehicle detection and tracking is more critical than ever. Detecting vehicles from visual sensors fails in non-line-of-sight (NLOS) settings. This can be compensated for by including other modalities in a multi-domain sensing environment. We propose several deep learning based frameworks for fusing different modalities (image, radar, acoustic, seismic) through the exploitation of complementary latent embeddings, incorporating multiple state-of-the-art fusion strategies. Our proposed fusion frameworks considerably outperform unimodal detection. Moreover, fusion between image and non-image modalities improves vehicle tracking and detection under NLOS conditions. We validate our models on the real-world multimodal ESCAPE dataset, showing a 33.16% improvement in vehicle detection by fusion (over visual inference alone) on test scenarios with 30-42% NLOS conditions. To demonstrate how well our framework generalizes, we also validate our models on the multimodal NuScenes dataset, showing a $\sim$22% improvement over competing methods.
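The abstract describes fusing image, radar, acoustic, and seismic modalities by combining per-modality latent embeddings in a deep network. Below is a minimal illustrative sketch of that general idea in PyTorch. It is not the authors' actual architecture: the layer sizes, the per-modality feature dimensions, and the simple concatenation-based fusion head are all assumptions made for demonstration.

```python
# Minimal sketch of multimodal fusion via latent embeddings.
# NOT the paper's architecture; all dimensions and the fusion
# strategy (concatenation) are illustrative assumptions.
import torch
import torch.nn as nn


class ModalityEncoder(nn.Module):
    """Maps one modality's input features to a shared latent embedding."""

    def __init__(self, in_dim: int, latent_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256),
            nn.ReLU(),
            nn.Linear(256, latent_dim),
            nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class FusionDetector(nn.Module):
    """Concatenates the latent embeddings of all modalities and
    predicts a binary vehicle-detection logit."""

    def __init__(self, modality_dims: dict, latent_dim: int = 128):
        super().__init__()
        self.encoders = nn.ModuleDict(
            {name: ModalityEncoder(d, latent_dim) for name, d in modality_dims.items()}
        )
        self.head = nn.Sequential(
            nn.Linear(latent_dim * len(modality_dims), 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # logit: vehicle present / absent
        )

    def forward(self, inputs: dict) -> torch.Tensor:
        # Encode each modality into the shared latent space, then fuse.
        z = [self.encoders[name](x) for name, x in inputs.items()]
        return self.head(torch.cat(z, dim=-1))


# Hypothetical flattened feature sizes per modality (assumptions).
dims = {"image": 512, "radar": 64, "acoustic": 128, "seismic": 128}
model = FusionDetector(dims)
batch = {name: torch.randn(4, d) for name, d in dims.items()}
logits = model(batch)  # shape: (4, 1)
```

In a setup like this, an NLOS-degraded image embedding can be offset by the radar, acoustic, and seismic embeddings at the fusion stage, which is the complementarity the abstract points to; the paper itself evaluates several more sophisticated fusion strategies than plain concatenation.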