Practical Collaborative Perception: A Framework for Asynchronous and Multi-Agent 3D Object Detection

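Asynchronous communication, Computer science, Object detection, Perception, Human-computer interaction, Artificial intelligence, Computer vision, Psychology, Computer network, Pattern recognition (psychology), Neuroscience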

Authors

Minh-Quan Dao,Julie Stephany Berrío,Vincent Frémont,Mao Shan,Elwan Héry,Stewart Worrall

Source

Journal: IEEE Transactions on Intelligent Transportation Systems [Institute of Electrical and Electronics Engineers]
Date: 2024-03-18 Volume (Issue): 25 (9): 12163-12175 Citations: 2

Links

inria.hal.science arxiv.org datacite.org doi.org

Identifiers

DOI: 10.1109/tits.2024.3371177

Abstract

Occlusion is a major challenge for LiDAR-based object detection methods because it renders regions of interest unobservable to the ego vehicle. A proposed solution to this problem is collaborative perception via Vehicle-to-Everything (V2X) communication, which leverages the diverse perspectives of connected agents (vehicles and intelligent roadside units) at multiple locations to form a complete scene representation. The major challenge of V2X collaboration is the performance-bandwidth tradeoff, which raises two questions: 1) which information should be exchanged over the V2X network, and 2) how the exchanged information should be fused. The current state of the art settles on the mid-collaboration approach, in which Bird's-Eye View (BEV) images of point clouds are communicated to enable deep interaction among connected agents while reducing bandwidth consumption. While these approaches achieve strong performance, the real-world deployment of most of them is hindered by overly complicated architectures and unrealistic assumptions about inter-agent synchronization. In this work, we devise a simple yet effective collaboration method based on exchanging the outputs from each agent, which achieves a better bandwidth-performance tradeoff while minimising the required changes to the single-vehicle detection models. Moreover, we relax the assumptions about inter-agent synchronization used in existing state-of-the-art approaches to require only a common time reference among connected agents, which can be achieved in practice using GPS time. Experiments on the V2X-Sim dataset show that our collaboration method reaches 76.72 mean average precision, which is 99% of the performance of the early collaboration method, while consuming as much bandwidth as late collaboration (0.01 MB on average). The code will be released at https://github.com/quan-dao/practical-collab-perception.
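To make the idea of output-level exchange under a shared clock more concrete, the following is a minimal, generic sketch and not the authors' method or code. It assumes each agent broadcasts 2D box centers with a timestamp on a common time reference, and that the ego vehicle knows every agent's pose in a shared world frame. All names (`Detection`, `Pose2D`, `receive`, `max_age`) are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    x: float      # box center in the owner's frame (m)
    y: float
    yaw: float    # box heading in the owner's frame (rad)
    score: float  # detector confidence
    stamp: float  # capture time on the shared clock (s), an assumed field


@dataclass
class Pose2D:
    x: float    # agent position in the world frame (m)
    y: float
    yaw: float  # agent heading in the world frame (rad)


def to_world(det: Detection, pose: Pose2D) -> Detection:
    """Express a detection from an agent's frame in the world frame."""
    c, s = math.cos(pose.yaw), math.sin(pose.yaw)
    return Detection(pose.x + c * det.x - s * det.y,
                     pose.y + s * det.x + c * det.y,
                     det.yaw + pose.yaw, det.score, det.stamp)


def to_local(det: Detection, pose: Pose2D) -> Detection:
    """Express a world-frame detection in an agent's frame."""
    dx, dy = det.x - pose.x, det.y - pose.y
    c, s = math.cos(pose.yaw), math.sin(pose.yaw)
    return Detection(c * dx + s * dy, -s * dx + c * dy,
                     det.yaw - pose.yaw, det.score, det.stamp)


def receive(dets, sender_pose, ego_pose, now, max_age=0.5):
    """Keep detections no older than max_age seconds and move them into the ego frame.

    The age check only needs a common time reference between agents;
    max_age is an illustrative threshold, not a value from the paper.
    """
    fresh = [d for d in dets if 0.0 <= now - d.stamp <= max_age]
    return [to_local(to_world(d, sender_pose), ego_pose) for d in fresh]
```

In this sketch a message carries only a handful of floats per object, which is why output-level exchange is cheap in bandwidth compared with sending point clouds or BEV feature maps. How the received detections are then fused with the ego vehicle's own perception is left out here.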
