Journal: IEEE Transactions on Intelligent Transportation Systems [Institute of Electrical and Electronics Engineers]; Date: 2024-03-18; Volume (Issue): 25 (9), pp. 12163-12175; Citations: 2
Occlusion is a major challenge for LiDAR-based object detection methods, as it renders regions of interest unobservable to the ego vehicle. A proposed solution to this problem is collaborative perception via Vehicle-to-Everything (V2X) communication, which leverages the diverse perspectives of connected agents (vehicles and intelligent roadside units) at multiple locations to form a complete scene representation. The major challenge of V2X collaboration is the performance-bandwidth tradeoff, which raises two questions: 1) which information should be exchanged over the V2X network, and 2) how the exchanged information should be fused. The current state of the art resorts to the mid-collaboration approach, in which Bird's-Eye View (BEV) images of point clouds are communicated to enable deep interaction among connected agents while reducing bandwidth consumption. Although they achieve strong performance, the real-world deployment of most mid-collaboration approaches is hindered by their overly complicated architectures and unrealistic assumptions about inter-agent synchronization. In this work, we devise a simple yet effective collaboration method based on exchanging the outputs of each agent, which achieves a better bandwidth-performance tradeoff while minimising the required changes to the single-vehicle detection models. Moreover, we relax the assumptions about inter-agent synchronization made by existing state-of-the-art approaches, requiring only a common time reference among connected agents, which can be achieved in practice using GPS time. Experiments on the V2X-Sim dataset show that our collaboration method reaches a mean average precision of 76.72, which is 99% of the performance of the early collaboration method, while consuming as much bandwidth as late collaboration (0.01 MB on average). The code will be released at https://github.com/quan-dao/practical-collab-perception.
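To make the idea of exchanging per-agent outputs under a common time reference more concrete, the following is a minimal illustrative sketch and not the released implementation. It assumes that each agent shares 2D box centers with a velocity estimate and a timestamp on the shared clock, that agent poses in a common world frame are known, and that a constant-velocity model is adequate over short delays. The names `Detection`, `Pose2D`, `align_to_ego`, and the `max_age` threshold are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pose2D:
    """Planar pose of an agent in a shared world frame (assumed known)."""
    x: float
    y: float
    yaw: float


@dataclass
class Detection:
    """One detected object, expressed in the frame of the agent that saw it."""
    x: float      # box center (m)
    y: float
    yaw: float    # heading (rad)
    vx: float     # velocity estimate along the frame axes (m/s)
    vy: float
    score: float
    stamp: float  # detection time on the common clock, e.g. GPS time (s)


def _wrap(angle: float) -> float:
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def align_to_ego(det: Detection, sender: Pose2D, ego: Pose2D,
                 now: float, max_age: float = 0.5) -> Optional[Detection]:
    """Propagate a received detection to time `now` and express it in the ego frame.

    Assumption: `sender` is the sender's pose at `det.stamp` and `ego` is the
    ego pose at `now`. Detections older than `max_age` seconds are dropped.
    """
    age = now - det.stamp
    if age < 0.0 or age > max_age:
        return None

    # Constant-velocity propagation in the sender's frame.
    px = det.x + det.vx * age
    py = det.y + det.vy * age

    # Sender frame -> world frame.
    cs, ss = math.cos(sender.yaw), math.sin(sender.yaw)
    wx = sender.x + cs * px - ss * py
    wy = sender.y + ss * px + cs * py

    # World frame -> ego frame.
    ce, se = math.cos(ego.yaw), math.sin(ego.yaw)
    dx, dy = wx - ego.x, wy - ego.y
    ex = ce * dx + se * dy
    ey = -se * dx + ce * dy

    # Heading and velocity only need the relative rotation between frames.
    rel = sender.yaw - ego.yaw
    cr, sr = math.cos(rel), math.sin(rel)
    return Detection(
        x=ex, y=ey, yaw=_wrap(det.yaw + rel),
        vx=cr * det.vx - sr * det.vy,
        vy=sr * det.vx + cr * det.vy,
        score=det.score, stamp=now,
    )
```

Because only a handful of numbers per object are exchanged in such a scheme, the message size stays small compared with sending point clouds or BEV feature maps, which is the intuition behind the bandwidth figure quoted above.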