Artificial intelligence
Computer vision
Computer science
Object detection
Cognitive neuroscience of visual object recognition
Visualization
Object (grammar)
Pattern recognition (psychology)
Authors
Lei Yang,Tao Tang,Jun Li,Kun Yuan,Kai Wu,Peng Chen,Li Wang,Yi Huang,Lei Li,Xinyu Zhang,Kaicheng Yu
Identifier
DOI:10.1109/tpami.2025.3549711
Abstract
While most recent autonomous driving systems focus on developing perception methods for ego-vehicle sensors, an alternative approach, leveraging intelligent roadside cameras to extend perception beyond the visual range, tends to be overlooked. We discover that state-of-the-art vision-centric detection methods perform poorly on roadside cameras. This is because these methods mainly focus on recovering the depth with respect to the camera center, where the depth difference between the car and the ground quickly shrinks as the distance increases. In this paper, we propose a simple yet effective approach, dubbed BEVHeight++, to address this issue. In essence, we regress the height to the ground to achieve a distance-agnostic formulation and thereby ease the optimization of camera-only perception methods. By incorporating both height and depth encoding techniques, we achieve a more accurate and robust projection from 2D to BEV space. On popular 3D detection benchmarks for roadside cameras, our method surpasses all previous vision-centric methods by a significant margin. In the ego-vehicle scenario, BEVHeight++ surpasses depth-only methods with gains of +2.8% NDS and +1.7% mAP on the nuScenes test set, and even larger gains of +9.3% NDS and +8.8% mAP on the nuScenes-C benchmark with object-level distortion. Consistent and substantial improvements are also achieved on the KITTI, KITTI-360, and Waymo datasets. The code is available at https://github.com/yanglei18/BEVHeight_Plus.
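The core idea stated in the abstract, regressing a pixel's height above the ground and then lifting it to 3D, can be illustrated geometrically: with known roadside camera intrinsics and camera-to-ground extrinsics, a pixel's viewing ray is intersected with the horizontal plane at the predicted height. The following is a minimal sketch of that projection step under assumed conventions (ground frame with z up, camera-to-ground rotation R_cg, camera position t_cg), not the BEVHeight++ implementation; the function name and the toy numbers are hypothetical.

import numpy as np

def lift_pixel_with_height(u, v, height, K, R_cg, t_cg):
    # Back-project pixel (u, v) to a ground-frame 3D point, given a predicted
    # height above the ground plane (the quantity regressed instead of depth).
    # K    : 3x3 camera intrinsics
    # R_cg : 3x3 rotation from the camera frame to the ground frame
    # t_cg : camera position in the ground frame (t_cg[2] is the mounting height)
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray, camera frame
    ray_ground = R_cg @ ray_cam                         # same ray, ground frame
    # Choose the scale s so that the point's z-coordinate equals the predicted
    # height: (t_cg + s * ray_ground)[2] = height. Requires a non-horizontal ray.
    s = (height - t_cg[2]) / ray_ground[2]
    return t_cg + s * ray_ground                        # (x, y, height) in the ground frame

if __name__ == "__main__":
    # Toy roadside setup: camera mounted 6 m above the road, pitched 30 deg downward.
    K = np.array([[1000.0, 0.0, 960.0],
                  [0.0, 1000.0, 540.0],
                  [0.0, 0.0, 1.0]])
    p = np.deg2rad(30.0)
    # Columns are the camera axes expressed in the ground frame (x right, y forward,
    # z up): image-x maps to ground-x, image-y points roughly downward, and the
    # optical axis is tilted toward the road surface.
    R_cg = np.array([[1.0, 0.0, 0.0],
                     [0.0, -np.sin(p), np.cos(p)],
                     [0.0, -np.cos(p), -np.sin(p)]])
    t_cg = np.array([0.0, 0.0, 6.0])
    # A pixel whose 3D point is predicted to lie 1.5 m above the road (e.g. a car roof).
    print(lift_pixel_with_height(1100.0, 700.0, 1.5, K, R_cg, t_cg))

Because a car's height above the road stays roughly constant no matter how far it is from the camera, this regression target is distance-agnostic, which is the motivation the abstract gives for preferring height over depth on roadside cameras.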