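Artificial intelligence
Ground truth
Computer science
Epipolar geometry
Event (particle physics)
Computer vision
Pattern recognition (psychology)
Matching (statistics)
Unsupervised learning
Image warping
Feature (linguistics)
Image (mathematics)
Mathematics
Linguistics
Quantum mechanics
Statistics
Physics
Philosophy
Authors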
S. M. Nadim Uddin, Soikat Hasan Ahmed, Yong Ju Jung
Identifier
DOI:10.1109/tcsvt.2022.3189480
Abstract
Bio-inspired event cameras have been considered effective alternatives to traditional frame-based cameras for stereo depth estimation, especially in challenging conditions such as low-light or high-speed environments. Recently, deep learning-based supervised event stereo matching methods have achieved significant performance improvements over the traditional event stereo methods. However, the supervised methods depend on ground-truth disparity maps for training, and it is difficult to secure a large amount of ground-truth disparity maps. A feasible alternative is to devise an unsupervised event stereo method that can be trained without ground-truth disparity maps. To this end, we propose the first unsupervised event stereo matching method that can predict dense disparity maps, and is trained by transforming the depth estimation problem into a warping-based reconstruction problem. We propose a novel unsupervised loss function that enforces the network to minimize the feature-level epipolar correlation difference between the ground-truth intensity images and warped images. Moreover, we propose a novel event embedding mechanism that utilizes both temporal and spatial neighboring events to capture spatio-temporal relationships among the events for stereo matching. Experimental results reveal that the proposed method outperforms the baseline unsupervised methods by significant margins (e.g., up to 16.88% improvement) and achieves comparable results with the existing supervised methods. Extensive ablation studies validate the efficacy of the proposed modules and architectural choices.
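The warping-based reconstruction idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' loss: their method compares feature-level epipolar correlations, whereas the sketch below uses a plain pixel-level L1 photometric error, which is the generic form of the technique. The function names `warp_right_to_left` and `reconstruction_loss`, the rectified-stereo sign convention (a left pixel at column x appears at column x - d in the right image), and the linear interpolation are all assumptions made for illustration.

```python
import numpy as np

def warp_right_to_left(right, disparity):
    """Reconstruct the left view by sampling the right image at x - d.

    Assumes rectified stereo, so correspondences lie on the same row.
    `right` is an (H, W) array; `disparity` is (H, W) or a scalar.
    """
    h, w = right.shape
    # Source column in the right image for every left-image pixel.
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity
    xs = np.clip(xs, 0.0, w - 1.0)          # clamp samples that fall outside the image
    x0 = np.floor(xs).astype(int)            # left neighbour column
    x1 = np.minimum(x0 + 1, w - 1)           # right neighbour column
    frac = xs - x0                           # linear interpolation weight
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]

def reconstruction_loss(left, right, disparity):
    """Mean absolute error between the left image and its warped reconstruction."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disparity))))

# Toy example: a synthetic pair with a constant true disparity of 3 pixels.
h, w, d = 4, 16, 3
left = np.tile(np.arange(w, dtype=np.float64) ** 2, (h, 1))
right = np.roll(left, -d, axis=1)            # right[:, x - d] == left[:, x]

loss_correct = reconstruction_loss(left, right, np.full((h, w), float(d)))
loss_wrong = reconstruction_loss(left, right, np.zeros((h, w)))
```

In this toy setup the correct disparity yields a lower loss than a zero disparity, so minimizing the reconstruction error pushes the disparity estimate toward the true value without any ground-truth disparity map. In a trainable network the warp would need to be differentiable with respect to the disparity, which the linear interpolation above is meant to suggest.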