Self-supervised learning for RGB-D object tracking

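Artificial intelligence, RGB color model, Computer science, Computer vision, Tracking (education), BitTorrent tracker, Eye movement, Backbone network, Supervised learning, Artificial neural network, Psychology, Pedagogy, Computer network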

Authors

Xuefeng Zhu, Tianyang Xu, Sara Atito, Muhammad Awais, Xiao-Jun Wu, Zhenhua Feng, Josef Kittler

Source

Journal: Pattern Recognition [Elsevier BV]
Date: 2024-04-28 Volume/Issue: 155: 110543-110543 Citations: 2

Identifier

DOI: 10.1016/j.patcog.2024.110543

Abstract

Recently, there has been a growing interest in RGB-D object tracking thanks to its promising performance achieved by combining visual information with auxiliary depth cues. However, the limited volume of annotated RGB-D tracking data for offline training has hindered the development of a dedicated end-to-end RGB-D tracker design. Consequently, the current state-of-the-art RGB-D trackers mainly rely on the visual branch to support the appearance modelling, with the depth map utilised for elementary information fusion or failure reasoning of online tracking. Despite the achieved progress, the current paradigms for RGB-D tracking have not fully harnessed the inherent potential of depth information, nor fully exploited the synergy of vision-depth information. Considering the availability of ample unlabelled RGB-D data and the advancement in self-supervised learning, we address the problem of self-supervised learning for RGB-D object tracking. Specifically, an RGB-D backbone network is trained on unlabelled RGB-D datasets using masked image modelling. To train the network, the masking mechanism creates a selective occlusion of the input visible image to force the corresponding aligned depth map to help with discerning and learning vision-depth cues for the reconstruction of the masked visible image. As a result, the pre-trained backbone network is capable of cooperating with crucial visual and depth features of the diverse objects and background in the RGB-D image. The intermediate RGB-D features output by the pre-trained network can effectively be used for object tracking. We thus embed the pre-trained RGB-D network into a transformer-based tracking framework for stable tracking. Comprehensive experiments and the analysis of the results obtained on several RGB-D tracking datasets demonstrate the effectiveness and superiority of the proposed RGB-D self-supervised learning framework and the following tracking approach.
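The masking step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the patch size, the mask ratio, how masked pixels are filled, or how the two modalities are fed to the network, so the values below (16-pixel patches, 75% masking, zero fill, channel concatenation) and the function names `mask_rgb_patches` and `masked_reconstruction_loss` are illustrative assumptions. The sketch only shows the idea that the visible image is occluded while the aligned depth map is left intact, and that the reconstruction error is measured on the occluded region.

```python
import numpy as np


def mask_rgb_patches(rgb, patch=16, mask_ratio=0.75, rng=None):
    """Zero out a random subset of non-overlapping patches of an RGB image.

    Assumed values (not from the paper): patch size, mask ratio, zero fill.
    Returns the masked image and a boolean (H, W) mask, True where occluded.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, _ = rgb.shape
    assert h % patch == 0 and w % patch == 0, "image must tile into patches"
    gh, gw = h // patch, w // patch

    # Pick which patches to occlude, without replacement.
    n_masked = int(round(mask_ratio * gh * gw))
    flat = np.zeros(gh * gw, dtype=bool)
    flat[rng.choice(gh * gw, size=n_masked, replace=False)] = True
    grid = flat.reshape(gh, gw)

    # Expand the patch grid to pixel resolution.
    pixel_mask = np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)

    masked = rgb.copy()
    masked[pixel_mask] = 0.0  # zero all channels of occluded pixels
    return masked, pixel_mask


def masked_reconstruction_loss(pred, target, pixel_mask):
    """Mean squared error computed only over the occluded RGB pixels."""
    diff = (pred - target)[pixel_mask]
    return float(np.mean(diff ** 2))


# Toy data standing in for an aligned RGB-D frame.
rng = np.random.default_rng(0)
rgb = rng.random((64, 64, 3))
depth = rng.random((64, 64, 1))

masked_rgb, pixel_mask = mask_rgb_patches(rgb, rng=rng)

# The depth map is passed through unmasked next to the occluded RGB image;
# channel concatenation is one simple (assumed) way to combine them.
net_input = np.concatenate([masked_rgb, depth], axis=-1)
print(net_input.shape)
```

In a real pre-training loop, a network would take `net_input` and predict the full RGB image, with `masked_reconstruction_loss` applied between its prediction and `rgb`.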
