Authors
Xuefeng Zhu, Tianyang Xu, Xiaojun Wu, Josef Kittler
Identifier
DOI:10.1016/j.patrec.2024.02.007
Abstract
Existing RGB-D tracking algorithms improve performance by building typical appearance models on top of RGB-only tracking frameworks, making no attempt to exploit the complementary visual information in the multi-modal input. This paper addresses this deficit and presents a novel algorithm that boosts RGB-D tracking performance by exploiting collaborative cues. To guarantee input consistency, depth images are encoded into the three-channel HHA representation, giving them a structure similar to RGB images so that deep CNN features can be extracted from both modalities. To highlight the discriminative information in the multi-modal features, a feature enhancement module based on a cross-attention strategy is proposed. With the attention map produced by the proposed cross-attention method, the target area of the features is enhanced and the negative influence of the background is suppressed. In addition, we address potential tracking failures by introducing a long-term tracking mechanism. Experimental results on the well-known benchmark datasets PTB, STC, and CDTB demonstrate the superiority of the proposed RGB-D tracker. On PTB, the proposed method achieves the highest AUC scores among the compared trackers in scenarios with five distinct challenging attributes. On STC and CDTB, our FECD obtains an overall AUC of 0.630 and an F-score of 0.630, respectively.
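The cross-attention feature enhancement described in the abstract can be sketched roughly as follows. This is a minimal, hypothetical illustration only: the function name, feature shapes, and the residual formulation are assumptions, since the abstract does not specify the exact architecture of the FECD module.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_enhance(f_rgb, f_depth):
    """Enhance RGB features with an attention map derived from depth features.

    f_rgb, f_depth: (N, C) arrays of flattened spatial features
    (N spatial positions, C channels). Hypothetical shapes; the
    paper's actual module may differ.
    """
    # Attention map: scaled similarity between RGB queries and depth keys.
    scores = f_rgb @ f_depth.T / np.sqrt(f_rgb.shape[1])   # (N, N)
    attn = softmax(scores, axis=-1)
    # Aggregate depth cues and add them back to the RGB stream
    # (residual connection), amplifying likely target regions.
    return f_rgb + attn @ f_depth

rng = np.random.default_rng(0)
f_rgb = rng.standard_normal((16, 8))    # e.g. 4x4 feature map, 8 channels
f_depth = rng.standard_normal((16, 8))  # HHA-encoded depth features
enhanced = cross_attention_enhance(f_rgb, f_depth)
print(enhanced.shape)  # (16, 8)
```

In this sketch one modality attends over the other, so background positions with low cross-modal similarity receive little reinforcement, which mirrors the abstract's claim that the attention map suppresses background influence.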