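Computer science
Backbone network
Transformer
BitTorrent tracker
Artificial intelligence
Modal verb
Pattern recognition (psychology)
Fuse (electrical)
Feature learning
Data mining
Machine learning
Eye movement
Computer network
Chemistry
Voltage
Polymer chemistry
Electrical engineering
Engineering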
Authors
Mingzheng Feng,Jianbo Su
Identifier
DOI:10.1016/j.knosys.2022.108945
Abstract
Many Siamese-based RGBT trackers have been designed in recent years for fast tracking. However, the correlation operation they rely on is a local linear matching process, which can easily lose the semantic information that high-precision tracking requires. In this paper, we propose a strong cross-modal model based on the transformer for robust RGBT tracking. Specifically, a simple dual-flow convolutional network is designed to extract and fuse dual-modal features with comparatively low complexity. In addition, to enhance the feature representation and deepen the semantic features, we design a modal weight allocation strategy and a backbone feature extraction network based on a modified ResNet-50, respectively. An attention-based transformer feature fusion network is also adopted to improve long-distance feature association and reduce the loss of semantic information. Finally, a classification and regression subnetwork is used to accurately predict the state of the target. Extensive experiments on the RGBT234, RGBT210, GTOT and LasHeR datasets demonstrate better tracking performance than state-of-the-art RGBT trackers.
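The two ideas named in the abstract, weighting the two modalities and replacing local correlation with attention over all features, can be illustrated with a toy sketch. This is not the authors' network: the function names (`fuse_modalities`, `cross_attention`), the scalar weight `w_rgb`, and the feature values are all hypothetical, and the learned projections, multi-head structure, and ResNet-50 backbone are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_modalities(rgb, thermal, w_rgb):
    """Weighted sum of RGB and thermal feature vectors.

    w_rgb in [0, 1] is a hypothetical per-modality weight; the paper's
    modal weight allocation strategy is not specified in the abstract.
    """
    return [
        [w_rgb * r + (1.0 - w_rgb) * t for r, t in zip(f_rgb, f_th)]
        for f_rgb, f_th in zip(rgb, thermal)
    ]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: every query attends to every key.

    Unlike a single correlation score per location, each output is a
    weighted mix of all value vectors.
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in keys])
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values))
             for j in range(len(values[0]))]
        )
    return outputs


# Toy features: two template locations and one search location, 2 channels.
template_rgb = [[1.0, 0.0], [0.0, 1.0]]
template_thermal = [[0.0, 1.0], [1.0, 0.0]]
template = fuse_modalities(template_rgb, template_thermal, w_rgb=0.75)
search = [[1.0, 0.0]]
attended = cross_attention(search, template, template)
print(attended)
```

Here the search feature is the query and the fused template features serve as both keys and values, so the output blends information from every template location rather than from one matched window.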