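Artificial intelligence
Computer science
Video tracking
Object detection
Computer vision
Object (grammar)
Eye movement
Task (project management)
Tracking (education)
Class (philosophy)
Cognitive neuroscience of visual object recognition
Constraint (computer-aided design)
Pattern recognition (psychology)
Mathematics
Economics
Management
Pedagogy
Psychology
Geometry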
Authors
Huai Qin, Changqian Yu, Changxin Gao, Nong Sang
Identifier
DOI: 10.1016/j.patcog.2022.108544
Abstract
Object detection methods are drawing increasing attention in deep-learning-based visual tracking algorithms because of their robust discrimination and powerful regression ability. To further explore the potential of object detection methods in the visual tracking task, two gaps need to be bridged. The first is the difference in object definition: object detection is class-specific, whereas visual tracking is class-agnostic and must also differentiate the target from intra-class distractors. The second is the difference in the temporal dimension: object detection processes still images, whereas visual tracking concentrates on objects that vary continuously over time. In this paper, we propose a Detection to Tracking (D2T) framework that addresses these issues and effectively transfers existing advanced detection methods to the visual tracking task. Specifically, to bridge the gap in object definition, we propose a general-to-specific network that separates the learning of general object features from the learning of instance-level features. To make full use of contextual information while adapting to the appearance variation of targets, we propose a temporal strategy that combines a short-term constraint with long-term updating. To the best of our knowledge, our D2T framework is the first universal framework that directly transfers deep-learning-based object detectors to the visual tracking task. It provides a novel solution to visual object tracking and achieves superior performance on several public datasets.