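Artificial intelligence
Computer vision
Computer science
Matching (statistics)
Tracking (education)
Video tracking
Object detection
Object (grammar)
Pattern recognition (psychology)
Mathematics
Statistics
Psychology
Pedagogy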
Authors
Songbo Gu, Miaohui Zhang, Qiyang Xiao, Wentao Shi
Identifier
DOI:10.1016/j.knosys.2024.112075
Abstract
In the existing tracking-by-detection paradigm, advanced approaches rely on appearance features to establish associations between current detections and trajectories. However, these methods are often plagued by issues such as sluggish tracking performance and suboptimal results, particularly when confronted with the unreliability of the appearance features. Considering these challenges, we propose a novel cascaded matching algorithm called the detection box area-based tracking algorithm (DBAT), which groups the detection boxes by area size and associates detections within each group in a cascaded manner. To enhance the accuracy of grouping, we introduce two crucial components to enhance the quality of detections: the compressed self-decoding module (CSDM) and the task collaboration module (TCM). To acquire more precise location information and augment feature richness, CSDM decomposes the input features into two one-dimensional feature encodings and one two-dimensional feature encoding. Subsequently, these feature encodings perform feature aggregation along both spatial directions to capture long-range dependencies and refine the accuracy of location information. Ultimately, these aggregated features engage with the original features, facilitating information fusion and elevating the overall feature representation. To alleviate potential conflicts between various tasks and bolster task-specific representations, TCM combines disparate receptive fields and decouples features through self-relationship and cross-relationship mappings, thereby concurrently enhancing learning across different tasks. Extensive experiments demonstrate that our proposed method achieves performance comparable to state-of-the-art methods on the MOT17, MOT20 and DanceTrack benchmark tests.
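The abstract describes DBAT only at a high level: detection boxes are grouped by area, and each group is associated with the trajectories in a cascade. The following is a minimal sketch of that general idea, not the authors' implementation. The area thresholds (`edges`), the largest-first stage order, the IoU cost, the greedy matcher, and all function names are assumptions made for illustration; the abstract does not specify any of them.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def area(b: Box) -> float:
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou(a: Box, b: Box) -> float:
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def group_by_area(dets: List[Box], edges: List[float]) -> List[List[int]]:
    """Bucket detection indices by box area; group 0 holds the largest boxes.

    `edges` are hypothetical area thresholds, not values from the paper.
    """
    groups: List[List[int]] = [[] for _ in range(len(edges) + 1)]
    for i, d in enumerate(dets):
        # The more thresholds the area falls below, the later the group.
        groups[sum(area(d) < e for e in edges)].append(i)
    return groups


def cascaded_match(
    tracks: Dict[int, Box],
    dets: List[Box],
    edges: List[float],
    iou_min: float = 0.3,  # assumed gating threshold
):
    """Associate detections with tracks one area group at a time.

    Each stage uses greedy IoU matching (an assumption); tracks matched
    in an earlier stage are not offered to later stages.
    """
    unmatched_tracks = set(tracks)
    matches: Dict[int, int] = {}  # track id -> detection index
    unmatched_dets: List[int] = []
    for group in group_by_area(dets, edges):
        # All candidate pairs for this stage, best IoU first.
        pairs = sorted(
            ((iou(tracks[t], dets[d]), t, d)
             for t in unmatched_tracks for d in group),
            reverse=True,
        )
        used = set()
        for score, t, d in pairs:
            if score < iou_min:
                break  # remaining pairs overlap even less
            if t in unmatched_tracks and d not in used:
                matches[t] = d
                unmatched_tracks.discard(t)
                used.add(d)
        unmatched_dets.extend(d for d in group if d not in used)
    return matches, sorted(unmatched_tracks), unmatched_dets


if __name__ == "__main__":
    tracks = {1: (0, 0, 100, 100), 2: (200, 200, 230, 230)}
    dets = [(205, 205, 235, 235), (5, 5, 105, 105), (400, 400, 410, 410)]
    print(cascaded_match(tracks, dets, edges=[2500.0]))
```

In this toy input the large detection is matched to track 1 in the first stage, the small detection near track 2 is matched in the second stage, and the remaining small box is left unmatched as a candidate new trajectory. A per-stage optimal assignment (for example, the Hungarian algorithm) could replace the greedy step without changing the cascade structure.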