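Computer science
Artificial intelligence
Discriminative
Transformer
Computer vision
Object (grammar)
Identification (biology)
Cognitive neuroscience of visual object recognition
Pattern recognition (psychology)
Engineering
Plant
Voltage
Electrical engineering
Biology
Authors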
N. Phan, Ta Duc Huy, Soan T. M. Duong, Nguyen Tran Hoang, Sam Tran, Dao Huu Hung, Chanh D. Tr. Nguyen, Trung Bui, Steven Q. H. Truong
Identifier
DOI:10.1109/icassp49357.2023.10096126
Abstract
Object re-identification (ReID) is prone to errors under variations in scale and illumination, complex backgrounds, and object occlusion. To overcome these challenges, attention mechanisms are employed to focus on the object's characteristics, thereby extracting more discriminative features. This paper introduces a local-global vision transformer (LoGoViT) for object re-identification that learns a hierarchical representation ranging from fine-grained (local) to general (global) context features. It comprises two components: (i) shift and shuffle operations that generate robust local features and (ii) a local-global module that aggregates the multi-level hierarchical features of an object. Extensive experiments show that our method achieves state-of-the-art results on the ReID benchmarks. We further investigate effective augmentation operations and discuss how the patch modifications improve the proposed model's generalization under occlusion. The source code is available at https://github.com/nguyenphan99/LoGoViT.
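The abstract does not define the shift and shuffle operations, so the following is only an illustrative sketch of one plausible reading, not the authors' implementation. It assumes the image has already been turned into a sequence of patch embeddings, that "shift" is a cyclic rotation of that sequence, and that "shuffle" interleaves equal-sized groups of patches so that each resulting local part mixes patches from different image regions. The function names `shift_and_shuffle` and `pool_groups` and the parameters `shift` and `groups` are hypothetical.

```python
def shift_and_shuffle(tokens, shift=2, groups=2):
    """Rearrange a sequence of patch embeddings (assumed interpretation).

    tokens: list of patch embeddings, each a list of floats.
    shift:  number of leading patches moved to the end (cyclic shift).
    groups: number of equal-sized chunks that are interleaved.
    """
    n = len(tokens)
    if n == 0 or n % groups != 0:
        raise ValueError("patch count must be a positive multiple of groups")
    shift %= n
    # Shift: move the first `shift` patches to the end of the sequence.
    shifted = tokens[shift:] + tokens[:shift]
    # Shuffle: cut into `groups` contiguous chunks, then interleave them.
    size = n // groups
    chunks = [shifted[g * size:(g + 1) * size] for g in range(groups)]
    return [chunks[g][i] for i in range(size) for g in range(groups)]


def pool_groups(tokens, groups=2):
    """Average each contiguous group of patches into one local feature."""
    size = len(tokens) // groups
    pooled = []
    for g in range(groups):
        chunk = tokens[g * size:(g + 1) * size]
        pooled.append([sum(col) / size for col in zip(*chunk)])
    return pooled


# Toy example: 8 one-dimensional "patches" with values 0..7.
patches = [[float(i)] for i in range(8)]
mixed = shift_and_shuffle(patches, shift=2, groups=2)
local_feats = pool_groups(mixed, groups=2)
print([p[0] for p in mixed])
print(local_feats)
```

In this toy run, each pooled local feature draws on patches from both halves of the original sequence, which is the intuition behind mixing patches before forming local parts. The repository linked in the abstract is the authoritative source for the actual operations.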