判别式
计算机科学
人工智能
嵌入
模式识别(心理学)
量化(信号处理)
机器学习
数学
计算机视觉
作者
Lei Ma,Xin Luo,Hanyu Hong,Fanman Meng,Qingbo Wu
标识
DOI:10.1109/tmm.2024.3407661
摘要
Image retrieval with fine-grained categories is an extremely challenging task due to the high intraclass variance and low interclass variance. Most previous works have focused on localizing discriminative image regions in isolation, but have rarely exploited correlations across the different discriminative regions to alleviate intraclass differences. In addition, the intraclass compactness of embedding features is ensured by extra regularization terms that only exist during the training phase, which appear to generalize less well in the inference phase. Finally, the information granularity of the distance measure should distinguish subtle visual differences and the correlation between the embedding features and the quantized features should be maximized sufficiently. To address the above issues, we propose a logit variated product quantization method based on part interaction and metric learning with knowledge distillation for fine-grained image retrieval. Specifically, we introduce a causal context module into the deep navigator to generate discriminative regions and utilize a channelwise cross-part fusion transformer to model the part correlations while alleviating intraclass differences. Subsequently, we design a logit variation module based on a weighted sum scheme to further reduce the intraclass variance of the embedding features directly and enhance the learning power of the quantization model. Finally, we propose a novel product quantization loss based on metric learning and knowledge distillation to enhance the correlation between the embedding features and the quantized features and allow the quantization features to learn more knowledge from the embedding features. The experimental results on several fine-grained datasets demonstrate that the proposed method is superior to state-of-the-art fine-grained image retrieval methods.
科研通智能强力驱动
Strongly Powered by AbleSci AI