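Computer science
Benchmark (surveying)
Artificial intelligence
Pattern recognition (psychology)
Context (archaeology)
Image (mathematics)
Class (philosophy)
Projectile
Component (thermodynamics)
Ground truth
Domain (mathematical analysis)
Modulation (music)
Mathematics
Geography
Mathematical analysis
Chemistry
Physics
Archaeology
Organic chemistry
Thermodynamics
Philosophy
Geodesy
Aesthetics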
Authors
Hegui Zhu,Zhan Gao,Jiayi Wang,Yange Zhou,Chengqing Li
Source
Journal: Cornell University - arXiv
Date: 2022-01-01
Citations: 2
Identifier
DOI: 10.48550/arxiv.2207.08547
Abstract
Traditional fine-grained image classification typically relies on large-scale training samples with annotated ground truth. However, some sub-categories have few available samples in real-world applications, and current few-shot models still have difficulty distinguishing subtle differences among fine-grained categories. To address this challenge, we propose a novel few-shot fine-grained image classification network (FicNet) using multi-frequency neighborhood (MFN) and double-cross modulation (DCM). MFN focuses on both the spatial domain and the frequency domain to capture multi-frequency structural representations, which reduces the influence of appearance and background changes on the intra-class distance. DCM consists of a bi-crisscross component and a double 3D cross-attention component. It modulates the representations by considering global context information and inter-class relationships, respectively, which enables the support and query samples to respond to the same parts and to accurately identify subtle inter-class differences. Comprehensive experiments on three fine-grained benchmark datasets for two few-shot tasks verify that FicNet performs excellently compared with state-of-the-art methods. In particular, on two datasets, "Caltech-UCSD Birds" and "Stanford Cars", it obtains classification accuracies of 93.17% and 95.36%, respectively. These are even higher than what general fine-grained image classification methods can achieve.
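
The abstract does not specify how the cross-attention in DCM is computed, so the following is only a generic illustration of the underlying idea: letting each local descriptor of a query image attend over the local descriptors of a support image, so that matching parts receive the highest weight. It is a minimal scaled dot-product cross-attention sketch in plain Python, not the paper's double 3D cross-attention component; the function name `cross_attend` and the toy two-dimensional descriptors are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attend(query_parts, support_parts):
    """Hypothetical helper: generic scaled dot-product cross-attention.

    For every query descriptor, compute attention weights over all support
    descriptors and return the weighted mix of support descriptors together
    with the weights themselves.
    """
    dim = len(support_parts[0])
    scale = math.sqrt(dim)
    attended, all_weights = [], []
    for q in query_parts:
        weights = softmax([dot(q, s) / scale for s in support_parts])
        mixed = [
            sum(w * s[d] for w, s in zip(weights, support_parts))
            for d in range(dim)
        ]
        attended.append(mixed)
        all_weights.append(weights)
    return attended, all_weights


# Toy descriptors (assumed values): two query parts, three support parts.
query = [[1.0, 0.0], [0.0, 1.0]]
support = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
attended, weights = cross_attend(query, support)
```

In this toy setup, each query part puts its largest weight on the support part that points in the most similar direction, which is the sense in which attention lets support and query samples "respond to the same parts".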