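Computer science
Classification
Artificial intelligence
Feature extraction
Feature selection
Feature (linguistics)
Pattern recognition (psychology)
Perspective (graphical)
Spatial frequency
Machine learning
Linguistics
Optics
Physics
Philosophy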
Authors
Min Wang,Peng Zhao,Xin Lu,Fan Min,Xizhao Wang
Source
Journal: IEEE Transactions on Circuits and Systems for Video Technology
[Institute of Electrical and Electronics Engineers]
Date: 2022-12-08
Volume (issue): 33 (6): 2798-2812
Citations: 10
Identifier
DOI:10.1109/tcsvt.2022.3227737
Abstract
Fine-grained visual categorization is a challenging issue owing to high intra-class and low inter-class variances. Classical approaches rely on pre-trained models or many fine annotations. In this paper, we observe that spatial and frequency information provides distinct image views, and propose a new spatial–frequency feature fusion (SFFF) perspective to handle this challenging issue. Specifically, we design a heterogeneous feature extraction loss function, construct a global and local fusion SFFF network, and propose an importance-sparsity selection strategy. For feature extraction, we focus on the frequency domain feature learning network, extract fine-grained features, and achieve feature complementarity. For feature selection, we propose importance ranking and sparse regularity to constrain spatial–frequency features. For feature fusion, we design a spatial–frequency loss and an inter-layer switching strategy to achieve local-global collaboration. Comparative experiments were performed on popular fine-grained datasets and classic datasets such as CUB200-2011, Stanford Cars, Stanford Dogs, FGVC-Aircraft, and CIFAR100. The effectiveness and outstanding performance of SFFF are confirmed by comparisons with more than 40 state-of-the-art fine-grained categorization methods. Ablation studies and visualizations are provided to facilitate an understanding of our approach.
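The abstract's starting observation, that spatial and frequency information give distinct views of the same image, can be illustrated with a small sketch. This is not the SFFF network, its loss functions, or its selection strategy, none of which are specified here. It only assumes a 2-D grayscale array, takes the log-magnitude of a 2-D discrete Fourier transform as the frequency view, and fuses the two views by plain concatenation. The function names `frequency_view` and `fuse_views` are hypothetical.

```python
import numpy as np


def frequency_view(image: np.ndarray) -> np.ndarray:
    """Log-magnitude spectrum of a 2-D image, with zero frequency centered.

    Assumption: a plain 2-D FFT stands in for the learned frequency-domain
    features described in the abstract.
    """
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    return np.log1p(np.abs(spectrum))


def fuse_views(image: np.ndarray) -> np.ndarray:
    """Concatenate standardized spatial and frequency views into one vector.

    Assumption: simple concatenation is used for illustration only; the paper
    describes a learned global and local fusion instead.
    """
    def standardize(x: np.ndarray) -> np.ndarray:
        # Zero mean, unit variance; epsilon guards against constant inputs.
        return (x - x.mean()) / (x.std() + 1e-8)

    spatial = standardize(image.astype(np.float64)).ravel()
    frequency = standardize(frequency_view(image)).ravel()
    return np.concatenate([spatial, frequency])


# Toy input: a random 8x8 "image" in place of a real photograph.
rng = np.random.default_rng(0)
image = rng.random((8, 8))
fused = fuse_views(image)
```

For an 8x8 input, `fused` holds 128 values: 64 pixel intensities followed by 64 spectral magnitudes. A classifier trained on such a vector sees both views at once, which is the intuition the abstract builds on.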