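Computer science
Artificial intelligence
Pattern recognition (psychology)
Feature extraction
Fuse (electrical)
Feature (linguistics)
Image (mathematics)
Machine learning
Transformer
Data mining
Physics
Voltage
Philosophy
Engineering
Electrical engineering
Quantum mechanics
Linguistics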
Authors
Mengze Li,Ming Kong,Kun Kuang,Qiang Zhu,Fei Wu
Abstract
Attribute information in fine-grained image recognition often provides more accurate and richer category-related information. How to effectively incorporate such knowledge to guide image classification has been a research hotspot in computer vision in recent years. We believe that exploiting the associations between attributes when fusing attribute information yields a more accurate representation of the image. In this paper, we propose a novel Multi-Task Attribute Fusion Model (MTAF), which makes two major improvements to the traditional multi-task learning framework: 1) Attribute-Aware Feature Discrimination: combines spatial attention and channel attention mechanisms to enhance the CNN feature map, so that each attribute can be associated with the important positions and important channels of the image; 2) Transformer-Based Feature Fusion: introduces the Transformer model to better learn the logical associations between attributes, so that the reconstructed features can achieve the best classification performance. We verify our algorithm on two datasets: a self-collected medical dataset for benign versus malignant thyroid identification, and an open dataset widely used for fine-grained image recognition. Experimental results on both datasets demonstrate that the proposed method achieves higher classification accuracy than the baselines.
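The two components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the gate forms, the order in which the two attentions are applied, or the Transformer configuration, so every function name, weight matrix, and shape below is a hypothetical choice made for illustration. The fusion step is reduced to a single self-attention head with no positional encoding, layer normalization, or feed-forward sublayer.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(fmap, w):
    """Reweight channels of a (C, H, W) feature map.
    w is a hypothetical (C, C) learned matrix (assumption)."""
    pooled = fmap.mean(axis=(1, 2))           # global average pool -> (C,)
    gate = sigmoid(w @ pooled)                # one gate per channel, in (0, 1)
    return fmap * gate[:, None, None]

def spatial_attention(fmap, v):
    """Reweight spatial positions of a (C, H, W) feature map.
    v is a hypothetical (C,) projection vector (assumption)."""
    scores = np.einsum("c,chw->hw", v, fmap)  # one score per location -> (H, W)
    gate = sigmoid(scores)
    return fmap * gate[None, :, :]

def attribute_feature(fmap, w, v):
    """One attribute-specific feature vector: channel gate, then spatial gate,
    then global average pooling. The ordering is an assumption."""
    enhanced = spatial_attention(channel_attention(fmap, w), v)
    return enhanced.mean(axis=(1, 2))         # (C,)

def self_attention_fusion(tokens, wq, wk, wv):
    """Single-head self-attention over attribute tokens of shape (A, C)."""
    q, k, val = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)  # (A, A)
    return attn @ val, attn

# Toy dimensions: 8 channels, 4x4 feature map, 3 attributes (all hypothetical).
rng = np.random.default_rng(0)
C, H, W, A = 8, 4, 4, 3
fmap = rng.normal(size=(C, H, W))             # stands in for a CNN feature map

# Each attribute branch has its own (random, untrained) attention parameters.
tokens = np.stack([
    attribute_feature(fmap, rng.normal(size=(C, C)), rng.normal(size=C))
    for _ in range(A)
])

wq, wk, wv = (rng.normal(size=(C, C)) for _ in range(3))
fused, attn = self_attention_fusion(tokens, wq, wk, wv)
print(fused.shape, attn.shape)
```

In this sketch each row of `attn` indicates how strongly one attribute token draws on the others, which is one plausible reading of "learning the association between attributes". A classifier head over `fused` and the multi-task attribute losses are omitted.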