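Computer science
Artificial intelligence
Pooling
Pattern recognition (psychology)
Class (philosophy)
Semantics (computer science)
Contextual image classification
Image (mathematics)
Task (project management)
Computation
Dependency (UML)
Object (grammar)
Algorithm
Economics
Management
Programming language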
Authors
Bin-Bin Gao,Hong-Yu Zhou
Source
Journal: IEEE Transactions on Image Processing
[Institute of Electrical and Electronics Engineers]
Date: 2021-01-01
Volume: 30, pp. 5920-5932
Citations: 11
Identifier
DOI:10.1109/tip.2021.3088605
Abstract
Multi-label image recognition is a practical and challenging task compared to single-label image classification. However, previous works may be suboptimal because of a great number of object proposals or complex attentional region generation modules. In this paper, we propose a simple but efficient two-stream framework to recognize multi-category objects from global image to local regions, similar to how human beings perceive objects. To bridge the gap between global and local streams, we propose a multi-class attentional region module which aims to make the number of attentional regions as small as possible and keep the diversity of these regions as high as possible. Our method can efficiently and effectively recognize multi-class objects with an affordable computation cost and a parameter-free region localization module. Over three benchmarks on multi-label image classification, our method achieves new state-of-the-art results with a single model only using image semantics without label dependency. In addition, the effectiveness of the proposed method is extensively demonstrated under different factors such as global pooling strategy, input size and network architecture. Code has been made available at https://github.com/gaobb/MCAR.
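The global-to-local idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation (see the repository linked above for that). It assumes the global stream yields one score and one activation map per class, that regions are chosen for the top-scoring classes only, that each region is the tightest box around activations above a threshold, and that the two streams are fused by a class-wise maximum. The names `top_classes`, `localize`, `attentional_regions`, `fuse`, `n` and `tau` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Box as (row_min, col_min, row_max, col_max), all inclusive.
Box = Tuple[int, int, int, int]


def top_classes(scores: List[float], n: int) -> List[int]:
    """Indices of the n highest global scores (assumed way to keep regions few)."""
    order = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return order[:n]


def localize(act_map: List[List[float]], tau: float) -> Box:
    """Tightest box around cells whose min-max-normalized activation is >= tau.

    Only thresholding is used, so this step has no learned parameters.
    """
    flat = [v for row in act_map for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat map carries no location cue: fall back to the whole map.
        return 0, 0, len(act_map) - 1, len(act_map[0]) - 1
    cells = [
        (r, c)
        for r, row in enumerate(act_map)
        for c, v in enumerate(row)
        if (v - lo) / (hi - lo) >= tau
    ]
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    return min(rows), min(cols), max(rows), max(cols)


def attentional_regions(
    scores: List[float],
    act_maps: List[List[List[float]]],
    n: int = 2,
    tau: float = 0.5,
) -> Dict[int, Box]:
    """One box per selected class; different classes give different regions."""
    return {c: localize(act_maps[c], tau) for c in top_classes(scores, n)}


def fuse(global_scores: List[float], local_scores: List[List[float]]) -> List[float]:
    """Class-wise maximum over the global and all local predictions (assumed)."""
    return [max(col) for col in zip(global_scores, *local_scores)]
```

In a full pipeline, each returned box would be cropped from the input image, resized, and passed through the local stream before fusion. Selecting classes by score bounds the number of regions by `n`, while taking one map per class is what keeps the regions diverse in this sketch.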