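Computer science
Pattern recognition (psychology)
Graph
Artificial intelligence
Multi-label classification
Visualization
Set (abstract data type)
Image (mathematics)
Cognitive neuroscience of visual object recognition
Convolutional neural network
Object (grammar)
Theoretical computer science
Programming language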
Authors
Zhao-Min Chen, Xiu-Shen Wei, Peng Wang, Yanwen Guo
Source
Journal: Cornell University - arXiv
Date: 2019-01-01
Citations: 60
Identifier
DOI: 10.48550/arxiv.1904.03582
Abstract
The task of multi-label image recognition is to predict the set of object labels present in an image. Because objects normally co-occur in an image, it is desirable to model label dependencies to improve recognition performance. To capture and exploit these dependencies, we propose a multi-label classification model based on a Graph Convolutional Network (GCN). The model builds a directed graph over the object labels, where each node (label) is represented by word embeddings of the label, and the GCN is learned to map this label graph into a set of inter-dependent object classifiers. These classifiers are applied to the image descriptors extracted by another sub-net, making the whole network end-to-end trainable. Furthermore, we propose a novel re-weighting scheme to create an effective label correlation matrix that guides information propagation among the nodes in the GCN. Experiments on two multi-label image recognition datasets show that our approach clearly outperforms existing state-of-the-art methods. In addition, visualization analyses reveal that the classifiers learned by our model maintain a meaningful semantic topology.
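The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the two-layer depth, the ReLU activation, the row normalization of the adjacency matrix, and the random toy data are all assumptions made for illustration. The re-weighting scheme for the correlation matrix is not specified in the abstract, so the sketch takes the label graph as a given input.

```python
import numpy as np


def row_normalize(adj):
    """Scale each row of the adjacency matrix to sum to 1.

    Assumed normalization; requires every row to have a nonzero sum
    (e.g. because self-loops are present).
    """
    return adj / adj.sum(axis=1, keepdims=True)


def gcn_layer(adj_norm, node_feats, weight, activate=True):
    """One graph convolution: propagate over the graph, then project."""
    out = adj_norm @ node_feats @ weight
    # ReLU is an assumed activation choice for this sketch.
    return np.maximum(out, 0.0) if activate else out


def label_scores(image_feats, label_embs, adj, w1, w2):
    """Map label word embeddings to classifiers, then score image features.

    image_feats: (N, D) image descriptors from some feature extractor
    label_embs:  (C, E) one word embedding per label
    adj:         (C, C) directed label graph (with self-loops)
    returns:     (N, C) one score per image and label
    """
    adj_norm = row_normalize(adj)
    hidden = gcn_layer(adj_norm, label_embs, w1)
    classifiers = gcn_layer(adj_norm, hidden, w2, activate=False)  # (C, D)
    return image_feats @ classifiers.T


# Toy usage with random data (hypothetical sizes).
rng = np.random.default_rng(0)
num_labels, emb_dim, hidden_dim, feat_dim, num_images = 4, 5, 8, 6, 3

adj = (rng.random((num_labels, num_labels)) > 0.5).astype(float)
np.fill_diagonal(adj, 1.0)  # self-loops keep every row sum positive

scores = label_scores(
    image_feats=rng.standard_normal((num_images, feat_dim)),
    label_embs=rng.standard_normal((num_labels, emb_dim)),
    adj=adj,
    w1=rng.standard_normal((emb_dim, hidden_dim)),
    w2=rng.standard_normal((hidden_dim, feat_dim)),
)
print(scores.shape)  # (3, 4)
```

Because the classifiers are produced by the graph layers rather than learned independently, a gradient from the per-label scores flows back through both the label graph branch and the image branch, which is what allows the end-to-end training the abstract mentions.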