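Artificial intelligence
Convolutional neural network
Computer science
Image (mathematics)
Contextual image classification
Pattern recognition (psychology)
Context (archaeology)
Benchmark (surveying)
Modal verb
Multi-label classification
Machine learning
Paleontology
Biology
Chemistry
Polymer chemistry
Geography
Geodesy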
Authors
Lingyun Song, Jun Liu, Buyue Qian, Mingxuan Sun, Kuan Yang, Meng Sun, Samar Abbas
Source
Journal: IEEE Transactions on Image Processing
[Institute of Electrical and Electronics Engineers]
Date: 2018-12-01
Volume (issue): 27 (12): 6025-6038
Citations: 86
Identifier
DOI: 10.1109/tip.2018.2864920
Abstract
Deep convolutional neural networks (CNNs) have shown superior performance on single-label image classification. However, applying CNNs to multi-label images remains an open problem, for two main reasons. First, each image is usually treated as an inseparable entity and represented as one instance, which mixes the visual information corresponding to different labels. Second, the correlations among labels are often overlooked. To address these limitations, we propose a deep multi-modal CNN for multi-instance multi-label image classification, called MMCNN-MIML. By combining CNNs with multi-instance multi-label (MIML) learning, our model represents each image as a bag of instances for image classification and inherits the merits of both CNNs and MIML. In particular, MMCNN-MIML has three main appealing properties: 1) it automatically generates instance representations for MIML by exploiting the architecture of CNNs; 2) it takes advantage of label correlations by grouping labels in its later layers; and 3) it incorporates the textual context of label groups to generate multi-modal instances, which are effective in discriminating visually similar objects belonging to different groups. Empirical studies on several benchmark multi-label image data sets show that MMCNN-MIML significantly outperforms the state-of-the-art baselines on multi-label image classification tasks.
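The abstract's central idea, representing an image as a bag of instances rather than a single instance, can be illustrated with a minimal sketch. This is not the MMCNN-MIML architecture: it assumes, purely for illustration, that each spatial location of a convolutional feature map serves as one instance, that a linear layer scores each instance per label, and that bag-level scores are obtained by max-pooling over instances. The function names, shapes, and random inputs are all hypothetical, and label grouping and textual context are not modeled.

```python
import numpy as np


def feature_map_to_bag(feature_map):
    """Turn a (C, H, W) feature map into a bag of H*W instances of size C.

    Assumption for illustration: one instance per spatial location.
    """
    channels, height, width = feature_map.shape
    return feature_map.reshape(channels, height * width).T  # (H*W, C)


def bag_label_scores(bag, weights, bias):
    """Score each instance for each label, then max-pool over instances.

    A label is considered present in the image if at least one instance
    supports it, which is the standard multi-instance assumption.
    """
    logits = bag @ weights + bias            # (num_instances, num_labels)
    probs = 1.0 / (1.0 + np.exp(-logits))    # per-instance sigmoid scores
    return probs.max(axis=0)                 # (num_labels,)


# Hypothetical stand-ins for a CNN feature map and a learned scoring layer.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 4, 4))   # C=8 channels, 4x4 grid
weights = rng.standard_normal((8, 5)) * 0.1    # 5 candidate labels
bias = np.zeros(5)

bag = feature_map_to_bag(feature_map)
scores = bag_label_scores(bag, weights, bias)
predicted = scores > 0.5                       # multi-label decision
```

Max-pooling is only one possible aggregation; it is used here because it keeps the link between an individual instance and a predicted label explicit.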