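Convolutional neural network
Artificial intelligence
Image (mathematics)
Computer science
Scale-invariant feature transform
Bag-of-words model in computer vision
Encoding (memory)
Pattern recognition (psychology)
Computer vision
Visualization
Visual Word
Image retrieval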
Authors
Aravindh Mahendran, Andrea Vedaldi
Identifier
DOI:10.1109/cvpr.2015.7299155
Abstract
Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system. Nevertheless, our understanding of them remains limited. In this paper we conduct a direct analysis of the visual information contained in representations by asking the following question: given an encoding of an image, to what extent is it possible to reconstruct the image itself? To answer this question we contribute a general framework to invert representations. We show that this method can invert representations such as HOG more accurately than recent alternatives while being applicable to CNNs too. We then use this technique to study the inverse of recent state-of-the-art CNN image representations for the first time. Among our findings, we show that several layers in CNNs retain photographically accurate information about the image, with different degrees of geometric and photometric invariance.
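
The question posed in the abstract (given an encoding, how much of the image can be recovered?) can be illustrated as an optimization problem: search for an image whose encoding matches the given one. The sketch below is a toy illustration under stated assumptions, not the authors' implementation. The function `represent` is a hypothetical stand-in (tanh of a random linear map) for a real representation such as HOG or a CNN layer, and the L2 penalty, step size, and iteration count are illustrative choices that do not come from the abstract.

```python
import math
import random


def represent(weights, image):
    """Hypothetical toy representation: phi(x) = tanh(W x)."""
    return [math.tanh(sum(w * p for w, p in zip(row, image))) for row in weights]


def objective(weights, image, target, reg):
    """Squared feature error plus an L2 penalty on the image (assumed regularizer)."""
    feats = represent(weights, image)
    data_term = sum((f - t) ** 2 for f, t in zip(feats, target))
    return data_term + reg * sum(p * p for p in image)


def invert(weights, target, n_pixels, reg=1e-4, step=0.05, iters=3000, seed=0):
    """Plain gradient descent on the objective, starting from a small random image."""
    rng = random.Random(seed)
    image = [rng.uniform(-0.1, 0.1) for _ in range(n_pixels)]
    for _ in range(iters):
        feats = represent(weights, image)
        # Derivative of (tanh(z) - t)^2 with respect to z.
        back = [2.0 * (f - t) * (1.0 - f * f) for f, t in zip(feats, target)]
        grad = [
            sum(b * row[j] for b, row in zip(back, weights)) + 2.0 * reg * image[j]
            for j in range(n_pixels)
        ]
        image = [p - step * g for p, g in zip(image, grad)]
    return image


rng = random.Random(42)
n_pixels, n_feats = 16, 8  # fewer features than pixels: the code is lossy
weights = [[rng.gauss(0.0, 0.3) for _ in range(n_pixels)] for _ in range(n_feats)]
original = [rng.uniform(-1.0, 1.0) for _ in range(n_pixels)]
target = represent(weights, original)  # the encoding to be inverted

recon = invert(weights, target, n_pixels)
loss_start = objective(weights, [0.0] * n_pixels, target, 1e-4)
loss_final = objective(weights, recon, target, 1e-4)
print(f"objective at zero image: {loss_start:.4f}, after inversion: {loss_final:.6f}")
```

Because this toy code has fewer features than pixels, many images share the same encoding, so `recon` generally differs from `original` even when their encodings agree closely. That gap is a small-scale analogue of the invariances the abstract refers to.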