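Pascal (unit)
Computer science
Convolution (computer science)
Artificial intelligence
Convolutional neural network
Pooling
Object detection
Sliding window protocol
Deep learning
Pattern recognition (psychology)
Contextual image classification
Detector
Scaling
Cognitive neuroscience of visual object recognition
Algorithm
Window (computing)
Image (mathematics)
Feature extraction
Artificial neural network
Mathematics
Telecommunications
Operating system
Programming language
Geometry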
Authors
George Papandreou, Iasonas Kokkinos, Pierre-André Savalle
Identifier
DOI:10.1109/cvpr.2015.7298636
Abstract
Deep Convolutional Neural Networks (DCNNs) achieve invariance to domain transformations (deformations) by using multiple 'max-pooling' (MP) layers. In this work we show that alternative methods of modeling deformations can improve the accuracy and efficiency of DCNNs. First, we introduce epitomic convolution as an alternative to the common convolution-MP cascade of DCNNs, which comes with the same computational cost but favorable learning properties. Second, we introduce a Multiple Instance Learning algorithm to accommodate global translation and scaling in image classification, yielding an efficient algorithm that trains and tests a DCNN in a consistent manner. Third, we develop a DCNN sliding window detector that explicitly, but efficiently, searches over the object's position, scale, and aspect ratio. We provide competitive image classification and localization results on the ImageNet dataset and object detection results on Pascal VOC2007.
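
To make the search space in the third contribution concrete, the following is a minimal sketch of enumerating candidate boxes over position, scale, and aspect ratio. It is a naive enumeration for illustration only, not the authors' detector, which the abstract describes as efficient. The function name `sliding_windows` and the parameters `base`, `scales`, `aspect_ratios`, and `stride_frac` are all hypothetical and do not come from the paper.

```python
from itertools import product


def sliding_windows(img_w, img_h, base=64, scales=(1.0, 2.0),
                    aspect_ratios=(0.5, 1.0, 2.0), stride_frac=0.5):
    """Yield (x, y, w, h) boxes over position, scale, and aspect ratio.

    All defaults are illustrative assumptions, not values from the paper.
    """
    for scale, ar in product(scales, aspect_ratios):
        # Keep the box area near (base * scale)^2 while varying w / h = ar.
        w = int(round(base * scale * ar ** 0.5))
        h = int(round(base * scale / ar ** 0.5))
        if w > img_w or h > img_h:
            continue  # this box shape does not fit inside the image
        # Step by a fraction of the box size so larger boxes move faster.
        step_x = max(1, int(w * stride_frac))
        step_y = max(1, int(h * stride_frac))
        for y in range(0, img_h - h + 1, step_y):
            for x in range(0, img_w - w + 1, step_x):
                yield (x, y, w, h)


boxes = list(sliding_windows(128, 128))
```

In a real detector each box would be scored by the network; scoring every crop independently as enumerated here would be costly, which is presumably why the abstract stresses efficiency.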