Computer science
Segmentation
Pascal (unit)
Artificial intelligence
Distillation
Machine learning
Perspective (graphics)
Classifier (UML)
Pattern recognition (psychology)
Natural language processing
Organic chemistry
Chemistry
Programming language
Authors
Zhuotao Tian, Pengguang Chen, Xin Lai, Li Jiang, Shu Liu, Hengshuang Zhao, Bei Yu, Ming-Chang Yang, Jiaya Jia
Identifier
DOI: 10.1109/TPAMI.2022.3159581
Abstract
Strong semantic segmentation models require large backbones to achieve promising performance, making them hard to deploy in real applications that demand efficient real-time algorithms. Knowledge distillation tackles this issue by letting a smaller model (the student) produce pixel-wise predictions similar to those of a larger model (the teacher). However, the classifier, which can be viewed as the perspective through which a model perceives the encoded features to yield observations (i.e., predictions), is shared by all training samples and fits a universal feature distribution. Because a model of fixed capacity that generalizes well to the entire distribution may fit individual samples less precisely, this shared universal perspective often overlooks details present in each sample, degrading knowledge distillation. In this paper, we propose Adaptive Perspective Distillation (APD), which creates an adaptive local perspective for each individual training sample. It extracts detailed contextual information from each training sample specifically, mining more detail from the teacher and thus achieving better distillation results on the student. APD places no structural constraints on either the teacher or the student model and therefore generalizes well across semantic segmentation models. Extensive experiments on Cityscapes, ADE20K, and PASCAL-Context demonstrate the effectiveness of the proposed APD. Moreover, APD yields favorable performance gains in both object detection and instance segmentation without bells and whistles.
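The abstract describes the pixel-wise distillation setup that APD builds on but gives no implementation details. The sketch below is a minimal illustration of that standard baseline in PyTorch (the framework choice, function name, and temperature value are assumptions, not from the paper): the student is trained to match the teacher's softened per-pixel class distributions via KL divergence. APD's per-sample adaptive perspective is not implemented here.

```python
import torch.nn.functional as F

def pixelwise_distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Standard pixel-wise KD baseline (illustrative; not the full APD method).

    Both inputs have shape (N, C, H, W): per-pixel class logits from the
    student and from a frozen teacher. The student learns to match the
    teacher's temperature-softened class distribution at every pixel.
    """
    n, c, h, w = student_logits.shape
    # Treat every pixel as one distillation sample: (N*H*W, C).
    s = student_logits.permute(0, 2, 3, 1).reshape(-1, c)
    t = teacher_logits.detach().permute(0, 2, 3, 1).reshape(-1, c)
    log_p_student = F.log_softmax(s / temperature, dim=1)
    p_teacher = F.softmax(t / temperature, dim=1)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2
```

In training, this term would typically be added to the ordinary cross-entropy loss on ground-truth labels with a weighting coefficient; the paper's contribution is to replace the shared classifier's universal perspective with an adaptive per-sample one, which goes beyond this baseline.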