Quantifying the Knowledge in a DNN to Explain Knowledge Distillation for Classification

Authors
Quanshi Zhang, Xu Cheng, Yilan Chen, Zhefan Rao
Source
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [Institute of Electrical and Electronics Engineers]
Volume/Issue: pp. 1-17; Cited by: 17
Identifier
DOI: 10.1109/tpami.2022.3200344
Abstract

Compared to traditional learning from scratch, knowledge distillation sometimes makes the DNN achieve superior performance. In this paper, we provide a new perspective to explain the success of knowledge distillation based on information theory, i.e., quantifying knowledge points encoded in intermediate layers of a DNN for classification. To this end, we consider the signal processing in a DNN as a layer-wise process of discarding information. A knowledge point is defined as an input unit whose information is discarded much less than that of other input units. On this basis, we propose three hypotheses for knowledge distillation grounded in the quantification of knowledge points. 1. The DNN learning from knowledge distillation encodes more knowledge points than the DNN learning from scratch. 2. Knowledge distillation makes the DNN more likely to learn different knowledge points simultaneously. In comparison, the DNN learning from scratch tends to encode various knowledge points sequentially. 3. The DNN learning from knowledge distillation is often more stably optimized than the DNN learning from scratch. To verify the above hypotheses, we design three types of metrics with annotations of foreground objects to analyze feature representations of the DNN, i.e., the quantity and the quality of knowledge points, the learning speed of different knowledge points, and the stability of optimization directions. In experiments, we diagnosed various DNNs on different classification tasks, including image classification, 3D point cloud classification, binary sentiment classification, and question answering, and the results verified the above hypotheses.
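The abstract's core idea can be illustrated with a toy sketch: treat the per-unit information discarding as the amount of noise an input unit can tolerate before the network's intermediate feature changes noticeably, and mark units with unusually low discarding as "knowledge points". This is a minimal illustration of the concept, not the paper's actual metric; the function names, the noise-tolerance search, and the thresholding rule below are all simplifying assumptions.

```python
import numpy as np

def estimate_discarding(f, x, eps=0.1, sigmas=np.linspace(0.01, 1.0, 20),
                        n_samples=20, rng=None):
    """For each input unit i, find the largest per-unit Gaussian noise level
    sigma whose perturbation keeps the feature f(x) within eps on average.
    A unit that tolerates more noise has more of its information discarded;
    we score discarding as log(tolerated sigma), which grows with the
    entropy of the tolerated perturbation."""
    rng = np.random.default_rng(0) if rng is None else rng
    base = f(x)
    disc = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        tol = sigmas[0]
        for s in sigmas:
            dists = []
            for _ in range(n_samples):
                x_pert = x.copy()
                x_pert.flat[i] += rng.normal(0.0, s)   # perturb one unit only
                dists.append(np.linalg.norm(f(x_pert) - base))
            if np.mean(dists) <= eps:
                tol = s        # feature barely moved: this noise is tolerated
            else:
                break          # feature changed too much: stop growing sigma
        disc.flat[i] = np.log(tol)
    return disc

def knowledge_points(disc, margin=0.5):
    """Flag units whose discarding is well below the average as knowledge
    points, i.e., units whose information the network preserves."""
    return disc < disc.mean() - margin
```

For example, with a toy "network" `f(x) = 10*x[0] + 0.01*x[1] + 0.01*x[2]`, unit 0 tolerates far less noise than units 1 and 2, so it receives a much lower discarding score and is the only unit flagged as a knowledge point.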