Computer science
Distillation
Fusion
Artificial intelligence
Chemistry
Chromatography
Linguistics
Philosophy
Authors
Zuxiang Long,Fuyan Ma,Bin Sun,Mingkui Tan,Shutao Li
Identifier
DOI:10.1016/j.inffus.2022.09.007
Abstract
Knowledge distillation improves the performance of a compact student network by adding supervision from a pre-trained, cumbersome teacher network during training. To avoid the resource cost of acquiring an extra teacher network, self-knowledge distillation designs a multi-branch network architecture with shared layers for the teacher and student models, which are trained collaboratively in a one-stage manner. However, this method ignores the knowledge of shallow branches and rarely provides diverse knowledge for effective collaboration among the branches. To address these two shortcomings, this paper proposes a novel Diversified Branch Fusion approach for Self-Knowledge Distillation (DBFSKD). First, we design lightweight networks that are attached to the middle layers of the backbone. They capture discriminative information through global-local attention. Then we introduce a diversity loss between different branches to explore diverse knowledge. Moreover, the diverse knowledge is further integrated into two knowledge sources through Selective Feature Fusion (SFF) and Dynamic Logits Fusion (DLF). Thus, the significant knowledge of shallow branches is efficiently utilized, and all branches learn from each other through the fused knowledge sources. Extensive experiments with various backbone structures on four public datasets (CIFAR100, Tiny-ImageNet200, ImageNet, and RAF-DB) show that the proposed method outperforms other methods. More importantly, DBFSKD achieves even better performance with lower resource consumption than the baseline.

• A diversified branch fusion approach is proposed for self-knowledge distillation.
• Shallow branches provide complementary information for the deep ones.
• Feature-level and logits-level fusion provide a richer knowledge source for distillation.
• A diversity loss encourages the branches to explore diverse knowledge.
• DBFSKD obtains SOTA results in the facial expression recognition application.
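
The abstract does not give formulas, so the following is only a minimal sketch of the general idea of branches learning from a fused logits source, not the authors' DLF or diversity loss. It assumes a generic temperature-scaled KL distillation loss and a softmax-weighted average of branch logits; the function names, the `branch_scores` weights, and the temperature value are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def fuse_logits(branch_logits, branch_scores):
    """Weighted average of per-branch logits.

    branch_scores are hypothetical per-branch confidences (an assumption,
    not taken from the paper); softmax turns them into weights summing to 1.
    """
    weights = softmax(branch_scores)
    n_classes = len(branch_logits[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, branch_logits))
        for c in range(n_classes)
    ]


def distillation_losses(branch_logits, branch_scores, temperature=3.0):
    """Generic KD loss of every branch against the fused logits."""
    fused = fuse_logits(branch_logits, branch_scores)
    teacher = softmax(fused, temperature)
    return [
        temperature ** 2 * kl_divergence(teacher, softmax(logits, temperature))
        for logits in branch_logits
    ]


# Three hypothetical branches (shallow to deep) on a 3-class problem.
branches = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5], [3.0, 0.0, -2.0]]
scores = [0.2, 0.5, 1.0]
losses = distillation_losses(branches, scores)
print(losses)
```

In this sketch every branch, including the shallow ones, contributes to the fused target and is then pulled toward it, which mirrors the abstract's statement that all branches learn from each other through a fused knowledge source. How the paper actually computes the fusion weights is not stated in the abstract.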