Computer science
Transformer
Generalization
Spurious relationship
Artificial intelligence
Security token
Pattern recognition (psychology)
Machine learning
Voltage
Mathematics
Engineering
Electrical engineering
Mathematical analysis
Computer security
Authors
Haoran Yu, Baodi Liu, Yingjie Wang, Kai Zhang, Dapeng Tao, Weifeng Liu
Identifier
DOI: 10.1007/978-981-99-8543-2_27
Abstract
Vision Transformer (ViT) has achieved impressive results in many visual applications where training and testing instances are drawn from independent and identically distributed (i.i.d.) data. However, performance drops drastically when the distribution of testing instances differs from that of the training instances in real open environments. To tackle this challenge, we propose a Stable Vision Transformer (SViT) for out-of-distribution (OOD) generalization. In particular, SViT weights the training samples to eliminate spurious correlations among token features in the Vision Transformer, thereby boosting performance for OOD generalization. According to the structure and feature-extraction characteristics of ViT models, we design two forms of sample-weight learning: SViT(C) and SViT(T). To demonstrate the effectiveness of both forms of SViT for OOD generalization, we conduct extensive experiments on the popular PACS and OfficeHome datasets and compare them with state-of-the-art (SOTA) methods. The experimental results demonstrate the effectiveness of SViT(C) and SViT(T) on various OOD generalization tasks.
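The abstract gives no implementation details, but the core idea it describes (learning sample weights that suppress spurious correlations among ViT token features, then training the model with those weights) can be sketched roughly as below. This is a minimal PyTorch sketch under assumptions, not the authors' SViT: the names `SampleReweighter`, `weighted_covariance_penalty`, and `reweighted_step` are hypothetical, the decorrelation objective (penalizing weighted off-diagonal covariance of class-token features) is only one plausible instantiation, and `vit` is assumed to be any model exposing a timm-style `forward_features`.

```python
# Hypothetical sketch of sample re-weighting against spurious feature
# correlations in a ViT; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def weighted_covariance_penalty(features: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """Sum of squared off-diagonal entries of the weighted covariance of
    features (B, D), with batch weights (B,) that sum to one."""
    w = weights.unsqueeze(1)                         # (B, 1)
    mean = (w * features).sum(dim=0, keepdim=True)   # weighted feature mean, (1, D)
    centered = features - mean
    cov = (w * centered).t() @ centered              # weighted covariance, (D, D)
    off_diag = cov - torch.diag(torch.diagonal(cov))
    return (off_diag ** 2).sum()


class SampleReweighter(nn.Module):
    """Learns one logit per training sample; a softmax over the batch
    turns the selected logits into normalized sample weights."""
    def __init__(self, num_samples: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_samples))

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        return F.softmax(self.logits[idx], dim=0)


def reweighted_step(vit, reweighter, images, labels, idx, opt_model, opt_w, lam=1.0):
    """One alternating update: (1) fit the sample weights to decorrelate
    class-token features, (2) train the ViT with weighted cross-entropy."""
    # Step 1: update sample weights to shrink feature cross-covariance.
    with torch.no_grad():
        feats = vit.forward_features(images)         # assumes timm-style output
        cls = feats[:, 0] if feats.dim() == 3 else feats
    w = reweighter(idx)
    loss_w = weighted_covariance_penalty(cls, w)
    opt_w.zero_grad()
    loss_w.backward()
    opt_w.step()

    # Step 2: update the ViT with the (detached) learned weights.
    w = reweighter(idx).detach()
    ce = F.cross_entropy(vit(images), labels, reduction="none")
    loss = lam * (w * ce).sum()
    opt_model.zero_grad()
    loss.backward()
    opt_model.step()
    return loss.item()
```

How the weights are parameterized and scheduled (per sample, per batch, or via the classification-token vs. patch-token features that presumably distinguish SViT(C) from SViT(T)) is not specified in the abstract, so the alternating scheme above is only one plausible reading.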