Authors
Zikang Jin, Changchun Yin, Piji Li, Lu Zhou, Liming Fang, Xiangmao Chang, Zhe Liu
Identifier
DOI: 10.1109/icassp49357.2023.10096459
Abstract
Improving the transferability of adversarial examples for the purpose of attacking unknown black-box models has been intensively studied. In particular, feature-level transfer-based attacks, which destroy the intermediate feature outputs of source models, have been shown to generate more transferable adversarial examples. However, existing state-of-the-art feature-level attacks destroy only a single intermediate layer, which severely limits the transferability of the resulting adversarial examples, and they draw only a vague distinction between positive and negative features. By contrast, we propose the Multi-layer Feature Division Attack (MFDA), which aggregates multi-layer feature information on the basis of feature division to mount the attack. Extensive experimental evaluation demonstrates that MFDA significantly boosts adversarial transferability and quantitatively distinguishes the effects of positive and negative features on transferability. Compared with state-of-the-art feature-level attacks, our MFDA-based methods increase the average success rate by 2.8% against normally trained models and 3.0% against adversarially trained models.
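The abstract describes aggregating feature information from several intermediate layers and splitting each layer's features into positive and negative parts. The sketch below illustrates one plausible form of such a multi-layer feature-division objective; it is a minimal illustration, not the paper's actual implementation. The function name `mfda_loss`, the per-layer importance maps, and the `beta` trade-off weight are all assumptions for exposition (in feature-level attacks such maps are typically derived from aggregated gradients of the source model).

```python
import numpy as np

def mfda_loss(features, weights, beta=1.0):
    """Hypothetical multi-layer feature-division loss (illustrative only).

    features: dict mapping layer name -> feature map, shape (C, H, W)
    weights:  dict mapping layer name -> importance map, same shape;
              positive entries mark "positive" (transfer-relevant) features,
              negative entries mark "negative" features.
    beta:     assumed trade-off weight between the two feature groups.
    """
    total = 0.0
    for name, feat in features.items():
        w = weights[name]
        # Split the importance map into positive and negative parts,
        # so the two groups' contributions can be weighted separately.
        pos_term = np.sum(np.maximum(w, 0.0) * feat)
        neg_term = np.sum(np.minimum(w, 0.0) * feat)
        total += pos_term + beta * neg_term
    # An attacker would perturb the input to minimize this aggregated loss,
    # suppressing positive features while amplifying negative ones.
    return total
```

In a real attack the feature maps would come from forward hooks on the source model and the loss would be minimized by gradient descent on the input perturbation; this sketch only shows how per-layer terms could be divided and aggregated.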