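Computer science
Artificial intelligence
Transformer
Generalization
Convolution (computer science)
Machine learning
Pattern recognition (psychology)
Computer vision
Artificial neural network
Mathematics
Quantum mechanics
Physics
Mathematical analysis
Voltage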
Authors
Hui Lin, Huang We, Weiqi Luo, Wei Lü
Identifier
DOI:10.1016/j.dsp.2022.103895
Abstract
Modern image generation techniques make it possible to generate or manipulate image and video content without introducing obvious visual artifacts. If such manipulated images or videos are abused, they could have a serious negative impact on society and individuals. Deepfake detection has therefore attracted considerable attention in recent years. Although existing methods achieve good detection performance on high-quality datasets, their results on low-quality datasets and in cross-dataset evaluation remain far from satisfactory. In this paper, we propose a new CNN-based method for deepfake detection that combines multi-scale convolution with a vision transformer. In the proposed model, we design a multi-scale module with dilated convolution and depthwise separable convolution to capture more facial details and tampering artifacts at different scales. Furthermore, unlike a traditional classification module, we employ a vision transformer to learn the global information of the face features for classification. Extensive experiments demonstrate that, in most cases, the proposed method achieves better detection results than related modern methods on both high-quality and low-quality datasets, and that it generalizes well across datasets. In addition, a number of ablation experiments are provided to verify the rationality of the proposed network.
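To make the two convolution types named in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the 3x3 kernel, the dilation rates, and the 64/128 channel counts are all illustrative assumptions. It shows only two general facts: a dilated kernel covers a wider span with the same number of weights, and a depthwise separable convolution needs far fewer weights than a standard one.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in pixels, per axis) covered by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def standard_conv_params(kernel_size: int, c_in: int, c_out: int) -> int:
    """Weight count of a standard 2D convolution (bias ignored)."""
    return kernel_size * kernel_size * c_in * c_out


def depthwise_separable_params(kernel_size: int, c_in: int, c_out: int) -> int:
    """Weight count of a depthwise conv plus a 1x1 pointwise conv (bias ignored)."""
    depthwise = kernel_size * kernel_size * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out                      # 1x1 conv that mixes channels
    return depthwise + pointwise


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated cross-correlation on plain Python lists."""
    span = effective_kernel_size(len(kernel), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


if __name__ == "__main__":
    # Hypothetical settings chosen for illustration only.
    for d in (1, 2, 3):
        print(f"3x3 kernel, dilation {d}: spans {effective_kernel_size(3, d)} pixels per axis")
    print("standard conv weights:", standard_conv_params(3, 64, 128))
    print("depthwise separable weights:", depthwise_separable_params(3, 64, 128))
```

Running several dilation rates in parallel branches is one common way to build a multi-scale module, since each branch sees a different spatial extent at the same weight cost; whether the paper arranges its branches this way is not stated in the abstract.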