DSViT: An Enhanced Transformer Model for Deepfake Detection
Computer science
Transformer
Reliability engineering
Engineering
Electrical engineering
Voltage
Authors
Phạm Minh Thuấn,Lam Thu Bui,Pham Duy Trung
Source
Journal: Nghiên cứu khoa học và công nghệ trong lĩnh vực an toàn thông tin [Information Security Journal]; Date: 2024-10-01; Pages: 17-28
Identifier
DOI:10.54654/isj.v2i22.1055
Abstract
The rapid development of artificial intelligence and deep learning models has enabled the creation of highly realistic fake images and videos, posing significant threats to information security and safety. Accurate detection of such forged content is crucial to prevent the spread of misinformation and to protect the integrity of digital media. Although several advanced models have been proposed in this field, such as the Vision Transformer (ViT) and the Convolutional Vision Transformer (CViT), limitations remain that need to be addressed. In this paper, we introduce DSViT (Deepfake Detection with SC-based Convolutional Vision Transformer), a novel model that improves on CViT and is designed to optimize deepfake detection. The model integrates convolutions and an SCConvolution block with the ViT architecture. We conducted experiments on the Deepfake Detection Challenge (DFDC) dataset and compared the results with those of the CViT model to demonstrate the effectiveness of the proposed model.