ASL-MDFD: Adversarial Self-Supervised Learning for Generalizable GAN-Resilient Multimodal Deepfake Detection
DOI:
https://doi.org/10.63412/btg5gj03
Keywords:
Deepfake Detection, Generative Adversarial Networks, Self-Supervised Learning, Adversarial Training, Multimodal Fusion, Cross-Dataset Generalization
Abstract
The rise of hyper-realistic synthetic media generated by Generative Adversarial Networks (GANs) and diffusion models poses significant challenges to deepfake detection systems, particularly in cross-dataset and cross-GAN generalization. In this work, we propose ASL-MDFD: a novel framework that unifies adversarial training, Self-Supervised Learning (SSL), and multimodal fusion to detect deepfakes across diverse sources. Our approach leverages rotation prediction, patch-shuffling recovery, and contrastive audio-visual alignment as pretext tasks to learn intrinsic representations without relying heavily on labels. Simultaneously, adversarially perturbed examples generated with Projected Gradient Descent (PGD) simulate artifacts from unseen GANs, improving model robustness. The multimodal architecture integrates visual, audio, and temporal streams using cross-modal attention to detect inconsistencies in facial textures, voice artifacts, and motion dynamics. Evaluated across the FaceForensics++, DFDC, Celeb-DF, StyleGAN3, and StarGANv2 datasets, ASL-MDFD achieves state-of-the-art performance, including 92.3% AUC on Celeb-DF and 88.7% accuracy on StyleGAN3 fakes, significantly outperforming existing baselines. Our results demonstrate the effectiveness of combining SSL, adversarial resilience, and multimodal cues in building robust, generalizable deepfake detectors.
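To make the adversarial-training component of the abstract concrete, the following is a minimal PGD sketch in PyTorch showing how perturbed input frames could be generated to simulate artifacts from unseen GANs. It is an illustrative reconstruction, not the authors' implementation; the function name, the model interface, and the hyperparameters (eps, alpha, steps) are assumptions.

```python
# Hypothetical PGD sketch (PyTorch); illustrates the adversarial-perturbation
# step described in the abstract, not the paper's actual code.
import torch
import torch.nn.functional as F

def pgd_perturb(model, frames, labels, eps=8/255, alpha=2/255, steps=10):
    """Generate L-infinity PGD adversarial examples around `frames`."""
    x_adv = frames.clone().detach()
    # Random start inside the epsilon ball, clipped to valid pixel range.
    x_adv = torch.clamp(x_adv + torch.empty_like(x_adv).uniform_(-eps, eps), 0.0, 1.0)

    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), labels)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the epsilon ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, frames - eps), frames + eps)
        x_adv = torch.clamp(x_adv, 0.0, 1.0)
    return x_adv.detach()
```

Such perturbed frames would then be mixed into training batches so the detector learns features that remain discriminative under distribution shifts resembling unseen GAN artifacts.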
License
Copyright (c) 2025 International Journal of Global Innovations and Solutions

This work is licensed under a Creative Commons Attribution 4.0 International License.