So-Fake: Benchmarking Social Media Image Forgery Detection

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: LLMs, Deepfake detection
Abstract:

Recent advances in AI-powered generative models have enabled the creation of increasingly realistic synthetic images, posing significant risks to information integrity and public trust on social media platforms. While robust detection frameworks and diverse, large-scale datasets are essential to mitigate these risks, existing academic efforts remain limited in scope: current datasets lack the diversity, scale, and realism required for social media contexts, and evaluation protocols rarely account for explanation or out-of-domain generalization. To bridge this gap, we introduce So-Fake, a comprehensive social media-oriented dataset for forgery detection consisting of two key components. First, we present So-Fake-Set, a large-scale dataset with over 2 million photorealistic images synthesized by a wide range of generative models. Second, to rigorously evaluate cross-domain robustness, we establish So-Fake-OOD, a novel and large-scale (100K) out-of-domain benchmark sourced from real social media platforms and featuring synthetic imagery from commercial models explicitly excluded from the training distribution, creating a realistic testbed that mirrors actual deployment scenarios. Leveraging these complementary datasets, we present So-Fake-R1, a baseline framework that applies reinforcement learning to encourage interpretable visual rationales. Experiments show that So-Fake surfaces substantial challenges for existing methods. By integrating a large-scale dataset, a realistic out-of-domain benchmark, and a multi-dimensional evaluation protocol, So-Fake establishes a new foundation for social media forgery detection research.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes a paper's claimed tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. The results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces So-Fake, a large-scale dataset and evaluation framework for social media forgery detection, comprising over two million photorealistic images (So-Fake-Set) and a 100K out-of-domain benchmark (So-Fake-OOD). It resides in the Datasets and Benchmarks leaf, which contains only four papers total, indicating a relatively sparse research direction within the broader taxonomy. This leaf focuses exclusively on novel datasets and evaluation protocols, distinguishing it from the more crowded Detection Methodologies branch where algorithmic innovations dominate.

The taxonomy reveals that Datasets and Benchmarks sits alongside several methodological branches: Deep Learning-Based Detection Frameworks (with sub-branches for transfer learning, custom architectures, and ensemble models), Forgery Type-Specific Detection (covering GAN-generated content, deepfakes, and general forgeries), and Robustness and Adversarial Considerations. While detection methods proliferate across these neighboring leaves, the dataset branch remains comparatively underpopulated, suggesting that benchmark creation receives less attention than algorithmic development. The paper's emphasis on social media realism and out-of-domain generalization connects it to robustness concerns explored in adjacent branches, yet its primary contribution remains resource provision rather than methodological innovation.

Across the three claimed contributions, the literature search examined 24 candidate papers in total. The So-Fake benchmark (10 candidates examined, 2 refutable) and the multi-dimensional evaluation protocol (10 candidates examined, 2 refutable) both show some overlap with prior work within this limited scope. The So-Fake-R1 baseline framework (4 candidates examined, 1 refutable) appears less contested, though the smaller candidate pool limits confidence. The dataset's scale (2 million images) and the OOD benchmark's explicit exclusion of commercial generators from the training distribution are distinguishing features, but the analysis does not exhaustively cover all possible prior datasets or evaluation schemes.

Based on the top-24 semantic matches examined, the work appears to occupy a moderately novel position within the sparse Datasets and Benchmarks leaf, though some contributions show overlap with existing resources. The limited search scope means that additional relevant datasets or evaluation protocols outside the candidate pool may exist. The taxonomy structure suggests that while detection methods are well-explored, comprehensive social media-oriented benchmarks remain less common, potentially increasing the practical value of this contribution despite the identified overlaps.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 24
Refutable Papers: 5

Research Landscape Overview

Core task: social media image forgery detection. The field has evolved into a multi-faceted landscape organized around several major branches. Detection Methodologies and Architectures explores algorithmic innovations, from traditional forensic techniques to deep learning models that learn manipulation artifacts. Forgery Type-Specific Detection targets particular manipulation classes such as deepfakes, GAN-generated content, and copy-move forgeries, with works like GAN Deepfake Detection[9] and GAN-Generated Detection[40] exemplifying specialized approaches. The Datasets and Benchmarks branch provides critical evaluation resources, including SMIFD Database[2] and newer collections like SocialDF Benchmark[29], which enable reproducible comparisons. Parallel branches address broader contexts: Misinformation and Fake News Detection examines multimodal disinformation ecosystems, while Social Impact, Human Factors, and Applications considers real-world deployment challenges and user trust. Surveys, Reviews, and Meta-Analyses synthesizes progress across these areas, and Digital Forensics Frameworks and Cybersecurity situates forgery detection within larger security infrastructures.

Recent activity highlights tensions between generalization and specialization. Many studies pursue robust architectures that handle diverse manipulations and social media degradations, as seen in Robust Ensemble Model[24] and Robust Online Detection[30], while others refine type-specific detectors for emerging threats like deepfakes (Deepfake Forensics Survey[28], SIDA Deepfake[7]). The Datasets and Benchmarks branch remains particularly active, with SoFake Benchmarking[0] contributing new evaluation protocols alongside neighbors like SoFake Explaining[3], which adds interpretability dimensions to benchmark design. Compared to earlier datasets such as SMIFD Database[2], these newer efforts emphasize social media realism and explainability.

Open questions persist around cross-platform generalization, adversarial robustness, and bridging the gap between technical detection capabilities and practical deployment in high-stakes misinformation scenarios.

Claimed Contributions

So-Fake benchmark with So-Fake-Set and So-Fake-OOD datasets

The authors introduce So-Fake, a benchmark for social media forgery detection comprising So-Fake-Set (over 2 million images for training/validation across 12 categories and 30 generative methods) and So-Fake-OOD (100K images from Reddit paired with commercial generators for out-of-distribution evaluation). This benchmark addresses limitations in existing datasets by providing social-media-oriented data with multi-class labels, tampering masks, and explanatory annotations.

10 retrieved papers; verdict: Can Refute
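
To make the stated composition concrete, the sketch below shows what a sample record and an out-of-distribution split selector for a So-Fake-style dataset could look like. All field names, types, and the is_ood helper are hypothetical illustrations inferred from the description above, not the dataset's actual format.

```python
"""Hypothetical schema for a So-Fake-style sample; field names are
illustrative, not the dataset's actual format."""
from dataclasses import dataclass
from typing import Optional, Set

@dataclass
class SoFakeSample:
    image_path: str            # path to the real or synthetic image
    is_fake: bool              # binary forgery label
    category: str              # one of the 12 content categories
    generator: Optional[str]   # one of ~30 generative methods; None if real
    mask_path: Optional[str]   # tampering mask for locally edited images
    explanation: str           # explanatory annotation for the verdict

def is_ood(sample: SoFakeSample, held_out_generators: Set[str]) -> bool:
    # So-Fake-OOD pairs real Reddit images with fakes from commercial
    # generators excluded from training, so the OOD split is selected
    # by generator identity rather than by random sampling.
    return sample.generator in held_out_generators
```

Selecting the OOD split by generator identity rather than by random sampling reflects the stated design of So-Fake-OOD: its synthetic images come from commercial generators deliberately excluded from the training distribution.
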
So-Fake-R1 baseline framework using reinforcement learning

The authors propose So-Fake-R1, a reinforcement learning-based framework that unifies detection, localization, and explanation of social media forgeries. The method uses Group Relative Policy Optimization (GRPO) to jointly optimize across three tasks, producing interpretable predictions without requiring extensive manual annotations for explanations.

4 retrieved papers; verdict: Can Refute
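
As a rough illustration of the training signal, the sketch below shows how GRPO computes group-relative advantages from a composite reward covering all three tasks. The reward weights, interval-based IoU, and token-overlap proxy are simplifying assumptions for illustration; they are not So-Fake-R1's actual reward design.

```python
"""Sketch of GRPO-style group-relative advantages with a composite
reward over detection, localization, and explanation. The reward
weights and helpers are illustrative assumptions, not So-Fake-R1's
actual reward design."""
import statistics

def composite_reward(pred, target, w_det=1.0, w_loc=1.0, w_exp=0.5):
    # Detection reward: 1 if the real/fake label matches, else 0.
    r_det = float(pred["label"] == target["label"])
    # Localization reward: IoU between predicted and ground-truth
    # regions, simplified here to 1D intervals (x0, x1) for brevity.
    (a0, a1), (b0, b1) = pred["box"], target["box"]
    inter = max(0.0, min(a1, b1) - max(a0, b0))
    union = (a1 - a0) + (b1 - b0) - inter
    r_loc = inter / union if union > 0 else 0.0
    # Explanation reward: crude token-overlap proxy for rationale quality.
    p, t = set(pred["rationale"].split()), set(target["rationale"].split())
    r_exp = len(p & t) / len(p | t) if (p | t) else 0.0
    return w_det * r_det + w_loc * r_loc + w_exp * r_exp

def grpo_advantages(rewards):
    # GRPO normalizes each sampled response's reward against its group:
    # A_i = (r_i - mean) / std, so no learned value critic is required.
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # guard against zero std
    return [(r - mu) / sigma for r in rewards]
```

Because advantages are normalized within each group of responses sampled for the same image, the policy is rewarded for outperforming its own alternatives, which suits a setting where label, mask, and rationale rewards can be computed automatically rather than from extensive manual annotations.
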
Multi-dimensional evaluation protocol for social media forgery detection

The authors establish a comprehensive evaluation protocol that assesses models across detection (classification), localization (tampering region identification), and explanation (interpretable rationales) tasks. This multi-dimensional approach addresses the inadequacy of existing protocols that focus primarily on binary classification or mask prediction without transparency.

10 retrieved papers; verdict: Can Refute
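
A hedged sketch of how the three axes might be scored is shown below. The specific metrics (accuracy, set-based IoU, token-level F1) are common stand-ins chosen for illustration; the paper's exact protocol may differ.

```python
"""Sketch of the three-axis protocol: detection, localization,
explanation. Metric choices are illustrative stand-ins, not the
paper's exact metrics."""

def detection_accuracy(pred_labels, true_labels):
    # Detection axis: fraction of correctly classified images.
    correct = sum(p == t for p, t in zip(pred_labels, true_labels))
    return correct / len(true_labels)

def mask_iou(pred_mask, true_mask):
    # Localization axis: masks as sets of tampered pixel coordinates;
    # IoU rewards precise delineation of the manipulated region.
    union = pred_mask | true_mask
    if not union:
        return 1.0  # both masks empty: nothing tampered, perfect match
    return len(pred_mask & true_mask) / len(union)

def explanation_f1(pred_text, ref_text):
    # Explanation axis: token-level F1 against a reference rationale,
    # a cheap proxy; stronger protocols use embedding similarity or
    # LLM-as-judge scoring instead.
    p, r = pred_text.lower().split(), ref_text.lower().split()
    if not p or not r:
        return 0.0
    common = len(set(p) & set(r))
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(r)
    return 2 * precision * recall / (precision + recall)
```

Reporting all three scores side by side, rather than a single binary-classification number, is what distinguishes this protocol from evaluations that stop at real/fake accuracy or mask prediction.
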

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution
