So-Fake: Benchmarking Social Media Image Forgery Detection
Overview
Overall Novelty Assessment
The paper introduces So-Fake, a large-scale dataset and evaluation framework for social media forgery detection, comprising So-Fake-Set, a training and validation corpus of over two million photorealistic images, and So-Fake-OOD, a 100K-image out-of-domain benchmark. It resides in the Datasets and Benchmarks leaf, which contains only four papers in total, indicating a relatively sparse research direction within the broader taxonomy. This leaf focuses exclusively on novel datasets and evaluation protocols, distinguishing it from the more crowded Detection Methodologies branch, where algorithmic innovations dominate.
The taxonomy reveals that Datasets and Benchmarks sits alongside several methodological branches: Deep Learning-Based Detection Frameworks (with sub-branches for transfer learning, custom architectures, and ensemble models), Forgery Type-Specific Detection (covering GAN-generated content, deepfakes, and general forgeries), and Robustness and Adversarial Considerations. While detection methods proliferate across these neighboring leaves, the dataset branch remains comparatively underpopulated, suggesting that benchmark creation receives less attention than algorithmic development. The paper's emphasis on social media realism and out-of-domain generalization connects it to robustness concerns explored in adjacent branches, yet its primary contribution remains resource provision rather than methodological innovation.
Across the three contributions analyzed, the literature search examined 24 candidates in total. The So-Fake benchmark (10 candidates examined, 2 refutable) and the multi-dimensional evaluation protocol (10 candidates examined, 2 refutable) both show some overlap with prior work within this limited scope. The So-Fake-R1 baseline framework (4 candidates examined, 1 refutable) appears less contested, though the smaller candidate pool limits confidence. The dataset's scale (2 million images) and the explicit focus on commercial-model exclusion in the OOD benchmark are distinguishing features, but the analysis does not exhaustively cover all possible prior datasets or evaluation schemes.
Based on the top-24 semantic matches examined, the work appears to occupy a moderately novel position within the sparse Datasets and Benchmarks leaf, though some contributions show overlap with existing resources. The limited search scope means that additional relevant datasets or evaluation protocols outside the candidate pool may exist. The taxonomy structure suggests that while detection methods are well-explored, comprehensive social media-oriented benchmarks remain less common, potentially increasing the practical value of this contribution despite the identified overlaps.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce So-Fake, a benchmark for social media forgery detection comprising So-Fake-Set (over 2 million images for training/validation across 12 categories and 30 generative methods) and So-Fake-OOD (100K images from Reddit paired with commercial generators for out-of-distribution evaluation). This benchmark addresses limitations in existing datasets by providing social-media-oriented data with multi-class labels, tampering masks, and explanatory annotations.
The authors propose So-Fake-R1, a reinforcement learning-based framework that unifies detection, localization, and explanation of social media forgeries. The method uses Group Relative Policy Optimization (GRPO) to jointly optimize across three tasks, producing interpretable predictions without requiring extensive manual annotations for explanations.
The authors establish a comprehensive evaluation protocol that assesses models across detection (classification), localization (tampering region identification), and explanation (interpretable rationales) tasks. This multi-dimensional approach addresses the inadequacy of existing protocols that focus primarily on binary classification or mask prediction without transparency.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[2] SMIFD: Novel Social Media Image Forgery Detection Database PDF
[3] So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection PDF
[29] SocialDF: Benchmark Dataset and Detection Model for Mitigating Harmful Deepfake Content on Social Media Platforms PDF
Contribution Analysis
Detailed comparisons for each claimed contribution
So-Fake benchmark with So-Fake-Set and So-Fake-OOD datasets
The authors introduce So-Fake, a benchmark for social media forgery detection comprising So-Fake-Set (over 2 million images for training/validation across 12 categories and 30 generative methods) and So-Fake-OOD (100K images from Reddit paired with commercial generators for out-of-distribution evaluation). This benchmark addresses limitations in existing datasets by providing social-media-oriented data with multi-class labels, tampering masks, and explanatory annotations.
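To make the annotation structure concrete, the following is a minimal sketch of one plausible record layout for such a sample. All field names and example values are assumptions inferred from the annotation types described above (multi-class label, tampering mask, explanation), not the released schema.

```python
# Hypothetical record layout for one So-Fake-style sample; the released
# schema may differ in naming and structure.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SoFakeSample:
    image_path: str
    label: str                    # multi-class category, e.g. "real", "fully_synthetic", "tampered"
    generator: Optional[str]      # which generative method produced it (one of ~30); None for real images
    mask_path: Optional[str]      # tampering mask; None when there is no local edit to localize
    explanation: Optional[str]    # textual rationale annotation
    split: str                    # "train", "val", or "ood"

# Example instance for a locally tampered image (values illustrative):
sample = SoFakeSample(
    image_path="images/000001.png",
    label="tampered",
    generator="sd_inpainting",
    mask_path="masks/000001.png",
    explanation="The shadow direction on the inserted object is inconsistent.",
    split="train",
)
```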
[3] So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection PDF
[51] DF40: Toward Next-Generation Deepfake Detection PDF
[11] Deep Convolutional Pooling Transformer for Deepfake Detection PDF
[40] Detection of GAN-Generated Fake Images over Social Networks PDF
[52] Integrating Audio-Visual Features for Multimodal Deepfake Detection PDF
[53] ForensicHub: A Unified Benchmark & Codebase for All-Domain Fake Image Detection and Localization PDF
[54] CoReD: Generalizing Fake Media Detection with Continual Representation Using Distillation PDF
[55] FineFake: A Knowledge-Enriched Dataset for Fine-Grained Multi-Domain Fake News Detection PDF
[56] Enhancing General Face Forgery Detection via Vision Transformer with Low-Rank Adaptation PDF
[57] BERTGuard: Two-Tiered Multi-Domain Fake News Detection with Class Imbalance Mitigation PDF
So-Fake-R1 baseline framework using reinforcement learning
The authors propose So-Fake-R1, a reinforcement learning-based framework that unifies detection, localization, and explanation of social media forgeries. The method uses Group Relative Policy Optimization (GRPO) to jointly optimize across three tasks, producing interpretable predictions without requiring extensive manual annotations for explanations.
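As a rough illustration of how GRPO could combine the three reward signals, the sketch below computes group-relative advantages over a batch of responses sampled for the same image. The reward weights, field names, and reward design are assumptions for illustration; only the group-normalization step is GRPO's defining mechanism.

```python
# Illustrative sketch of GRPO-style advantage computation for a joint
# detection / localization / explanation reward. Weights and names are
# assumptions, not the paper's implementation.
from dataclasses import dataclass
from statistics import mean, pstdev

@dataclass
class Rollout:
    pred_label: int      # predicted class for the image
    true_label: int      # ground-truth class
    mask_iou: float      # IoU between predicted and ground-truth tampering masks
    format_ok: bool      # whether the explanation follows the required output format

def reward(r: Rollout, w_det: float = 1.0, w_loc: float = 1.0, w_fmt: float = 0.5) -> float:
    """Mix the three task signals into one scalar reward (weights assumed)."""
    det = 1.0 if r.pred_label == r.true_label else 0.0
    fmt = 1.0 if r.format_ok else 0.0
    return w_det * det + w_loc * r.mask_iou + w_fmt * fmt

def group_relative_advantages(group):
    """GRPO's key step: normalize rewards within a group of rollouts sampled
    for the same input, replacing a learned value baseline."""
    rewards = [reward(r) for r in group]
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(x - mu) / (sigma + 1e-6) for x in rewards]

# Four sampled responses for one image; higher advantage -> reinforced more.
group = [
    Rollout(pred_label=1, true_label=1, mask_iou=0.72, format_ok=True),
    Rollout(pred_label=1, true_label=1, mask_iou=0.40, format_ok=False),
    Rollout(pred_label=0, true_label=1, mask_iou=0.00, format_ok=True),
    Rollout(pred_label=1, true_label=1, mask_iou=0.65, format_ok=True),
]
print(group_relative_advantages(group))
```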
[3] So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection PDF
[58] BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation PDF
[59] Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection PDF
[60] VeriChain: Reinforced Document Image Forgery Verification via Self-revealing Reasoning Chain PDF
Multi-dimensional evaluation protocol for social media forgery detection
The authors establish a comprehensive evaluation protocol that assesses models across detection (classification), localization (tampering region identification), and explanation (interpretable rationales) tasks. This multi-dimensional approach addresses the inadequacy of existing protocols that focus primarily on binary classification or mask prediction without transparency.
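A minimal sketch of what such a three-axis protocol might look like in code is given below. The metric choices (accuracy for detection, mean mask IoU for localization, a pluggable judge for explanations) and all function signatures are assumptions, since the paper's exact metrics are not reproduced here.

```python
# Minimal sketch of a three-axis evaluation loop; not the benchmark's
# actual API.
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """IoU between two boolean tampering masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter / union) if union else 1.0  # both empty: trivially perfect

def evaluate(samples, model, judge_fn):
    """samples: sequence of (image, label, gt_mask_or_None, reference_explanation)."""
    correct, ious, expl_scores = 0, [], []
    for image, label, gt_mask, ref_expl in samples:
        pred_label, pred_mask, explanation = model(image)
        correct += int(pred_label == label)
        if gt_mask is not None:  # localization is scored only on tampered images
            ious.append(mask_iou(pred_mask, gt_mask))
        expl_scores.append(judge_fn(explanation, ref_expl))
    return {
        "detection_acc": correct / len(samples),
        "localization_miou": float(np.mean(ious)) if ious else None,
        "explanation_score": float(np.mean(expl_scores)),
    }
```

Reporting the three scores separately, rather than as a single aggregate, keeps the protocol diagnostic: a model can score well on binary detection while failing to localize or justify its decisions.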