Probabilistic Robustness for Free? Revisiting Training via a Benchmark

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Trustworthy AI; Probabilistic Robustness; Benchmark
Abstract:

Deep learning models are notoriously vulnerable to imperceptible perturbations. Most existing research centers on adversarial robustness (AR), which evaluates models under worst-case scenarios by examining the existence of deterministic adversarial examples (AEs). In contrast, probabilistic robustness (PR) adopts a statistical perspective, measuring the probability that predictions remain correct under stochastic perturbations. While PR is widely regarded as a practical complement to AR, dedicated training methods for improving PR are still relatively underexplored, albeit with emerging progress. Among the few PR-targeted training methods, we identify three limitations: i) non-comparable evaluation protocols; ii) limited comparisons to strong adversarial training (AT) baselines despite anecdotal PR gains from AT; and iii) no unified framework to compare the generalization of these methods. Thus, we introduce PRBench, the first benchmark dedicated to evaluating improvements in PR achieved by different robustness training methods. PRBench empirically compares the most common AT and PR-targeted training methods using a comprehensive set of metrics, including clean accuracy, PR and AR performance, training efficiency, and generalization error (GE). We also provide a theoretical analysis of the GE of PR performance across different training methods. Main findings revealed by PRBench include: AT methods are more versatile than PR-targeted training methods in terms of improving both AR and PR performance across diverse hyperparameter settings, while PR-targeted training methods consistently yield lower GE and higher clean accuracy. A leaderboard comprising 222 trained models across 7 datasets and 10 model architectures is publicly available at https://tmpspace.github.io/PRBenchLeaderboard/.
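The PR metric described in the abstract, the probability that a prediction stays correct under stochastic perturbations, is typically estimated by Monte Carlo sampling. The following is a minimal illustrative sketch under our own assumptions (the function name and the toy classifier are ours, not PRBench's):

```python
import numpy as np

def probabilistic_robustness(predict, x, y, eps=0.1, n_samples=1000, rng=None):
    """Monte Carlo estimate of probabilistic robustness: the fraction of
    perturbations, sampled uniformly from an L-infinity ball of radius eps,
    for which the model still predicts the true label y."""
    rng = np.random.default_rng(rng)
    # Draw perturbations uniformly from the L-infinity ball of radius eps.
    deltas = rng.uniform(-eps, eps, size=(n_samples,) + x.shape)
    preds = np.array([predict(x + d) for d in deltas])
    return float(np.mean(preds == y))

# Toy classifier: predicts class 1 iff the mean of the input exceeds 0.5.
predict = lambda x: int(x.mean() > 0.5)
x = np.full(4, 0.8)  # input well inside class 1
pr = probabilistic_robustness(predict, x, y=1, eps=0.1, rng=0)
```

For this toy input the estimate is 1.0, since no perturbation in the ball can push the mean below the decision threshold; an input closer to the boundary would yield an estimate strictly between 0 and 1, which is exactly the statistical quantity PR captures and AR's binary worst-case check does not.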

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces PRBench, a benchmark for evaluating probabilistic robustness training methods. It resides in the 'Dedicated Probabilistic Robustness Training' leaf, which contains five papers total including this work. This leaf sits within the broader 'Probabilistic Robustness Training Methods' branch, indicating a relatively focused research direction. The taxonomy shows this is a moderately populated area, distinct from the larger 'Adversarial Training Approaches' branch with its multiple subtopics and numerous papers. The benchmark contribution targets a gap in systematic evaluation protocols for methods optimizing probabilistic rather than worst-case robustness metrics.

The taxonomy reveals neighboring work in 'Probabilistic Robustness Verification and Certification' (four papers) and 'Standard Adversarial Training' (five papers), suggesting the field balances empirical training methods with formal verification approaches. The 'Benchmarking and Evaluation Frameworks' leaf contains only two papers, highlighting limited prior work on systematic assessment tools. The paper bridges probabilistic training methods and evaluation frameworks, connecting to adversarial training comparisons while maintaining focus on statistical robustness measures. The taxonomy's scope and exclude notes clarify that this work differs from worst-case adversarial robustness benchmarks by emphasizing stochastic perturbation scenarios.

Thirty candidate papers were examined in total, ten per contribution. The benchmark contribution (Contribution A) was not clearly refuted by any of its ten candidates, suggesting novelty in comprehensive evaluation protocols. The theoretical generalization framework (Contribution B) encountered two refutable candidates among its ten, indicating some overlap with existing generalization-analysis literature. The risk-based training formulation (Contribution C) likewise found no refutations among its ten candidates. These statistics reflect a limited search scope focused on top semantic matches, not exhaustive coverage. Within this constrained examination, the benchmark and formulation contributions appear more distinctive than the theoretical analysis component.

Based on thirty candidates from semantic search, the work appears to occupy a relatively underexplored niche in probabilistic robustness evaluation. The taxonomy structure confirms sparse prior work in benchmarking frameworks specifically for probabilistic metrics. However, the limited search scope means potential overlaps in broader robustness literature or recent preprints may not be captured. The analysis suggests moderate novelty for the benchmark and formulation, with the theoretical component showing more substantial connections to existing generalization theory.

Taxonomy

Core-task taxonomy papers: 50
Claimed contributions: 3
Contribution candidate papers compared: 30
Refutable papers: 2

Research Landscape Overview

Core task: Benchmarking training methods for improving probabilistic robustness in deep learning. The field encompasses a diverse set of approaches organized into several major branches.

Probabilistic Robustness Training Methods focus on techniques that explicitly target probabilistic guarantees and uncertainty quantification, including dedicated methods like Adversarial Probabilistic Training[1] and Intrinsic Probabilistic Robustness[24]. Adversarial Training Approaches emphasize defenses against worst-case perturbations, while Robust Training Under Data Quality Challenges addresses noisy labels and corrupted inputs through methods such as Probabilistic Data Filtering[9] and Feature Purification[10]. Bayesian and Probabilistic Deep Learning explores uncertainty estimation via Bayesian Deep Learning[8] and related frameworks like Uncertainty Baselines[4]. Architectural and Optimization Innovations introduce novel training strategies, including Sinusoidal Robust Training[5] and Lipschitz Bounds Training[30]. Benchmarking and Evaluation Frameworks provide systematic assessments, exemplified by works like Certified Robustness SoK[2] and Robust Deep Learning Competition[16]. Finally, Application-Specific Robustness tailors methods to domains such as medical imaging and graph neural networks.

A particularly active line of work centers on certified probabilistic guarantees, where Tight Probabilistic Verification[3] and Certified Probabilistic Robustness[6] explore formal bounds on model behavior under distributional shifts. These contrast with empirical robustness approaches that prioritize practical performance on benchmarks without strict guarantees.
The Probabilistic Robustness Benchmark[0] sits squarely within the Dedicated Probabilistic Robustness Training cluster, providing a systematic evaluation framework that complements theoretical works like Tight Probabilistic Verification[3] while offering more comprehensive empirical comparisons than the Probabilistic Robustness Guide[42]. Unlike Adversarial Probabilistic Training[1], which blends adversarial and probabilistic objectives, the benchmark emphasizes rigorous assessment across diverse training methods. Open questions remain around the trade-offs between computational cost, tightness of probabilistic bounds, and generalization to real-world distribution shifts, with ongoing efforts to bridge the gap between certified methods and scalable practical solutions.

Claimed Contributions

PRBench: First Benchmark for Probabilistic Robustness Training Methods

The authors develop PRBench, the first systematic benchmark specifically designed to evaluate training methods for improving probabilistic robustness. It includes 222 trained models across 7 datasets and 10 architectures, evaluating methods using comprehensive metrics covering clean accuracy, PR and AR performance, training efficiency, and generalization error.

10 retrieved papers
Theoretical Generalization Error Analysis Framework

The authors provide a unified theoretical framework based on Uniform Stability Analysis to derive generalization error bounds for different training methods. This includes theorems characterizing the Lipschitz and smoothness properties of adversarial training objectives with and without regularization, explaining why risk-based training methods achieve lower generalization error.

10 retrieved papers
Can Refute (2 candidate papers)
General Formulation of Risk-based Training for Probabilistic Robustness

The authors formalize a general mathematical framework (Definition 3) for risk-based training methods that target probabilistic robustness. This formulation unifies existing PR-targeted training approaches by defining them as minimizing statistical risks over distributional perturbations rather than worst-case adversarial examples.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

PRBench: First Benchmark for Probabilistic Robustness Training Methods

The authors develop PRBench, the first systematic benchmark specifically designed to evaluate training methods for improving probabilistic robustness. It includes 222 trained models across 7 datasets and 10 architectures, evaluating methods using comprehensive metrics covering clean accuracy, PR and AR performance, training efficiency, and generalization error.

Contribution

Theoretical Generalization Error Analysis Framework

The authors provide a unified theoretical framework based on Uniform Stability Analysis to derive generalization error bounds for different training methods. This includes theorems characterizing the Lipschitz and smoothness properties of adversarial training objectives with and without regularization, explaining why risk-based training methods achieve lower generalization error.
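As background for this comparison, uniform-stability analyses of this kind typically instantiate the classical bound of Bousquet and Elisseeff; the paper's own theorems (which further characterize Lipschitz and smoothness constants of the training objectives) are not reproduced in this report, so the template below is offered only as orientation. If a learning algorithm $A$ is $\beta$-uniformly stable with respect to a loss bounded by $M$, then with probability at least $1-\delta$ over an i.i.d. training set $S$ of size $n$,

```latex
R(A_S) \;\le\; \widehat{R}_S(A_S) \;+\; 2\beta \;+\; \bigl(4n\beta + M\bigr)\sqrt{\frac{\ln(1/\delta)}{2n}}
```

where $R$ is the population risk and $\widehat{R}_S$ the empirical risk. Under this template, a training objective with a smaller stability constant $\beta$ (e.g., because its loss surface is smoother) enjoys a tighter generalization-error bound, which is the mechanism the paper's analysis uses to explain why risk-based training methods achieve lower GE.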

Contribution

General Formulation of Risk-based Training for Probabilistic Robustness

The authors formalize a general mathematical framework (Definition 3) for risk-based training methods that target probabilistic robustness. This formulation unifies existing PR-targeted training approaches by defining them as minimizing statistical risks over distributional perturbations rather than worst-case adversarial examples.
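The formulation described above replaces adversarial training's worst-case inner maximization with a statistical risk over a perturbation distribution. A toy sketch for a linear classifier, under our own assumptions (the function and the logistic-loss choice are illustrative stand-ins, not the paper's Definition 3):

```python
import numpy as np

def risk_based_loss_grad(w, x, y, eps=0.1, k=20, rng=None):
    """Illustrative risk-based objective for a linear classifier: the
    empirical mean of the logistic loss over k perturbations of x drawn
    uniformly from an L-infinity ball (a Monte Carlo stand-in for the
    statistical risk over the perturbation distribution), together with
    its gradient in w."""
    rng = np.random.default_rng(rng)
    deltas = rng.uniform(-eps, eps, size=(k, x.size))
    losses, grads = [], []
    for d in deltas:
        z = y * (w @ (x + d))                      # margin on perturbed input
        losses.append(np.log1p(np.exp(-z)))        # logistic loss
        grads.append(-y * (x + d) / (1.0 + np.exp(z)))
    return np.mean(losses), np.mean(grads, axis=0)

# One SGD step against the perturbation-averaged risk.
w = np.zeros(3)
x, y = np.array([1.0, -0.5, 0.2]), 1
loss, g = risk_based_loss_grad(w, x, y, rng=0)
w -= 0.5 * g
```

The key design difference from AT is visible in the inner loop: the loss is averaged over sampled perturbations rather than maximized over them, so gradients reflect the expected (probabilistic) behavior in the ball, which is what connects this formulation to the PR metric the benchmark evaluates.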