SAFETY-GUIDED FLOW (SGF): A UNIFIED FRAMEWORK FOR NEGATIVE GUIDANCE IN SAFE GENERATION

ICLR 2026 Conference Submission, Anonymous Authors

OpenReview Score: 6.5

Safe generation, flow matching, control barrier functions

Abstract:

Safety mechanisms for diffusion and flow models have recently been developed along two distinct paths. In robot planning, control barrier functions are employed to guide generative trajectories away from obstacles at every denoising step by explicitly imposing geometric constraints. In parallel, recent data-driven, negative guidance approaches have been shown to suppress harmful content and promote diversity in generated samples. However, they rely on heuristics without clearly stating when safety guidance is actually necessary. In this paper, we first introduce a unified probabilistic framework using a Maximum Mean Discrepancy (MMD) potential for image generation tasks that recasts both Shielded Diffusion and Safe Denoiser as instances of our energy-based negative guidance against unsafe data samples. Furthermore, we leverage control-barrier functions analysis to justify the existence of a critical time window in which negative guidance must be strong; outside of this window, the guidance should decay to zero to ensure safe and high-quality generation. We evaluate our unified framework on several realistic safe generation scenarios, confirming that negative guidance should be applied in the early stages of the denoising process for successful safe generation.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a unified probabilistic framework using Maximum Mean Discrepancy (MMD) potentials to formalize negative guidance in diffusion and flow models, explicitly connecting prior heuristic methods like Shielded Diffusion and Safe Denoiser under a single energy-based lens. It resides in the 'Unified Frameworks and Energy-Based Formulations' leaf, which contains only one other sibling paper among the 37 total papers surveyed. This positioning suggests the work occupies a relatively sparse research direction focused on theoretical unification rather than application-specific implementations or concept removal techniques.

The taxonomy reveals that most neighboring work clusters around dynamic timing strategies, attention-based interventions, and classifier-free guidance extensions—all within the broader 'Negative Guidance Mechanisms and Theoretical Foundations' branch. The paper's energy-based formulation distinguishes it from attention-manipulation methods and prompt-engineering approaches, which dominate adjacent leaves. Its control-barrier function analysis bridges theoretical foundations with the timing-focused subcategory, suggesting cross-pollination between geometric safety constraints (common in robotics) and probabilistic guidance frameworks for generative models.

Among 13 candidates examined across three contributions, none were flagged as clearly refuting the paper's claims. The MMD-based unification examined 3 candidates with no refutations; the equivalence propositions examined 1 candidate; and the control-barrier timing analysis examined 9 candidates, again with no overlapping prior work identified. Within this limited search scope (top-K semantic matches), the combination of MMD potentials, formal equivalence proofs, and barrier-function timing analysis appears relatively novel, though relevant robotics or control-theoretic safety literature may lie outside the search.

Given the sparse population of the unified-framework leaf and the absence of refuting candidates among 13 examined papers, the work appears to occupy a distinct niche at the intersection of energy-based guidance theory and control-theoretic safety analysis. However, the analysis is constrained by the limited search scope and may not capture all relevant prior work in adjacent fields such as robotics planning or formal verification, where barrier functions are more established.

Taxonomy

Research Landscape Overview

Core task: negative guidance in safe generation for diffusion and flow models. The field addresses how to steer generative models away from undesirable outputs (harmful, biased, or off-topic content) while preserving generation quality.

The taxonomy organizes this landscape into four main branches. Negative Guidance Mechanisms and Theoretical Foundations explores the underlying mathematical frameworks, including energy-based formulations and unified guidance strategies that provide principled ways to incorporate safety constraints during sampling. Concept Erasure and Content Filtering focuses on methods that remove or suppress specific unwanted concepts, often through training-free interventions or fine-tuning approaches like Erasing concepts from diffusion [1] and Bi-Erasing [26]. Application Domains and Task-Specific Implementations covers specialized uses in image synthesis, video generation, and multimodal tasks, where negative guidance is adapted to domain-specific safety requirements. Alternative Generative Paradigms and Related Methods examines how similar ideas appear in non-diffusion settings, such as language models or other generative architectures.

Within the theoretical branch, a dense cluster of works investigates how to formulate negative guidance as an energy-based optimization problem, balancing safety objectives with sample fidelity. SAFETY-GUIDED FLOW (SGF) [0] sits squarely in this unified framework subarea, proposing a principled energy formulation for flow models that contrasts with earlier heuristic approaches. Nearby, Don't be so negative [20] examines potential pitfalls of naive negative prompting, highlighting trade-offs between suppression strength and generation coherence. Other works like Adaptive guidance [2] and Training-free safe denoisers [4] explore dynamic or training-free strategies that adjust guidance intensity on the fly, addressing the challenge of maintaining output diversity while enforcing safety constraints.

Across these lines, a recurring theme is the tension between strong negative steering, which can degrade sample quality or introduce artifacts, and weaker interventions that may fail to eliminate harmful content. SAFETY-GUIDED FLOW (SGF) [0] contributes a theoretically grounded middle path for flow-based generation.

Claimed Contributions

Unified probabilistic framework using MMD potential for negative guidance

3 retrieved papers

The authors propose an energy-based formulation of negative guidance using the Maximum Mean Discrepancy (MMD) potential. This framework unifies existing methods (Shielded Diffusion and Safe Denoiser) by showing they are special cases of gradient-based repulsion from unsafe data samples in kernel feature space.
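To make "gradient-based repulsion from unsafe data samples" concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes an RBF kernel with a hypothetical bandwidth `sigma`, and treats the current sample `x` as a point mass compared against a small unsafe set. The function names (`rbf_kernel`, `mmd_sq`, `mmd_sq_grad`) are illustrative only. Because k(x, x) = 1 for an RBF kernel, the squared MMD depends on `x` only through the mean kernel similarity to the unsafe samples, so ascending it pushes `x` away from them.

```python
import numpy as np

def rbf_kernel(x, ys, sigma):
    """RBF kernel values k(x, y_j) for one point x (d,) and samples ys (M, d)."""
    sq_dist = np.sum((ys - x) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def mmd_sq(x, ys, sigma=1.0):
    """Squared MMD between a point mass at x and the empirical unsafe set ys."""
    k_xy = rbf_kernel(x, ys, sigma)                    # (M,)
    diffs = ys[:, None, :] - ys[None, :, :]            # (M, M, d)
    k_yy = np.exp(-np.sum(diffs ** 2, axis=-1) / (2.0 * sigma ** 2))
    return 1.0 - 2.0 * k_xy.mean() + k_yy.mean()       # k(x, x) = 1 for RBF

def mmd_sq_grad(x, ys, sigma=1.0):
    """Gradient of mmd_sq with respect to x: a kernel-weighted sum of (x - y_j)."""
    k_xy = rbf_kernel(x, ys, sigma)
    return 2.0 * (k_xy[:, None] * (x - ys)).mean(axis=0) / sigma ** 2

# Toy data (assumed for illustration): two unsafe points to the right of x.
unsafe = np.array([[1.0, 0.0], [1.0, 0.5]])
x = np.array([0.5, 0.0])

g = mmd_sq_grad(x, unsafe)
x_new = x + 0.1 * g   # one ascent step on the MMD potential moves x away
print(mmd_sq(x, unsafe), mmd_sq(x_new, unsafe))
```

In a sampler, a term like `g` would be added to the model's velocity or score with some guidance weight; how the paper combines them is not specified in this report.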

Propositions establishing equivalence between MMD gradient and existing repulsive fields

1 retrieved paper

The authors provide formal propositions demonstrating that the gradient of their MMD potential recovers both Safe Denoiser's weighted kernel repulsion and Shielded Diffusion's radial repulsion under appropriate conditions, establishing mathematical connections between these previously disparate approaches.
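As a rough illustration of how one kernel potential can yield both a weighted kernel repulsion and a radial repulsion, consider the standard calculation below. It assumes an RBF kernel with bandwidth \sigma and unsafe samples y_1, ..., y_M; these symbols are introduced here for illustration, and the paper's actual propositions and conditions may differ.

```latex
% Mean kernel similarity to the unsafe set (assumed RBF kernel)
U(x) = \frac{1}{M}\sum_{j=1}^{M} k(x, y_j),
\qquad
k(x, y) = \exp\!\left(-\frac{\lVert x - y\rVert^{2}}{2\sigma^{2}}\right)

% Descent direction on U: a kernel-weighted sum of directions away from each y_j
-\nabla_x U(x) = \frac{1}{M\sigma^{2}}\sum_{j=1}^{M} k(x, y_j)\,(x - y_j)

% Single unsafe point (M = 1): purely radial, with r = \lVert x - y\rVert
-\nabla_x U(x) = \frac{r}{\sigma^{2}}\,
\exp\!\left(-\frac{r^{2}}{2\sigma^{2}}\right)\frac{x - y}{r}
```

The second line has the form of a kernel-weighted repulsion, and the third reduces to a push along the line from the unsafe point through x whose strength depends only on the distance r.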

Control-barrier function analysis justifying critical time window for guidance

9 retrieved papers

The authors apply control-barrier function theory to formally characterize when negative guidance should be applied during generation. They prove that guidance is most effective early in the denoising process and should decay afterward, providing theoretical justification for the critical time window rather than relying on heuristics.
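The idea of a critical window followed by decay can be sketched as a guidance-weight schedule. The example below is a hypothetical illustration, not the schedule derived in the paper: the function name `guidance_weight`, the parameters `window_end`, `cutoff`, and `peak`, the cosine decay shape, and the convention that `s` runs from 0 (start of denoising) to 1 (final sample) are all assumptions made here.

```python
import math

def guidance_weight(s, window_end=0.3, cutoff=0.6, peak=1.0):
    """Hypothetical negative-guidance weight at denoising progress s in [0, 1].

    Full strength inside the early window, cosine decay afterwards,
    and exactly zero from `cutoff` onward.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in [0, 1]")
    if s <= window_end:
        return peak
    if s >= cutoff:
        return 0.0
    frac = (s - window_end) / (cutoff - window_end)
    return peak * 0.5 * (1.0 + math.cos(math.pi * frac))

# Print the weight at a few points along the denoising trajectory.
for step in range(0, 11, 2):
    s = step / 10
    print(f"s={s:.1f} weight={guidance_weight(s):.3f}")
```

A sampler would multiply its repulsive term by this weight at each step, so that guidance acts early and leaves the late, detail-refining steps untouched.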

Core Task Comparisons

Comparisons with papers in the same taxonomy category

[20] Don't be so negative! Score-based Generative Modeling with Oracle-assisted Guidance

Saeid Naderiparizi, Xiaoxuan Liang, Berend Zwartsenberg, Frank Wood (2023)

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: Unified probabilistic framework using MMD potential for negative guidance

[39] Spatio-temporal energy-guided diffusion model for zero-shot video synthesis and editing
Cannot Refute

[40] Deep MMD gradient flow without adversarial training
Cannot Refute

[41] Revisiting Maximum Mean Discrepancy via Diffusion Behavior Policy in Offline RL: A Mode-Seeking Perspective
Cannot Refute

Contribution: Propositions establishing equivalence between MMD gradient and existing repulsive fields

[38] ReBaPL: Repulsive Bayesian Prompt Learning
Cannot Refute

Contribution: Control-barrier function analysis justifying critical time window for guidance

[42] Dynamic High-Order Control Barrier Functions With Diffuser for Safety-Critical Trajectory Planning at Signal-Free Intersections
Cannot Refute

[43] Safe offline reinforcement learning using trajectory-level diffusion models
Cannot Refute

[44] Constrained Diffusers for Safe Planning and Control
Cannot Refute

[45] Pure Theory for Liberation from Fundamental Suffering in Humans and the Absence of Fundamental Suffering in AI
Cannot Refute

[46] EB-MBD: Emerging-Barrier Model-Based Diffusion for Safe Trajectory Optimization in Highly Constrained Environments
Cannot Refute

[47] A microfluidic multi-injector for gradient generation
Cannot Refute

[48] Diffusion Model in Robotics: A Comprehensive Review
Cannot Refute

[49] MA-SafeDiffuser: Safe Multi-Agent Planning with Diffusion Probabilistic Models
Cannot Refute

[50] Integrating Diffusion Models into Model-Based Reinforcement Learning for Real-Time Robotic Control A Theoretical Review
Cannot Refute
