DeRaDiff: Denoising Time Realignment of Diffusion Models

ICLR 2026 Conference Submission, Anonymous Authors
Keywords: alignment, diffusion models
Abstract:

Recent advances align diffusion models with human preferences to increase aesthetic appeal and mitigate artifacts and biases. Such methods aim to maximize a conditional output distribution aligned with higher rewards while not drifting far from a pretrained prior, a constraint commonly enforced by KL (Kullback–Leibler) regularization. A central issue nonetheless remains: how does one choose the right regularization strength? Too high a strength leads to limited alignment; too low a strength leads to "reward hacking". This makes choosing the correct regularization strength highly non-trivial. Existing approaches sweep over this hyperparameter by aligning a pretrained model at multiple regularization strengths and then picking the best one, which is prohibitively expensive. We introduce DeRaDiff, a denoising-time realignment procedure that, after aligning a pretrained model once, modulates the regularization strength during sampling to emulate models trained at other regularization strengths, without any additional training or fine-tuning. Extending decoding-time realignment from language to diffusion models, DeRaDiff operates over iterative predictions of continuous latents by replacing the reverse-step reference distribution with a geometric mixture of an aligned and a reference posterior, giving rise to a closed-form update under common schedulers and a single tunable parameter, λ, for on-the-fly control. Our experiments show that across multiple text–image alignment and image-quality metrics, our method consistently provides a strong approximation of models aligned from scratch at different regularization strengths. By enabling precise inference-time control of the regularization strength, our method yields an efficient way to search for the optimal strength, eliminating the need for expensive alignment sweeps and substantially reducing computational costs.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces DeRaDiff, a denoising-time realignment procedure that modulates regularization strength during sampling to emulate models trained at different alignment intensities without additional fine-tuning. Within the taxonomy, it resides in the 'Dynamic Regularization and Denoising-Time Realignment' leaf under 'Inference-Time Regularization and Alignment Strength Control'. This leaf contains only three papers total, indicating a relatively sparse research direction focused specifically on adaptive regularization mechanisms during inference rather than fixed-strength alignment or training-based methods.

The taxonomy reveals that DeRaDiff sits within a broader branch addressing inference-time regularization, which itself is one of five major approaches to alignment control. Neighboring leaves include 'Reinforcement Learning Guidance and Policy Control' (2 papers) focusing on RL-inspired steering, and more distant branches like 'Gradient-Based and Direct Optimization Methods' (7 papers) that refine noise or embeddings through differentiable objectives. The scope notes clarify that DeRaDiff's dynamic regularization approach excludes multi-preference balancing and fixed-strength methods, positioning it as a timestep-aware alternative to heavier optimization techniques like those in the gradient-based branch.

Among the 17 candidates examined across three contributions, no refutable prior work was identified. The core DeRaDiff procedure examined 4 candidates with 0 refutations; the theoretical extension to diffusion processes examined 3 candidates with 0 refutations; and the efficient hyperparameter exploration method examined 10 candidates with 0 refutations. This suggests that within the limited search scope, the specific combination of denoising-time realignment and closed-form updates for emulating multiple regularization strengths appears distinct from existing approaches, though the small candidate pool and sparse taxonomy leaf indicate this is an emerging rather than crowded research area.

Based on the limited literature search of 17 candidates, the work appears to occupy a relatively novel position within a sparse research direction. The taxonomy structure shows only two sibling papers in the same leaf, and the absence of refutable candidates across all contributions suggests differentiation from examined prior work. However, the small search scope and emerging nature of this specific branch mean that broader exhaustive searches or future work in dynamic regularization could reveal closer precedents.

Taxonomy

Core-task Taxonomy Papers: 39
Claimed Contributions: 3
Contribution Candidate Papers Compared: 17
Refutable Papers: 0

Research Landscape Overview

Core task: inference-time control of alignment strength in diffusion models. The field has organized itself around several complementary strategies for steering diffusion model outputs without retraining. The taxonomy reveals five main branches: search and sampling methods that explore multiple generation trajectories (e.g., Latent Beam Search[1], Diffusion Tree Sampling[6]); gradient-based and direct optimization approaches that refine noise or latent codes (e.g., Direct Noise Optimization[11], DiffPO Efficient[13]); inference-time regularization techniques that dynamically adjust alignment strength during denoising; test-time adaptation methods that shift model behavior toward specific domains (e.g., TIDE[23], Test-time Deblurring[26]); and specialized techniques addressing particular applications like safety filtering or compositional generation. These branches reflect a fundamental tension between computational efficiency and fine-grained control, with search-based methods trading inference cost for quality while regularization approaches seek lightweight runtime adjustments.

Recent work has concentrated on balancing alignment fidelity with generation diversity, particularly within the regularization and dynamic control branches. DeRaDiff[0] exemplifies this trend by introducing denoising-time realignment mechanisms that modulate alignment strength across timesteps, sitting alongside related dynamic approaches like Temporal Alignment Guidance[24] and Diffusion Blend[2] that similarly adjust guidance schedules during sampling. These methods contrast with heavier optimization-based techniques such as MIRA[10] or Reward-Guided Review[9], which iteratively refine outputs but incur greater computational overhead. A key open question across these branches concerns how to automatically determine optimal alignment schedules without manual tuning, while maintaining the distributional properties of the base model. DeRaDiff[0] addresses this by proposing adaptive regularization that responds to denoising progress, positioning itself within a small cluster of works exploring timestep-aware alignment modulation rather than static guidance scaling.

Claimed Contributions

DeRaDiff: Denoising-time realignment procedure for diffusion models

DeRaDiff is a method that enables on-the-fly adjustment of KL regularization strength during inference by geometrically mixing aligned and reference posterior distributions. This allows approximation of models trained at different regularization strengths without retraining, using a single tunable parameter, λ.

4 retrieved papers
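In the common setting where both reverse-step posteriors are Gaussian with a shared variance, the λ-weighted geometric mixture again has a Gaussian form whose mean interpolates linearly between the two posteriors' means; in ε-parameterization this amounts to mixing the two noise predictions. A minimal sketch of one realigned DDPM-style step under that assumption (function and variable names are hypothetical, not the paper's implementation):

```python
import numpy as np

def realigned_eps(eps_aligned, eps_ref, lam):
    """Geometric mixture of two shared-variance Gaussian posteriors collapses
    to a convex combination of their means; in epsilon-parameterization that
    is a convex combination of the two noise predictions.
    lam=1 recovers the aligned model, lam=0 the reference model."""
    return lam * eps_aligned + (1.0 - lam) * eps_ref

def ddpm_reverse_step(x_t, eps, alpha_t, alpha_bar_t, beta_t, noise):
    """One standard DDPM reverse update driven by the mixed prediction."""
    mean = (x_t - beta_t / np.sqrt(1.0 - alpha_bar_t) * eps) / np.sqrt(alpha_t)
    return mean + np.sqrt(beta_t) * noise
```

Sweeping `lam` at sampling time then stands in for retraining the model at different KL strengths.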
Theoretical extension of decoding-time realignment to diffusion processes with closed-form update

The authors derive a tractable closed-form formula (Theorem 1) for the geometric mixture of reference and aligned diffusion model distributions at each denoising step. This provides both theoretical foundation and efficient implementation for realignment in continuous latent spaces under common schedulers.

3 retrieved papers
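Theorem 1 itself is not reproduced in this report, but the shape of the closed form can be sketched. The realigned reverse-step distribution is the λ-weighted geometric mixture of the aligned and reference posteriors, and for two Gaussians with a shared covariance such a mixture is again Gaussian with a linearly interpolated mean, a standard exponential-family identity stated here as an illustration rather than as the paper's exact result:

```latex
p_\lambda(x_{t-1} \mid x_t)
  \propto p_{\mathrm{aligned}}(x_{t-1} \mid x_t)^{\lambda}\,
          p_{\mathrm{ref}}(x_{t-1} \mid x_t)^{1-\lambda},
\qquad
\mathcal{N}(\mu_a, \Sigma)^{\lambda}\,\mathcal{N}(\mu_r, \Sigma)^{1-\lambda}
  \propto \mathcal{N}\bigl(\lambda \mu_a + (1-\lambda)\,\mu_r,\ \Sigma\bigr).
```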
Efficient hyperparameter exploration method reducing computational costs

The method eliminates the need for expensive alignment sweeps by allowing search for optimal regularization strength at inference time. This yields substantial compute savings (66.7% to 90% GPU-hour reduction for exploring 3 to 10 regularization strengths) while maintaining performance across text-image alignment and image-quality metrics.

10 retrieved papers
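The quoted reduction figures follow directly from training once instead of once per strength: sweeping N regularization strengths by retraining costs roughly N alignment runs, while DeRaDiff needs a single run, so the training-compute reduction is 1 - 1/N. A quick back-of-the-envelope check, ignoring the comparatively small sampling overhead:

```python
def sweep_gpu_hour_reduction(num_strengths: int) -> float:
    """Fractional GPU-hour reduction from running one alignment instead of
    one per regularization strength (training cost only)."""
    return 1.0 - 1.0 / num_strengths

# prints 66.7%, 90.0%
print(f"{sweep_gpu_hour_reduction(3):.1%}, {sweep_gpu_hour_reduction(10):.1%}")
```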

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: DeRaDiff: Denoising-time realignment procedure for diffusion models

Contribution: Theoretical extension of decoding-time realignment to diffusion processes with closed-form update

Contribution: Efficient hyperparameter exploration method reducing computational costs