Computational Bottlenecks for Denoising Diffusions

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: denoising diffusions, computational bottlenecks, information-computation gaps, spiked Wigner model, score matching
Abstract:

Denoising diffusions sample from a probability distribution $\mu$ in $\mathbb{R}^d$ by constructing a stochastic process $(\hat{\mathbf{x}}_t : t \ge 0)$ in $\mathbb{R}^d$ such that $\hat{\mathbf{x}}_0$ is easy to sample, but the distribution of $\hat{\mathbf{x}}_T$ at large $T$ approximates $\mu$. The drift $\mathbf{m} : \mathbb{R}^d \times \mathbb{R} \to \mathbb{R}^d$ of this diffusion process is learned by minimizing a score-matching objective.
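The scheme described above can be sketched in a toy setting. The sketch below assumes the convention $\mathbf{y}_t = t\,\mathbf{x} + \mathbf{B}_t$ (so that $\hat{\mathbf{y}}_T / T$ approximates a draw from $\mu$) and a Gaussian target $\mu = N(0,1)$, for which the drift $\mathbf{m}(y,t) = y/(1+t)$ is available in closed form; the convention, target, and step sizes are illustrative choices, not the paper's construction.

```python
import numpy as np

# Toy denoising diffusion in the convention y_t = t*x + B_t: one simulates
#   dy = m(y, t) dt + dB,   y_0 = 0,
# where the ideal drift is the posterior mean m(y, t) = E[x | y_t = y],
# and recovers an approximate sample from mu as y_T / T.
# For the Gaussian target mu = N(0, 1) the drift is m(y, t) = y / (1 + t).

def sample_via_diffusion(n_samples, T=50.0, dt=0.01, seed=0):
    rng = np.random.default_rng(seed)
    n_steps = int(T / dt)
    y = np.zeros(n_samples)
    for k in range(n_steps):                # Euler-Maruyama discretization
        t = k * dt
        drift = y / (1.0 + t)               # closed-form drift for N(0, 1)
        y += drift * dt + np.sqrt(dt) * rng.standard_normal(n_samples)
    return y / T                            # y_T / T -> x ~ mu as T grows

samples = sample_via_diffusion(2000)
print(samples.mean(), samples.var())        # should be close to 0 and to 1
```

With an accurate drift, the empirical mean and variance of the output match the target; the paper's point is that when the drift must be learned under computational constraints, such agreement can fail badly for structured targets.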

Is every probability distribution $\mu$ for which sampling is tractable also amenable to sampling via diffusions? We address this question by studying its relation to information-computation gaps in statistical estimation. Earlier work in this area constructs broad families of distributions $\mu$ for which sampling is easy, but approximating the drift $\mathbf{m}(\mathbf{y},t)$ is conjectured to be intractable, and provides rigorous evidence for this intractability.
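For concreteness, in one standard formulation (which may differ from the paper's exact normalization) the drift is a posterior expectation and score matching is a mean-square regression problem:

```latex
\mathbf{m}(\mathbf{y},t) \;=\; \mathbb{E}\!\left[\mathbf{x} \,\middle|\, t\,\mathbf{x} + \mathbf{B}_t = \mathbf{y}\right],
\qquad \mathbf{x} \sim \mu,
\qquad
\mathscr{L}(\hat{\mathbf{m}}) \;=\; \int_0^T \mathbb{E}\,\big\|\hat{\mathbf{m}}(\mathbf{y}_t,t) - \mathbf{x}\big\|^2 \,\mathrm{d}t .
```

In this form, approximating $\mathbf{m}$ well in the score-matching objective is exactly a denoising-estimation problem, which is where the information-computation gap enters.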

We prove that this implies a failure of sampling via diffusions. First, there exist drifts whose score-matching objective value is superpolynomially close to the optimum among polynomial-time drifts, yet which yield samples whose distribution is far from the target. Second, any polynomial-time drift that is also Lipschitz continuous results in equally incorrect sampling.

We instantiate our results on the toy problem of sampling a sparse low-rank matrix, and further demonstrate the failure of diffusion-based sampling empirically. Our work implies that caution should be exercised in adopting diffusion sampling when other approaches are available.
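A minimal sketch of the kind of target referred to above: a sparse rank-one spike observed through symmetric Gaussian noise (the spiked Wigner model). All parameters here (dimension, sparsity, signal-to-noise ratio) are illustrative choices, not the paper's.

```python
import numpy as np

# Sketch of the "sparse low-rank matrix" toy distribution: a rank-one spike
# x x^T with a sparse vector x, observed through symmetric Gaussian (Wigner)
# noise. Parameters n, eps (sparsity), and lam (SNR) are illustrative.

def sparse_spiked_wigner(n=200, eps=0.05, lam=2.0, seed=0):
    rng = np.random.default_rng(seed)
    # Sparse spike: each coordinate is nonzero with probability eps,
    # scaled so that E[x_i^2] = 1.
    mask = rng.random(n) < eps
    x = np.where(mask, rng.choice([-1.0, 1.0], size=n) / np.sqrt(eps), 0.0)
    # Symmetric Gaussian noise matrix, normalized Wigner-style.
    g = rng.standard_normal((n, n))
    w = (g + g.T) / np.sqrt(2.0 * n)
    y = (lam / n) * np.outer(x, x) + w
    return x, y

x, y = sparse_spiked_wigner()
```

Sampling such an `x` directly is easy (draw the mask and signs coordinate-wise); the hardness concerns recovering or denoising the spike from `y`, which is what the diffusion drift must implicitly do.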

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper establishes fundamental computational limits for denoising diffusion sampling by proving that near-optimal score-matching objectives can yield poor samples, and that Lipschitz-constrained optimizers face inherent barriers. It resides in the 'Information-Computation Gaps and Intractability' leaf under 'Theoretical Foundations and Computational Complexity', where it is currently the sole paper. This placement reflects a sparse research direction: while the broader taxonomy contains 50 papers across 22 leaves, rigorous intractability results for diffusion sampling remain underexplored compared to the heavily populated acceleration and application branches.

The taxonomy reveals that most neighboring work focuses on practical efficiency rather than fundamental limits. The sibling branch 'Sampling Schedule Design and Optimization' contains three papers on timestep optimization, while the parent-level peer 'Sampling Acceleration Techniques' encompasses 13 papers across four leaves addressing non-Markovian methods, distillation, and training-free speedups. The paper's theoretical stance—proving impossibility results rather than proposing heuristic improvements—distinguishes it from these empirical directions. Its connection to statistical estimation gaps positions it at the intersection of complexity theory and generative modeling, a boundary less populated in the current taxonomy.

Among 22 candidates examined across three contributions, none were identified as clearly refuting the paper's claims. The first contribution (near-optimal score-matching with poor sampling) examined 10 candidates with zero refutations; the second (Lipschitz impossibility) examined 2 with zero refutations; the third (estimation-to-sampling reduction) examined 10 with zero refutations. This suggests that within the limited search scope—focused on top semantic matches and citations—the specific theoretical constructions appear novel. However, the small candidate pool (22 total) and the paper's position as the sole occupant of its taxonomy leaf indicate that the literature search may not have captured all relevant complexity-theoretic work outside the diffusion-specific domain.

Based on the examined candidates, the paper's theoretical contributions appear substantively novel within the diffusion sampling literature. The absence of refuting work among 22 candidates, combined with its unique taxonomy position, suggests it addresses a gap in rigorous complexity analysis. However, this assessment is constrained by the search scope: broader complexity theory or statistical lower bounds literature beyond the top-22 semantic matches may contain related impossibility results. The analysis captures novelty relative to the diffusion efficiency community but cannot claim exhaustiveness across all theoretical computer science.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 22
Refutable Papers: 0

Research Landscape Overview

Core task: computational limitations of denoising diffusion sampling. Denoising diffusion models have achieved remarkable generative quality, yet their iterative sampling process imposes substantial computational burdens that limit practical deployment. The field's taxonomy reflects a multifaceted response to these challenges.

At the theoretical level, researchers investigate fundamental complexity bounds and information-computation gaps, as exemplified by Computational Bottlenecks[0], which examines intrinsic intractability. Meanwhile, a large body of work focuses on sampling acceleration techniques (ranging from deterministic shortcuts like DDIM[13] and step-aware scheduling methods such as Optimized Time Steps[24] to distillation approaches like Direct Distillation[25]) that reduce the number of function evaluations without sacrificing output fidelity. Model architecture efficiency explores lightweight designs and caching strategies (e.g., Hybrid Feature Caching[28]), while conditional and constrained sampling addresses the added cost of guidance and constraint satisfaction (Fast Constrained Sampling[1]). Application-specific branches tailor efficiency adaptations to domains like video generation (Video Diffusion Efficiency[14]) and medical imaging (MedDiff[44]), and training improvements seek to learn more efficient noise schedules or parameterizations from the outset. Emerging paradigms extend diffusion ideas to novel settings, including quantum frameworks (Quantum Denoising[2]).

Across these branches, a central tension emerges between sample quality, computational cost, and generalization: aggressive step reduction or architectural pruning can degrade fidelity, while domain-specific optimizations may not transfer broadly. Computational Bottlenecks[0] sits squarely within the theoretical foundations branch, probing the hardness barriers that explain why certain sampling regimes remain expensive despite algorithmic ingenuity.
This contrasts with purely empirical acceleration work like DDIM[13] or distillation methods, which prioritize practical speedups, and with application-driven studies such as DiffusionDepth[26] or Precipitation Nowcasting Transformer[35], which accept domain constraints. By formalizing information-computation trade-offs, Computational Bottlenecks[0] complements survey perspectives like Efficient Diffusion Survey[8] and provides a rigorous lens through which to interpret why many heuristic speedups succeed in practice yet fail to overcome worst-case complexity. Understanding these fundamental limits helps clarify which efficiency gains are algorithmic breakthroughs and which merely exploit favorable problem structure.

Claimed Contributions

Existence of near score-matching optimizers that sample incorrectly

The authors prove that for distributions presenting information-computation gaps, there exist polynomial-time computable drifts that nearly optimize the score-matching objective among polynomial-time algorithms, yet generate samples whose distribution is far from the target in Wasserstein-1 distance.

10 retrieved papers
Impossibility result for Lipschitz score-matching optimizers

The authors establish that any polynomial-time drift that near-optimizes score matching, acts optimally on pure noise, and satisfies a Lipschitz continuity condition must also fail to sample correctly from the target distribution.

2 retrieved papers
Reduction from estimation to diffusion sampling

The authors prove a general reduction showing that if diffusion-based sampling can be performed accurately in polynomial time, then near Bayes-optimal denoising estimation must also be possible in polynomial time, establishing a formal connection between sampling and estimation complexity.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current TopK core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape, it appears structurally isolated, which is one partial signal of novelty, but still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Existence of near score-matching optimizers that sample incorrectly

The authors prove that for distributions presenting information-computation gaps, there exist polynomial-time computable drifts that nearly optimize the score-matching objective among polynomial-time algorithms, yet generate samples whose distribution is far from the target in Wasserstein-1 distance.
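The failure notion in this contribution is distance in Wasserstein-1. For one-dimensional equal-size empirical samples, $W_1$ reduces to the mean gap between order statistics, which makes the phenomenon easy to illustrate. The toy comparison below is our illustration (not the paper's construction): two samples agreeing on mean and variance can remain far apart in $W_1$.

```python
import numpy as np

# 1-d illustration of the Wasserstein-1 failure metric. For two equal-size
# empirical samples, W1 equals the average |a_(i) - b_(i)| over the sorted
# (order) statistics. Matching low-order moments does not make W1 small.

def w1_empirical(a, b):
    # Equal-size 1-d samples: W1 between the two empirical distributions.
    return np.abs(np.sort(a) - np.sort(b)).mean()

rng = np.random.default_rng(0)
target = rng.standard_normal(100_000)            # samples from N(0, 1)
wrong = rng.choice([-1.0, 1.0], size=100_000)    # same mean and variance
print(w1_empirical(target, wrong))               # bounded away from 0 (~0.53)
```

Both sample sets have mean 0 and variance 1, yet their $W_1$ distance stays near the population value $W_1(N(0,1), \mathrm{Rademacher}) \approx 0.535$, which is the sense in which a near-optimal score matcher can still "sample incorrectly".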

Contribution

Impossibility result for Lipschitz score-matching optimizers

The authors establish that any polynomial-time drift that near-optimizes score matching, acts optimally on pure noise, and satisfies a Lipschitz continuity condition must also fail to sample correctly from the target distribution.

Contribution

Reduction from estimation to diffusion sampling

The authors prove a general reduction showing that if diffusion-based sampling can be performed accurately in polynomial time, then near Bayes-optimal denoising estimation must also be possible in polynomial time, establishing a formal connection between sampling and estimation complexity.
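The direction of this reduction can be made concrete in a Gaussian toy model (our sketch; the paper's reduction is far more general): an accurate posterior sampler yields a denoising estimator by averaging its samples, and that average approaches the Bayes-optimal posterior mean.

```python
import numpy as np

# Toy sampling-to-estimation reduction. Prior mu = N(0, 1), observation
# y = x + sigma * g. The posterior of x given y is Gaussian with mean
# y / (1 + sigma^2), so the Bayes-optimal denoiser is explicit, and a
# sampler-based estimate (averaging exact posterior draws) should match it.

rng = np.random.default_rng(0)
sigma, y = 0.5, 1.3

# Closed-form Bayes-optimal denoiser for the standard Gaussian prior.
bayes = y / (1.0 + sigma**2)

# "Sampler-based" estimate: average of exact posterior samples.
post = rng.normal(loc=bayes,
                  scale=sigma / np.sqrt(1.0 + sigma**2),
                  size=100_000)
print(bayes, post.mean())   # the two estimates nearly coincide
```

Contrapositively, as in the paper's reduction: if near Bayes-optimal denoising estimation is computationally hard for a family of distributions, then accurate polynomial-time diffusion sampling for that family is impossible as well.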