Inference-time scaling of diffusion models through classical search

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: diffusion models, inference-time scaling, compositional generation, search algorithms
Abstract:

Classical search algorithms have long underpinned modern artificial intelligence. In this work, we tackle the challenge of inference-time control in diffusion models—adapting generated outputs to meet diverse test-time objectives—using principles from classical search. We propose a general framework that orchestrates local and global search to efficiently navigate the generative space. It performs compute-efficient global exploration using breadth-first and depth-first tree search and employs a theoretically grounded, scalable local search via annealed Langevin MCMC. We evaluate our approach on a range of challenging domains, including planning, offline reinforcement learning, and image generation, and observe significant gains in both performance and efficiency over baseline methods. These results demonstrate that classical search offers a principled and practical foundation for inference-time scaling in diffusion models. By jointly scaling local and global search for the first time, our framework establishes a new Pareto frontier across challenging decision-making domains.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes a paper's tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. The results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a general framework that orchestrates local and global search for inference-time control in diffusion models, combining breadth-first and depth-first tree search with annealed Langevin MCMC. It resides in the 'Tree Search for Reward-Guided Generation' leaf, which contains five papers including the original work. This leaf sits within a broader cluster of tree search and Monte Carlo methods, indicating a moderately active research direction. The taxonomy shows this is one of several approaches to search-based alignment, with sibling categories exploring discrete diffusion, Monte Carlo guidance, and evolutionary methods.

The taxonomy reveals neighboring research directions that contextualize this work. The parent category 'Tree Search and Monte Carlo Methods for Alignment' encompasses discrete language diffusion and stochastic search guidance, while sibling branches explore evolutionary algorithms and noise trajectory optimization. The 'Inference-Time Scaling and Adaptive Computation' branch addresses related questions about computational allocation during sampling. The scope notes clarify that this leaf focuses on continuous reward-guided generation, excluding gradient-based methods and training-time optimization, which positions the work at the intersection of classical search theory and modern generative modeling.

Among the twenty-one candidates examined, the contribution-level analysis shows mixed novelty signals. For the general framework orchestrating local and global search, none of the ten candidates examined clearly refutes it, suggesting this high-level integration may be relatively novel within the limited search scope. However, the single candidate examined for the adaptive DFS claim appears to refute it, and one of the ten candidates examined for the annealed Langevin MCMC contribution is a refutable match. These statistics indicate that while the overall framework integration may be new, the individual algorithmic components have substantial prior work among the examined papers.

Based on the limited search of twenty-one semantically similar papers, the work appears to offer a novel synthesis of classical search paradigms for diffusion inference, though specific algorithmic contributions show overlap with existing methods. The taxonomy structure suggests this sits in a moderately explored area with clear boundaries from discrete diffusion and evolutionary approaches. The analysis does not cover the full literature landscape, and a broader search might reveal additional related work in adjacent branches or domain-specific applications.

Taxonomy

Core-task taxonomy papers: 50
Claimed contributions: 3
Contribution candidate papers compared: 21
Refutable papers: 2

Research Landscape Overview

Core task: inference-time control in diffusion models using classical search algorithms. The field has organized itself around four main branches that reflect different emphases in how search and control are applied. Search-Based Inference-Time Alignment and Control focuses on tree search and Monte Carlo methods to steer generation toward desired rewards or constraints, often employing evolutionary strategies or dynamic search procedures to refine outputs. Inference-Time Scaling and Adaptive Computation explores how to allocate computational budgets more effectively during sampling, investigating adaptive step selection and scaling laws that trade off quality against inference cost. Domain-Specific Applications and Specialized Guidance addresses tailored solutions for particular modalities such as protein design, RNA generation, or audio super-resolution, where domain constraints shape the search process. Finally, Specialized Inference Techniques and Architectures examines novel sampling frameworks, noise trajectory optimization, and architectural modifications that enable more flexible or efficient control mechanisms.

Recent work has concentrated on reward-guided generation and the interplay between search depth and sample quality. Classical Search Scaling[0] sits within the tree search cluster for reward-guided generation, alongside methods like Tree Search Guidance[41], Dynamic Search Alignment[17], and Diffusion Tree Sampling[29], all of which explore how classical search algorithms can be adapted to navigate the high-dimensional latent spaces of diffusion models. While Tree Search Steering[21] and Tree Reward Search[12] emphasize Monte Carlo rollouts and value estimation, Classical Search Scaling[0] investigates how traditional search paradigms scale with increased inference compute, a theme that resonates with broader efforts in Inference-Time Scaling such as Scaling Inference Compute[9] and Beyond Denoising Steps[4].
A key open question across these branches is whether the gains from deeper search justify the computational overhead, and how to balance exploration breadth with exploitation of high-reward regions during generation.

Claimed Contributions

General framework orchestrating local and global search for diffusion models

The authors introduce a unified framework for inference-time scaling of diffusion models that combines global search (via breadth-first and depth-first tree search) with local search (via annealed Langevin MCMC). This framework enables efficient exploration of the generative space and refinement of samples beyond the base model's capabilities.

10 retrieved papers
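To make the claimed division of labor concrete, the sketch below runs a toy beam-style breadth-first search over scalar "samples" and then polishes the best leaf with annealed Langevin updates. Everything here (the scalar state, `TARGET`, `verifier`, `denoise_step`, the annealing schedule) is a hypothetical stand-in for illustration, not the paper's actual denoiser, verifier, or algorithm.

```python
import math
import random

random.seed(0)

# Toy stand-ins: "samples" are scalars; the verifier rewards proximity
# to a hypothetical target. None of this mirrors the paper's real setup.
TARGET = 1.0

def verifier(x):
    return -(x - TARGET) ** 2  # higher is better

def denoise_step(x, t):
    # One toy "denoising" step: contract x, with noise shrinking as t falls.
    return 0.9 * x + random.gauss(0.0, 0.1 * t)

def search(n_levels=5, branch=4, beam=2):
    # Global search: a BFS/beam pass over the denoising tree,
    # pruned at each level by the verifier.
    frontier = [random.gauss(0.0, 1.0) for _ in range(beam)]
    for t in range(n_levels, 0, -1):
        children = [denoise_step(x, t) for x in frontier for _ in range(branch)]
        frontier = sorted(children, key=verifier, reverse=True)[:beam]
    # Local search: annealed Langevin updates on the verifier's score,
    # refining the single best leaf found by the global stage.
    x = max(frontier, key=verifier)
    grad = lambda v: -2.0 * (v - TARGET)  # exact gradient of this toy verifier
    for step in (0.1, 0.05, 0.01):        # arbitrary annealing schedule
        for _ in range(30):
            x += step * grad(x) + math.sqrt(2.0 * step) * 0.05 * random.gauss(0.0, 1.0)
    return x
```

The intended division of labor: the global stage spreads compute across branches of the generative tree, while the local stage spends its budget refining a single promising candidate beyond what branching alone would reach.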
First adaptive DFS algorithm for diffusion inference scaling

The authors propose a depth-first search algorithm with adaptive backtracking for diffusion models. Unlike prior methods with fixed schedules, this DFS approach dynamically allocates compute based on verifier scores, enabling early backtracking and preventing excessive compute on easy instances.

1 retrieved paper
Can Refute
Theoretically grounded local search via annealed Langevin MCMC

The authors develop a local search method based on annealed Langevin MCMC that samples from compositional distributions. They provide a theoretical unification showing that training-free guidance with recurrence is equivalent to Langevin MCMC in the continuous limit, enabling principled optimization beyond the base model's modes.

10 retrieved papers
Can Refute

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

General framework orchestrating local and global search for diffusion models

The authors introduce a unified framework for inference-time scaling of diffusion models that combines global search (via breadth-first and depth-first tree search) with local search (via annealed Langevin MCMC). This framework enables efficient exploration of the generative space and refinement of samples beyond the base model's capabilities.

Contribution

First adaptive DFS algorithm for diffusion inference scaling

The authors propose a depth-first search algorithm with adaptive backtracking for diffusion models. Unlike prior methods with fixed schedules, this DFS approach dynamically allocates compute based on verifier scores, enabling early backtracking and preventing excessive compute on easy instances.

Contribution

Theoretically grounded local search via annealed Langevin MCMC

The authors develop a local search method based on annealed Langevin MCMC that samples from compositional distributions. They provide a theoretical unification showing that training-free guidance with recurrence is equivalent to Langevin MCMC in the continuous limit, enabling principled optimization beyond the base model's modes.
