Beyond Fixed: Training-Free Variable-Length Denoising for Diffusion Large Language Models

ICLR 2026 Conference Submission, Anonymous Authors
Keywords: Large Language Models, Diffusion Language Model, Training-Free
Abstract:

Diffusion Large Language Models (DLLMs) are emerging as a powerful alternative to the dominant Autoregressive Large Language Models, offering efficient parallel generation and capable global context modeling. However, the practical application of DLLMs is hindered by a critical architectural constraint: the need for a statically predefined generation length. This static length allocation leads to a problematic trade-off: insufficient lengths cripple performance on complex tasks, while excessive lengths incur significant computational overhead and sometimes result in performance degradation. While the inference framework is rigid, we observe that the model itself possesses internal signals that correlate with the optimal response length for a given task. To bridge this gap, we leverage these latent signals and introduce DAEDAL, a novel training-free denoising strategy that enables Dynamic Adaptive Length Expansion for Diffusion Large Language Models. DAEDAL operates in two phases: 1) Before the denoising process, DAEDAL starts from a short initial length and iteratively expands it to a coarse task-appropriate length, guided by a sequence completion metric. 2) During the denoising process, DAEDAL dynamically intervenes by pinpointing and expanding insufficient generation regions through mask token insertion, ensuring the final output is fully developed. Extensive experiments on DLLMs demonstrate that DAEDAL achieves performance comparable, and in some cases superior, to meticulously tuned fixed-length baselines, while simultaneously enhancing computational efficiency by achieving a higher effective token ratio. By resolving the static length constraint, DAEDAL unlocks new potential for DLLMs, bridging a critical gap with their Autoregressive counterparts and paving the way for more efficient and capable generation.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces DAEDAL, a training-free method for dynamically adjusting generation length in diffusion language models during inference. It resides in the 'Dynamic Adaptive Length Inference Strategies' leaf, which contains only two papers total (including this one). This represents a sparse, emerging research direction within the broader taxonomy of nine papers across diffusion language modeling. The sibling paper in this leaf addresses similar adaptive length challenges, suggesting this is a nascent area with limited prior exploration compared to more established branches like masked diffusion architectures or hybrid autoregressive-diffusion systems.

The taxonomy reveals that DAEDAL sits within the 'Inference-Time Optimization and Acceleration' branch, which also includes speculative decoding approaches that use diffusion models as drafters for autoregressive targets. Neighboring branches focus on core architectural innovations (masked diffusion, context extension) and hybrid systems that combine autoregressive and diffusion paradigms through block-based generation. DAEDAL diverges from these by maintaining pure diffusion inference while addressing the static length constraint through internal model signals, rather than architectural redesign or training-time modifications that characterize the hybrid approaches.

Among the thirty candidates examined, ten were compared against the core DAEDAL contribution, and two of those could refute it, indicating some overlap with prior adaptive-length work. The two sub-contributions (initial length adjustment via a sequence completion metric, and iterative mask insertion during denoising) were each compared against ten candidates with zero refutations, suggesting these specific mechanisms may be more novel. The limited search scope means these statistics reflect the top thirty semantic matches rather than exhaustive field coverage, so additional related work may exist beyond this analysis window.

Given the sparse taxonomy leaf and limited prior work in training-free adaptive length strategies, DAEDAL appears to address an underexplored problem space within diffusion language modeling. The analysis covers top-thirty semantic candidates plus citation expansion, providing reasonable confidence about immediate neighbors but not comprehensive field coverage. The specific combination of pre-denoising length expansion and intra-denoising mask insertion represents a distinct approach within the emerging adaptive inference direction.

Taxonomy

Core-task Taxonomy Papers: 9
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 2

Research Landscape Overview

Core task: Dynamic adaptive length expansion for diffusion language models.

The field of diffusion-based language modeling has evolved into several distinct research directions. The taxonomy reveals three main branches: foundational work on core diffusion architectures and training methods (including masked diffusion approaches like Simple and Effective Masked[1]), hybrid systems that combine autoregressive and diffusion mechanisms (such as Block Diffusion[2] and From Next-Token to Next-Block[7]), and inference-time optimization strategies that focus on acceleration and adaptive generation. These branches reflect a progression from establishing basic diffusion frameworks, to exploring hybrid designs that leverage the strengths of multiple paradigms, and finally to refining how these models generate text efficiently at deployment time.

Recent work has concentrated on making diffusion language models more practical through inference-time innovations. A handful of studies explore dynamic strategies that adjust generation parameters on the fly, contrasting with the fixed-length approaches that dominate earlier diffusion methods. Beyond Fixed[0] sits within this emerging cluster of adaptive inference techniques, alongside Efficient Self-Evaluation for Diffusion[9]; both address how to dynamically determine appropriate generation lengths rather than committing to predetermined sequence boundaries. This contrasts with acceleration-focused works like Diffuspec[5] and Ultrallada[3], which prioritize speed through speculative decoding or distillation but typically maintain fixed architectural assumptions.

The key tension across these inference-oriented branches involves balancing generation quality, computational efficiency, and the flexibility to adapt to varying task requirements, a challenge that adaptive length strategies directly engage with by allowing models to determine their own output boundaries during generation.

Claimed Contributions

DAEDAL: Dynamic Adaptive Length Expansion for Diffusion LLMs

The authors propose DAEDAL, a training-free two-stage inference strategy that allows Diffusion Large Language Models to dynamically adjust generation length instead of relying on a statically predefined length. This addresses a fundamental architectural constraint of DLLMs.

10 retrieved papers
Can Refute
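The two-phase strategy described above can be sketched at a high level. The snippet below is an illustrative Python skeleton, not the paper's implementation: `ToyModel`, `length_ok`, and `denoise_and_expand` are hypothetical stand-ins for the DLLM's internal signals, and the doubling schedule is an assumption for the sketch.

```python
MASK = "<mask>"

class ToyModel:
    """Toy stand-in for a DLLM: 'needs' a fixed number of tokens and
    fills masks left to right. A real model replaces all of this."""
    def __init__(self, needed=64):
        self.needed = needed

    def length_ok(self, prompt, canvas):
        # Stand-in for the model's sequence-completion signal.
        return len(canvas) >= self.needed

    def denoise_and_expand(self, prompt, canvas):
        # Stand-in for one denoising step (here: fill one mask; a real
        # step would also insert masks at low-confidence positions).
        out = list(canvas)
        out[out.index(MASK)] = "tok"
        return out

def daedal_generate(prompt, model, init_len=32, max_len=256):
    """High-level shape of the two-phase DAEDAL loop (illustrative)."""
    canvas = [MASK] * init_len
    # Phase 1: coarse length expansion before denoising begins.
    while len(canvas) < max_len and not model.length_ok(prompt, canvas):
        canvas = canvas + [MASK] * len(canvas)  # double the canvas
    # Phase 2: denoise step by step until no masks remain.
    while MASK in canvas:
        canvas = model.denoise_and_expand(prompt, canvas)
    return canvas
```

With the toy model needing 64 tokens, the 32-token canvas is doubled once in phase 1 and then fully denoised in phase 2.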
Initial Length Adjustment using sequence completion metric

The first stage of DAEDAL uses the model's confidence in predicting End-of-Sequence tokens as an internal signal to iteratively expand from a short initial length to a task-appropriate length before denoising begins.

10 retrieved papers
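The pre-denoising expansion loop can be illustrated as follows. This is a minimal sketch under assumed interfaces: `eos_probs` is a toy stand-in for the model's per-position End-of-Sequence confidence, and the doubling schedule, tail window, and threshold are illustrative choices, not the paper's exact settings.

```python
import math

def eos_probs(prompt_len, gen_len):
    """Toy stand-in for a DLLM forward pass: returns, per masked position,
    the predicted EOS probability. Pretends the task needs ~96 tokens, so
    tail EOS confidence stays low until the canvas can fit the answer."""
    needed = 96
    tail = prompt_len + gen_len
    return [min(1.0, math.exp((tail - needed) / 16.0)) for _ in range(gen_len)]

def adjust_initial_length(prompt_len, init_len=32, factor=2,
                          threshold=0.9, max_len=512):
    """Grow the generation canvas until the model's completion signal
    (mean EOS confidence over a tail window) clears the threshold."""
    gen_len = init_len
    while gen_len < max_len:
        probs = eos_probs(prompt_len, gen_len)
        completion = sum(probs[-8:]) / len(probs[-8:])  # tail window
        if completion >= threshold:
            break
        gen_len = min(max_len, gen_len * factor)
    return gen_len
```

For a 16-token prompt, the sketch expands 32 → 64 → 128 before the toy completion metric clears the threshold.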
Iterative Mask Insertion for dynamic expansion during denoising

The second stage of DAEDAL identifies regions with exceptionally low prediction confidence during denoising and dynamically inserts additional mask tokens at these expansion points, providing more space for complex reasoning where needed.

10 retrieved papers
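A sketch of one such intra-denoising expansion step, assuming per-position prediction confidences are available from the denoiser; the mask token string, threshold `tau`, and insertion count `n_insert` are illustrative parameters, not the paper's.

```python
MASK = "<mask>"

def expand_low_confidence(tokens, confidences, tau=0.2, n_insert=2):
    """One illustrative expansion step: wherever the denoiser's prediction
    confidence is exceptionally low, re-mask that position and insert extra
    mask tokens, giving the model more room at that point."""
    out_tokens, out_conf = [], []
    for tok, conf in zip(tokens, confidences):
        if conf < tau:
            # Expansion point: replace one token with n_insert fresh masks.
            out_tokens.extend([MASK] * n_insert)
            out_conf.extend([0.0] * n_insert)
        else:
            out_tokens.append(tok)
            out_conf.append(conf)
    return out_tokens, out_conf
```

For example, a middle position with confidence 0.1 (below `tau`) is replaced by two mask tokens while its confident neighbors are kept.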

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

DAEDAL: Dynamic Adaptive Length Expansion for Diffusion LLMs

The authors propose DAEDAL, a training-free two-stage inference strategy that allows Diffusion Large Language Models to dynamically adjust generation length instead of relying on a statically predefined length. This addresses a fundamental architectural constraint of DLLMs.

Contribution

Initial Length Adjustment using sequence completion metric

The first stage of DAEDAL uses the model's confidence in predicting End-of-Sequence tokens as an internal signal to iteratively expand from a short initial length to a task-appropriate length before denoising begins.

Contribution

Iterative Mask Insertion for dynamic expansion during denoising

The second stage of DAEDAL identifies regions with exceptionally low prediction confidence during denoising and dynamically inserts additional mask tokens at these expansion points, providing more space for complex reasoning where needed.
