Implicit Models: Expressive Power Scales with Test-Time Compute

ICLR 2026 Conference Submission · Anonymous Authors
Keywords: Implicit models, Deep equilibrium models, Expressive power
Abstract:

Implicit models, an emerging model class, compute outputs by iterating a single parameter block to a fixed point. This architecture realizes an infinite-depth, weight-tied network that trains with constant memory, substantially reducing the memory required to match the performance of explicit models. While these compact models are empirically known to match or even exceed the accuracy of larger explicit networks when allocated more test-time compute, the underlying reasons are not yet well understood.

We study this gap through a non-parametric analysis of expressive power. We give a rigorous mathematical characterization showing that a simple, regular implicit operator can, through iteration, progressively express more complex mappings. We prove that for a broad class of implicit models, this process allows the model's expressive power to grow with test-time compute, ultimately matching a much richer function class. The theory is validated across four domains (imaging, scientific computing, operations research, and LLM reasoning), demonstrating that as test-time iterations increase, the complexity of the learned mapping rises while solution quality simultaneously improves and stabilizes.
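To make the mechanism concrete, below is a minimal sketch of an implicit model's forward pass: a single weight-tied block is applied repeatedly, and the iteration count plays the role of test-time compute. The specific update f(z, x) = tanh(Wz + Ux + b) and the contractive scaling of W are illustrative assumptions, not the submission's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# Hypothetical weight-tied block f(z, x) = tanh(W z + U x + b).
# Scaling W to spectral norm 0.5 makes f a contraction in z, so the
# fixed-point iteration below converges (Banach fixed-point theorem).
W = rng.standard_normal((d, d))
W *= 0.5 / np.linalg.norm(W, 2)
U = rng.standard_normal((d, d)) / np.sqrt(d)
b = rng.standard_normal(d)

def f(z, x):
    return np.tanh(W @ z + U @ x + b)

def implicit_forward(x, num_iters):
    """One forward pass: iterate the same parameter block num_iters times."""
    z = np.zeros(d)
    for _ in range(num_iters):
        z = f(z, x)
    return z

x = rng.standard_normal(d)
z_star = implicit_forward(x, 1000)  # proxy for the exact fixed point
for k in (1, 5, 20, 100):
    gap = np.linalg.norm(implicit_forward(x, k) - z_star)
    print(f"iterations={k:4d}  distance to fixed point={gap:.2e}")
```

With the contractive scaling, the printed distances shrink geometrically: a fixed parameter block refines its output as the test-time budget grows, which is the behavior the abstract describes.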

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper provides a mathematical characterization of how implicit models' expressive power scales with test-time compute through iterative fixed-point solving. It resides in the 'Expressive Power Theory and Fixed-Point Dynamics' leaf, which contains only two papers total, indicating a relatively sparse research direction within the broader taxonomy. This leaf sits under 'Implicit Model Architectures and Theoretical Foundations', distinguishing it from application-focused or architecture-design branches. The sibling paper in this leaf shares the theoretical focus on fixed-point dynamics, suggesting this is an emerging area with limited prior theoretical work.

The taxonomy reveals neighboring branches addressing recurrent architectures, implicit neural representations for continuous functions, and domain-specific applications in scientific computing and 3D reconstruction. The paper's theoretical lens contrasts with these more implementation-oriented directions. Adjacent branches on test-time compute scaling in language models and diffusion models explore similar iterative refinement concepts but apply them to specific model classes rather than providing general expressive power theory. The taxonomy's scope notes clarify that this leaf excludes empirical validation studies and architectural instantiations, positioning the work as foundational theory rather than applied methodology.

Among the thirty candidates examined across the three contributions, none were identified as clearly refuting the paper's claims. For the first contribution, on mathematical characterization, ten candidates were examined with zero refutable matches, and the same held for the second, on locally Lipschitz mappings as expressive boundaries. The validation-framework contribution similarly found no overlapping prior work among its ten examined candidates. This suggests that, within the limited search scope, the theoretical characterization and the specific framing around locally Lipschitz function classes appear relatively unexplored. However, the small candidate pool means the analysis cannot rule out relevant work outside the top thirty semantic matches.

Based on the limited literature search covering thirty candidates, the work appears to occupy a sparsely populated theoretical niche within implicit model research. The taxonomy structure confirms that foundational expressive power theory for implicit models is less developed than application-driven or architecture-focused directions. The absence of refutable candidates across all contributions suggests novelty within the examined scope, though exhaustive coverage of related theoretical work in dynamical systems or approximation theory remains uncertain.

Taxonomy

Core-task Taxonomy Papers: 23
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: expressive power scaling with test-time compute in implicit models. The field encompasses a diverse set of approaches unified by the idea that models can leverage additional computation at inference time to improve performance or expressiveness.

The taxonomy reveals several major branches: one focuses on the theoretical foundations and architectural designs of implicit models themselves, examining fixed-point dynamics and expressive power theory; another explores test-time compute scaling specifically in language model reasoning, where iterative refinement and chain-of-thought mechanisms play central roles; a third investigates diffusion models with implicit formulations and accelerated inference strategies such as DDIM[1]. Additional branches address domain-specific applications ranging from neural fields to physical simulations, as well as meta-learning and continual learning paradigms that exploit implicit mechanisms.

Works like Latent Reasoning Recurrent[3] and Sample Scrutinize Scale[4] illustrate how test-time iteration can enhance reasoning, while Learning at Test Time[5] and In-context Bayesian Inference[8] demonstrate adaptive inference strategies that blur the line between training and deployment. A particularly active line of work examines the interplay between architectural depth, iterative solvers, and the computational budget available at test time. Some studies emphasize the robustness and reliability of inference procedures, as seen in Inference Compute Robustness[2], while others, such as SoftCoT++[9] and Inner Thinking Transformer[10], explore how internal reasoning steps can be learned and scaled.

Implicit Models Test Compute[0] sits squarely within the theoretical foundations branch, closely aligned with Fixed Point Diffusion[16]; both investigate how fixed-point iterations and implicit layers scale expressiveness as test-time compute increases. Compared to neighboring works that focus on domain-specific neural representations or accelerated sampling in diffusion models, Implicit Models Test Compute[0] emphasizes the fundamental capacity gains achievable through iterative refinement in implicit architectures, offering a more general lens on expressive power rather than optimizing for a particular application domain.

Claimed Contributions

Mathematical characterization of implicit models' expressive power scaling with test-time compute

The authors establish that regular implicit operators can represent any locally Lipschitz function through iterative fixed-point computation. They prove that expressive power grows with test-time iterations, allowing simple operators to realize complex mappings without adding parameters; a simple Lipschitz-growth bound illustrating this is sketched below.

10 retrieved papers
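As a back-of-the-envelope illustration of this claim (a generic bound, not the paper's proof), write φ_K(x) = f(φ_{K-1}(x), x) for the output after K iterations, and suppose f is L_z-Lipschitz in its first argument and L_x-Lipschitz in its second; both constants are assumptions for this sketch:

```latex
% Generic bound on the K-iterate map; L_z and L_x are assumed
% Lipschitz constants of f in z and in x, not taken from the paper.
\[
\|\varphi_K(x) - \varphi_K(x')\|
  \le L_z\,\|\varphi_{K-1}(x) - \varphi_{K-1}(x')\| + L_x\,\|x - x'\|
\quad\Longrightarrow\quad
\operatorname{Lip}(\varphi_K) \le L_x \sum_{j=0}^{K-1} L_z^{\,j}.
\]
```

For $L_z < 1$ this bound saturates at $L_x/(1-L_z)$; for $L_z \ge 1$ it can grow with $K$ (linearly at $L_z = 1$, geometrically above it), which is one elementary way to see how a fixed operator can realize increasingly complex mappings as the iteration budget increases.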
Identification of locally Lipschitz mappings as the expressive boundary for implicit models

The authors define regular implicit operators and prove bidirectional results: any locally Lipschitz function can be represented as a fixed point of a regular operator (Theorem 2.4), and conversely, any fixed point of a regular operator is locally Lipschitz (Theorem 2.5). A schematic restatement of both directions is given below.

10 retrieved papers
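Schematically, the two directions summarized above can be restated as follows; the paper's precise definition of a regular operator and its hypotheses are not reproduced here:

```latex
% Theorem 2.4 (as summarized): every locally Lipschitz F is realized
% as the fixed point of some regular operator f.
\[
\forall F \text{ locally Lipschitz},\ \exists f \text{ regular}:\quad
z^\star(x) = f\big(z^\star(x), x\big), \qquad F(x) = z^\star(x).
\]
% Theorem 2.5 (as summarized): the fixed point of a regular operator
% is locally Lipschitz in the input.
\[
f \text{ regular},\ z^\star(x) = f\big(z^\star(x), x\big)
\ \Longrightarrow\ x \mapsto z^\star(x) \text{ is locally Lipschitz.}
\]
```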
Validation framework demonstrating emergent expressive power across four application domains

The authors provide empirical validation across diverse tasks (image reconstruction, Navier-Stokes equations, linear programming, and language model reasoning), showing that empirical Lipschitz constants grow with iterations while solution quality improves, confirming the theoretical predictions. A toy estimator for such empirical Lipschitz constants is sketched below.

10 retrieved papers
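To illustrate the kind of measurement this contribution reports, here is a toy estimator of the empirical Lipschitz constant of the K-iteration map, taken as the largest finite-difference ratio over random nearby input pairs. The synthetic weight-tied update, and the scaling of W slightly above spectral norm 1 (so iterates can expand before the nonlinearity saturates), are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8
W = rng.standard_normal((d, d))
W *= 1.2 / np.linalg.norm(W, 2)   # mildly expansive linear part
U = rng.standard_normal((d, d)) / np.sqrt(d)
b = rng.standard_normal(d)

def phi(x, K):
    """Output of the weight-tied block after K test-time iterations."""
    z = np.zeros(d)
    for _ in range(K):
        z = np.tanh(W @ z + U @ x + b)
    return z

def empirical_lipschitz(K, num_pairs=500, eps=1e-3):
    """Largest ratio ||phi(x+dx, K) - phi(x, K)|| / ||dx|| over random pairs."""
    best = 0.0
    for _ in range(num_pairs):
        x = rng.standard_normal(d)
        dx = eps * rng.standard_normal(d)
        ratio = np.linalg.norm(phi(x + dx, K) - phi(x, K)) / np.linalg.norm(dx)
        best = max(best, ratio)
    return best

for K in (1, 2, 5, 10, 20):
    print(f"K={K:3d}  empirical Lipschitz estimate = {empirical_lipschitz(K):.2f}")
```

Under these assumptions the estimate typically grows with K before leveling off as the activations saturate; whether and how fast it grows depends on the operator's spectral properties, which is exactly what the paper's per-domain measurements probe.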

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Mathematical characterization of implicit models' expressive power scaling with test-time compute

Ten candidate papers were compared against this claim (described in full under Claimed Contributions above); none were identified as refuting it.

Contribution

Identification of locally Lipschitz mappings as the expressive boundary for implicit models

Ten candidate papers were compared against this claim; none were identified as refuting it.

Contribution

Validation framework demonstrating emergent expressive power across four application domains

Ten candidate papers were compared against this claim; none were identified as refuting it.
