The Serial Scaling Hypothesis

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Scaling Law · Computational Complexity
Abstract:

While machine learning has advanced through massive parallelization, we identify a critical blind spot: some problems are fundamentally sequential. These "inherently serial" problems, from mathematical reasoning to physical simulations to sequential decision-making, require sequentially dependent computational steps that cannot be efficiently parallelized. We formalize this distinction in complexity theory and demonstrate that current parallel-centric architectures face fundamental limitations on such tasks. We then show, for the first time, that diffusion models, despite their sequential nature, are incapable of solving inherently serial problems. We argue that recognizing the serial nature of computation has profound implications for machine learning, model design, and hardware development.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper formalizes the distinction between inherently serial and parallelizable problems in machine learning, proposing the Serial Scaling Hypothesis and analyzing diffusion models' limitations on sequential tasks. It resides in the 'Inherently Serial Problem Characterization' leaf, which contains only two papers total within the 'Theoretical Foundations and Complexity Analysis' branch. This sparse positioning suggests the paper addresses a relatively underexplored theoretical direction within a 50-paper taxonomy spanning 21 leaf nodes. The leaf's scope explicitly focuses on formalizing problems requiring sequential dependencies that resist parallelization, distinguishing it from implementation-focused branches.

The taxonomy reveals neighboring work in 'Convergence Bounds and Optimization Limits' and 'Expressiveness and Representational Capacity' within the same theoretical branch, alongside a substantial 'Parallelization Techniques for Sequential Tasks' branch containing methods like Lookahead Decoding and Parallel Thinking Reinforcement. The paper's theoretical stance contrasts with these parallelization attempts: while neighboring leaves examine optimization complexity or architectural expressiveness, this work argues certain problems fundamentally resist the parallelization strategies explored in adjacent branches. The taxonomy's structure highlights this tension between theoretical impossibility results and practical parallelization efforts.

Among the 30 candidates examined across the three contributions, none yielded a clear refutation. The Serial Scaling Hypothesis was checked against 10 candidates with zero refutable matches, as were the complexity-theoretic characterization of diffusion models and the theoretical framework for inherently serial problems. The absence of overlapping prior work within this limited search scope suggests these specific formalizations, particularly the SSH and the diffusion-model analysis, may represent novel theoretical angles. However, the search scale is modest: 30 candidates from semantic search cannot comprehensively cover all relevant complexity-theory or diffusion-model literature.

The analysis indicates apparent novelty within the examined scope, particularly for the SSH and diffusion model characterization, though the 30-candidate search cannot rule out relevant prior work in broader complexity theory or generative modeling literature. The sparse taxonomy leaf and zero refutations across contributions suggest the paper occupies a relatively unexplored theoretical niche, but definitive novelty assessment would require examining complexity theory venues and diffusion model theory beyond the top-30 semantic matches analyzed here.

Taxonomy

Core-task taxonomy papers: 50
Claimed contributions: 3
Contribution candidate papers compared: 30
Refutable papers: 0

Research Landscape Overview

Core task: understanding the computational limitations of parallel versus serial computation in machine learning. The field structure reflects a multi-layered investigation spanning theoretical foundations, practical implementation strategies, and application domains.

At the highest level, the taxonomy distinguishes between theoretical complexity analysis that characterizes which problems resist parallelization, techniques for parallelizing inherently sequential tasks, distributed training frameworks that coordinate large-scale optimization, hardware acceleration strategies that exploit specialized architectures, application-specific studies across diverse domains, and neural architecture design considerations that influence training dynamics. Works such as Pathways[11] and Demystifying Distributed Deep Learning[21] illustrate how distributed frameworks address scalability, while hardware-focused studies like Capacitor-Reconfigured Analog CIM[9] and Matrix Multiplication CUDA[31] explore acceleration at the silicon and system levels. Meanwhile, theoretical branches examine fundamental limits, asking which computational patterns can be efficiently parallelized and which remain bottlenecked by serial dependencies.

A particularly active line of inquiry centers on reasoning and sequential decision-making, where the tension between parallel and serial computation becomes acute. Chain of Thought[3] and related works such as Lookahead Decoding[5] and Parallel Thinking Reinforcement[4] explore whether multi-step reasoning can be accelerated through parallelism or whether it fundamentally requires sequential token generation. The Serial Scaling Hypothesis[0] sits squarely within this debate, examining the inherent serial nature of certain problem classes and proposing that some tasks may resist parallelization due to intrinsic computational structure rather than algorithmic limitations.
This contrasts with Chain of Thought[3], which investigates how sequential reasoning emerges in language models, and with Lookahead Decoding[5], which attempts to mitigate serial bottlenecks through speculative execution. Together, these works highlight an open question: to what extent can we overcome serial dependencies through clever parallelization, and where do fundamental complexity barriers force us to accept sequential computation as unavoidable?

Claimed Contributions

Serial Scaling Hypothesis (SSH)

The authors propose the Serial Scaling Hypothesis, which states that for many important ML problems such as reasoning, decision making, and modeling dynamic systems, increasing parallel computation alone is insufficient and progress requires scaling the amount of serial computation. This hypothesis is grounded in complexity theory and connects theoretical limitations to practical ML challenges.

10 retrieved papers
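As a minimal illustration of the hypothesis (a toy sketch, not the paper's construction), the code below contrasts serial depth with parallel width: iterating a toy update rule t times takes t dependent steps no matter how many workers are available, while extra parallelism only helps across independent inputs. The names `step`, `run_serial`, and `run_parallel_width` are hypothetical.

```python
def step(x):
    # One serial step of a toy recurrence; the next state depends
    # on the previous one, so successive steps cannot be batched.
    return (3 * x + 1) % 2**32

def run_serial(x0, t):
    # t sequentially dependent steps: wall-clock depth is t,
    # regardless of how many parallel workers are available.
    x = x0
    for _ in range(t):
        x = step(x)
    return x

def run_parallel_width(starts, t):
    # Parallel width only helps across *independent* trajectories:
    # each entry could go to its own worker, but every trajectory
    # is still depth-t on its own.
    return [run_serial(x0, t) for x0 in starts]
```

Under the SSH, throwing more workers at `run_parallel_width` raises throughput but never shortens the serial chain inside `run_serial`, which is the quantity the hypothesis says must be scaled.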
Complexity-theoretic characterization of diffusion models

The authors prove that diffusion models with a TC0 backbone remain in TC0 even with infinitely many sampling steps, demonstrating that despite their stepwise structure, diffusion models are incapable of solving inherently serial problems. This is the first formal characterization showing diffusion models do not provide scalable serial computation.

10 retrieved papers
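The structural point at issue can be sketched with a hypothetical toy sampling loop (not the paper's proof or a real model): every sampling step re-applies the same shallow backbone, and the contribution claims that stacking more such steps does not escape TC0.

```python
import numpy as np

def backbone(x, t):
    # Stand-in for a constant-depth (TC0-style) denoiser: one
    # affine map plus a pointwise nonlinearity, i.e. the same
    # shallow circuit at every step. (Hypothetical toy.)
    return np.tanh(0.9 * x + 0.1 * t)

def sample(x_T, num_steps):
    # DDIM-style deterministic sampling: many steps, but each one
    # is the same shallow computation. The contribution's claim is
    # that this kind of iteration stays in TC0 even as num_steps
    # grows without bound.
    x = x_T
    for t in reversed(range(num_steps)):
        x = backbone(x, t / max(num_steps, 1))
    return x
```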
Theoretical framework for inherently serial problems in ML

The authors formalize the distinction between parallel and inherently serial problems using complexity theory (specifically the TC class), identify real-world ML problems that are inherently serial (cellular automata, many-body mechanics, sequential decision-making, mathematical QA), and prove that modern architectures like MLPs, Transformers, SSMs, and diffusion models with TC0 backbones cannot solve general instances of these problems.

10 retrieved papers
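One of the cited example domains, cellular automata, can be made concrete with a short sketch (hypothetical code, not from the paper): the only known general way to predict elementary Rule 110 after t steps is to perform roughly t dependent updates (Rule 110 is Turing-complete, and CA prediction is P-complete in general), which is exactly the flavor of inherent seriality the framework formalizes.

```python
# Rule 110 truth table: neighborhood (left, center, right) -> next cell.
RULE_110 = {(1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 1, (1, 0, 0): 0,
            (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0}

def ca_step(cells):
    # One synchronous update of Rule 110 on a ring of cells.
    n = len(cells)
    return [RULE_110[(cells[i - 1], cells[i], cells[(i + 1) % n])]
            for i in range(n)]

def evolve(cells, t):
    # Each update depends on the full previous state, so the t
    # steps form a serial chain that is not known to parallelize.
    for _ in range(t):
        cells = ca_step(cells)
    return cells
```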

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Serial Scaling Hypothesis (SSH)

The authors propose the Serial Scaling Hypothesis, which states that for many important ML problems such as reasoning, decision making, and modeling dynamic systems, increasing parallel computation alone is insufficient and progress requires scaling the amount of serial computation. This hypothesis is grounded in complexity theory and connects theoretical limitations to practical ML challenges.

Contribution

Complexity-theoretic characterization of diffusion models

The authors prove that diffusion models with a TC0 backbone remain in TC0 even with infinitely many sampling steps, demonstrating that despite their stepwise structure, diffusion models are incapable of solving inherently serial problems. This is the first formal characterization showing diffusion models do not provide scalable serial computation.

Contribution

Theoretical framework for inherently serial problems in ML

The authors formalize the distinction between parallel and inherently serial problems using complexity theory (specifically the TC class), identify real-world ML problems that are inherently serial (cellular automata, many-body mechanics, sequential decision-making, mathematical QA), and prove that modern architectures like MLPs, Transformers, SSMs, and diffusion models with TC0 backbones cannot solve general instances of these problems.

The Serial Scaling Hypothesis | Novelty Validation