Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: chain of continuous thought, training dynamics, reasoning, superposition
Abstract:

Previous work shows that the chain of continuous thought (continuous CoT) improves the reasoning capability of large language models (LLMs) by enabling implicit parallel thinking, and subsequent work provided theoretical insight by showing that a two-layer transformer equipped with continuous CoT can efficiently solve directed graph reachability by maintaining a superposition of multiple reasoning traces in the continuous thought. However, it remains unclear how the superposition mechanism is naturally learned through gradient-based training. To fill this gap, we theoretically analyze the training dynamics of a simplified two-layer transformer on the directed graph reachability problem to unveil how the superposition mechanism emerges across two training stages: (i) a thought-generation stage that autoregressively expands the continuous thought, and (ii) a prediction stage that converts the thought into the final answer. Our analysis reveals that during training with continuous thought, the index-matching logit, a quantity that reflects the strength of the model's local search ability, first increases and then remains bounded under mild assumptions. The bounded index-matching logit effectively balances exploration and exploitation during reasoning: the model exploits local problem structure to identify plausible search traces, and assigns comparable weights to multiple such traces to explore when it is uncertain which one is correct, which results in superposition. Our experimental results tracking the growth of the logits further validate our theory.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper contributes a theoretical analysis of how continuous chain-of-thought mechanisms emerge during gradient-based training in two-layer transformers solving directed graph reachability. It resides in the 'Training Dynamics and Convergence Analysis' leaf under 'Theoretical Foundations of CoT Mechanisms,' alongside three sibling papers examining gradient dynamics and convergence properties. This leaf represents a moderately populated research direction within the broader taxonomy of 50 papers across 36 topics, indicating focused but not overcrowded attention to training dynamics questions in continuous CoT.

The taxonomy reveals neighboring theoretical branches including 'Expressivity and Computational Power' (four papers proving what transformers can solve with CoT) and 'Superposition and Parallel Reasoning Theory' (one paper on maintaining multiple traces). The paper bridges these areas by explaining how superposition—previously shown to enable parallel reasoning—actually emerges through training. Nearby practical branches like 'Continuous CoT Architectures and Training' (three papers on model implementations) and 'Latent-Variable CoT Training' (one paper on unsupervised optimization) address related but distinct questions about architecture design and training objectives rather than gradient dynamics.

Among the 21 candidates examined across the three contributions, no clear refutations emerged. For the core contribution on training dynamics, 10 candidates were analyzed and none provided overlapping prior work; for the bounded index-matching-logit behavior, 1 candidate was examined without refutation; and for the superposition-emergence explanation, 10 candidates were reviewed, again with no direct overlap. This limited search scope, focused on top semantic matches and citations, suggests that the specific combination of continuous CoT, training dynamics, and superposition emergence may occupy relatively unexplored theoretical territory, though the analysis cannot claim exhaustive coverage of the gradient dynamics literature.

Based on examination of 21 semantically related papers, the work appears to address a gap between expressivity proofs and empirical continuous CoT implementations by analyzing how training naturally discovers superposition mechanisms. The bounded search scope means potentially relevant work in broader optimization theory or neural tangent kernel analyses may exist outside the examined candidates. The taxonomy positioning and sibling paper analysis suggest this represents a natural theoretical extension within an active but not saturated research direction.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 21
Refutable Papers: 0

Research Landscape Overview

Core task: training dynamics of chain of continuous thought in transformers. The field has evolved from early discrete prompting methods like Chain of Thought Prompting[1] toward a richer landscape that spans multiple paradigms. At the top level, the taxonomy distinguishes Continuous Latent Reasoning Paradigms, where models learn implicit reasoning steps in hidden representations, from Discrete CoT Prompting and Inference Methods, which rely on explicit token sequences. Theoretical Foundations of CoT Mechanisms investigates the expressive power and convergence properties underlying these approaches, while CoT Training and Optimization Methods addresses how to effectively learn reasoning behaviors. Architectural Innovations for Reasoning explores modifications such as looped or recurrent structures (e.g., Looped Transformers[4]), and World Models and Planning-Based Reasoning connects transformers to decision-making frameworks like Reasoning as Planning[3]. Meanwhile, Transformer Learning Dynamics and Mechanisms examines gradient flow, feature evolution, and emergent behaviors during training, and Specialized Transformer Applications adapts these ideas to domains ranging from vision to reinforcement learning.

Within this landscape, a particularly active line of work focuses on understanding how transformers internalize multi-step reasoning during training. Superposition Training Dynamics[0] sits squarely in the Theoretical Foundations branch under Training Dynamics and Convergence Analysis, examining how reasoning emerges through superposed representations over the course of optimization. This contrasts with neighboring studies like Nonlinear Transformers CoT[28], which explores architectural nonlinearities to enhance reasoning capacity, and Kinetics of Reasoning[49], which applies statistical-physics perspectives to characterize the evolution of reasoning states.

Other closely related efforts include Multi Step Gradient Descent[37], which models iterative refinement processes, and works on continuous latent reasoning such as Continuous Latent Reasoning[2] and Scaling Latent Reasoning[33], which emphasize learning implicit thought chains without discrete tokens. Together, these studies reveal ongoing tensions between discrete and continuous representations, between architectural depth and recurrence, and between training objectives and the reasoning capabilities that emerge from them.

Claimed Contributions

Theoretical analysis of training dynamics for continuous chain-of-thought

The authors provide a theoretical analysis of how gradient-based training naturally leads to the superposition mechanism in continuous chain-of-thought models. They analyze two training stages: thought generation and prediction, revealing how the model learns to maintain multiple reasoning traces in parallel.

10 retrieved papers
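To make the "multiple reasoning traces in parallel" idea concrete, the following toy sketch (our illustration with one-hot node embeddings, not the paper's actual construction or training setup) stores a BFS-style frontier of reachable vertices as a single superposed vector:

```python
# Toy model: a "continuous thought" as a superposition of node embeddings
# while the frontier of reachable vertices expands one hop per step.
# The graph `edges` and one-hot embeddings are illustrative assumptions.
import numpy as np

n_nodes = 6
emb = np.eye(n_nodes)  # one-hot node embeddings (exactly orthogonal)
edges = {0: [1, 2], 1: [3], 2: [4], 3: [5], 4: [], 5: []}

def expand(frontier):
    """One reasoning step: keep the frontier and add all one-hop successors."""
    new = set(frontier)
    for v in frontier:
        new.update(edges[v])
    return new

frontier = {0}
for _ in range(2):  # two "thought" steps starting from node 0
    frontier = expand(frontier)

# Continuous thought = uniform superposition over the frontier's embeddings.
thought = emb[sorted(frontier)].mean(axis=0)
scores = emb @ thought
# With orthogonal embeddings, frontier members score 1/|frontier| and all
# other nodes score 0, so every partial search trace coexists in one vector.
print(sorted(frontier), np.round(scores, 3))
```

Under these idealized orthogonal embeddings the frontier is exactly recoverable from the thought vector by thresholding the scores; with learned, non-orthogonal embeddings the readout is only approximate, which is part of what the paper's analysis must handle.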
Discovery of bounded index-matching logit behavior

The authors discover that the index-matching logit, which quantifies local search capability, grows initially but remains bounded during training with continuous CoT. This bounded behavior contrasts with unbounded logit growth in discrete settings and enables effective exploration-exploitation balance.

1 retrieved paper
Explanation of superposition emergence through bounded logits

The authors explain how bounded index-matching logits lead to superposition by balancing exploration and exploitation. When logits remain bounded, the model assigns comparable weights to multiple plausible reasoning paths rather than over-committing to a single path, naturally producing the superposition mechanism.

10 retrieved papers
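The exploration-exploitation effect of the logit bound can be sketched numerically. In this toy example (our illustration, not the paper's proof), two outgoing edges are almost equally consistent with the current frontier (match strengths 1.0 and 0.9) and one edge is mismatched (0.0); the scalar `c` stands in for the index-matching logit magnitude:

```python
# Bounded vs. large logits: softmax attention over candidate edges.
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

# Match strengths of three candidate edges (illustrative assumption):
# two nearly tied plausible edges and one mismatch.
match = np.array([1.0, 0.9, 0.0])

# c plays the role of the index-matching logit scale.
weights = {c: softmax(c * match) for c in (2.0, 50.0)}
print({c: np.round(w, 3) for c, w in weights.items()})
# c = 2 (bounded):  weights ~ [0.512, 0.419, 0.069] -- both plausible
#   edges keep comparable mass, so their traces survive in superposition.
# c = 50 (unbounded growth): weights ~ [0.993, 0.007, 0.000] -- attention
#   collapses onto a single path, forfeiting parallel exploration.
```

This is the qualitative picture behind the claimed contribution: keeping the index-matching logit bounded keeps the softmax soft, which is what distributes weight across multiple plausible reasoning traces.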

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution
