Emergent Discrete Controller Modules for Symbolic Planning in Transformers

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Transformers, symbolic planning, discrete controller modules, length generalization, algorithmic reasoning
Abstract:

Transformers struggle with tasks that require symbolic planning loops, variable updates, and conditional branching, especially under length extrapolation. We introduce discrete controller modules that insert a small set of program primitives (ASSIGN, ADD, COMPARE, BRANCH) into Transformer blocks via a Gumbel–Softmax selector over operations and a compact program state of registers, flags, and optional memory. We prove that the augmented model can simulate any bounded-step program by mapping each primitive step to one controller step, and we bound the deviation of relaxed execution from its discrete trace by O(τ + κ⁻¹) (selection temperature τ, comparison sharpness κ). Empirically, the controller-augmented Transformer achieves strong length generalization on algorithmic benchmarks (Sorting, Sum-of-List, BFS), improving longest-length accuracy by up to 20–40 points over strong baselines, and yields consistent gains on symbolic QA (DROP) and program-synthesis-style tasks (RobustFill) with reduced compositionality drop-off. The learned execution is interpretable: operation traces align with ground truth, register roles are linearly decodable, and targeted knockouts cause localized accuracy losses. The approach adds only ~5–7% FLOPs and can be applied sparsely (every p-th layer).

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly paper search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces discrete controller modules that augment Transformers with program primitives (ASSIGN, ADD, COMPARE, BRANCH) for algorithmic reasoning tasks requiring symbolic planning loops and length extrapolation. It sits within the Transformer-Based Algorithmic Learning leaf of the taxonomy, which contains only two papers total. This is a relatively sparse research direction compared to more crowded areas like Task and Motion Planning (ten papers across four subcategories) or LLM-Based Neuro-Symbolic Planning (four papers). The sparsity suggests that explicit modular controller designs for Transformers remain underexplored, with most work either pursuing end-to-end neural learning or full neuro-symbolic integration rather than discrete programmatic augmentation.

The taxonomy reveals several neighboring approaches that address length generalization differently. Memory-Augmented Neural Architectures (two papers) explore external memory mechanisms without explicit program primitives, while Reinforcement Learning with Symbolic Guidance (three papers) incorporates symbolic planning through policy learning rather than architectural modules. The Programmatic and Code-Based Neuro-Symbolic Approaches branch (three papers) generates executable programs but typically as outputs rather than integrated execution modules. The paper's approach diverges by embedding discrete controllers directly into Transformer blocks, creating a hybrid that maintains differentiability while enforcing programmatic structure—a design choice that distinguishes it from both pure neural architectures and code-generation frameworks.

Among twenty candidates examined across three contributions, none were found to clearly refute the core claims. The discrete controller module design examined ten candidates with zero refutable matches, suggesting limited prior work on this specific architectural pattern. The formal expressivity result for bounded imperative programs examined three candidates without finding overlapping theoretical analysis. The tight integration mechanism examined seven candidates, again with no clear precedents. This pattern indicates that while the broader problem of length generalization in algorithmic reasoning is well-studied, the specific combination of discrete program primitives, Gumbel-Softmax selection, and register-based state management within Transformer blocks appears novel within the examined literature scope.

The analysis covers top-twenty semantic matches and does not constitute an exhaustive survey of all Transformer augmentation methods or program synthesis techniques. The taxonomy structure suggests the paper occupies a relatively unexplored niche between pure neural algorithmic learning and full neuro-symbolic integration, though the limited search scope means potentially relevant work in adjacent areas (e.g., differentiable programming, neural module networks) may not have been captured. The contribution-level statistics consistently show no clear refutations, but this reflects the examined candidate set rather than definitive novelty across all related literature.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 20
Refutable Papers: 0

Research Landscape Overview

Core task: length generalization in algorithmic reasoning with symbolic planning. This field investigates how systems can learn to solve reasoning and planning problems that extend beyond the training distribution in terms of sequence length or problem complexity.

The taxonomy reveals several complementary research directions. Neural-Symbolic Integration for Planning explores hybrid architectures that combine learned representations with symbolic reasoning modules, as seen in works like Inner Monologue[3] and Neuro-Symbolic Skills[7]. Symbolic Planning and Search Methods focus on classical planning techniques and their extensions, including approaches like Beyond A-Star[12] that push traditional search boundaries. Neural Architectures for Algorithmic Reasoning examines how neural models—particularly transformers—can be designed or trained to capture algorithmic structure, with studies such as Algorithmic Generalization Patterns[32] analyzing what enables length generalization. Symbolic Knowledge Representation and Reasoning addresses how to encode domain knowledge and constraints, exemplified by Symbolic Constraints LLMs[24]. Finally, Generalization and Abstraction in Planning investigates hierarchical decomposition and abstraction mechanisms that facilitate transfer across problem scales, including works like Sketch-Plan-Generalize[45].

A central tension across these branches concerns the trade-off between end-to-end neural flexibility and the compositional guarantees of symbolic methods. Many recent efforts blend differentiable modules with discrete planning steps to achieve both data efficiency and systematic generalization. Discrete Controller Modules[0] sits within the Neural Architectures for Algorithmic Reasoning branch, specifically under Transformer-Based Algorithmic Learning, and emphasizes modular controller designs that can compose learned primitives for longer horizons.
This contrasts with purely symbolic approaches like Symbolic Generalization Planning[2], which rely on explicit operator definitions, and with more integrated neuro-symbolic frameworks such as Neuro-Symbolic Relaxation[5] that soften discrete structures for gradient-based learning. By focusing on discrete modular components, Discrete Controller Modules[0] aims to preserve interpretability and compositionality while leveraging neural pattern recognition, positioning it as a middle ground between fully learned and fully symbolic paradigms.

Claimed Contributions

Discrete controller modules with program primitives for Transformers

The authors propose augmenting Transformer blocks with learnable discrete controllers that execute symbolic operations (ASSIGN, ADD, COMPARE, BRANCH) on a latent program state consisting of registers, flags, and optional memory. The controller uses Gumbel-Softmax to select operations differentiably while maintaining interpretable, step-wise execution traces.

10 retrieved papers

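The selection mechanism described in this contribution can be made concrete with a minimal NumPy sketch of a straight-through Gumbel–Softmax choice over the four primitives, applied to a small register file. All names, shapes, and the register layout below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
OPS = ["ASSIGN", "ADD", "COMPARE", "BRANCH"]

def gumbel_softmax(logits, tau=0.5):
    """Relaxed one-hot sample over operations; smaller tau approaches argmax."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0, 1) noise
    z = (logits + g) / tau
    y_soft = np.exp(z - z.max())
    y_soft /= y_soft.sum()
    # Straight-through trick: a hard one-hot is used in the forward pass,
    # while gradients would flow through y_soft in a real (autograd) setup.
    y_hard = np.eye(len(logits))[y_soft.argmax()]
    return y_hard, y_soft

def step(state, logits, a, b, tau=0.5):
    """One controller step: select a primitive and apply it to registers a, b."""
    hard, _ = gumbel_softmax(logits, tau)
    op = OPS[int(hard.argmax())]
    regs, flag = state
    if op == "ASSIGN":
        regs[a] = regs[b]
    elif op == "ADD":
        regs[a] = regs[a] + regs[b]
    elif op == "COMPARE":
        flag = regs[a] > regs[b]  # relaxed version: sigmoid(kappa * (r_a - r_b))
    elif op == "BRANCH":
        regs[a] = regs[a] if flag else regs[b]
    return (regs, flag), op

state = ({0: 2.0, 1: 3.0}, False)
# Logits strongly biased toward ADD so the trace is deterministic here.
state, op = step(state, np.array([0.0, 50.0, 0.0, 0.0]), a=0, b=1)
print(op, state[0][0])  # ADD fires and r0 becomes 5.0
```

The interpretable trace claimed in the paper corresponds to logging `op` at each step; annealing `tau` toward zero makes the relaxed selection converge to the hard trace.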
Formal expressivity result for bounded imperative programs

The authors provide a theoretical guarantee showing that their controller-augmented Transformer can exactly simulate any program in a bounded imperative class with at most K primitive steps and loop bound B, requiring depth O(K+B). They also bound the approximation error under finite temperature and comparison sharpness parameters.

3 retrieved papers

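The simulation argument behind this claim maps each executed primitive to one controller step, so depth grows linearly in the primitive count and the loop bound. The toy interpreter below (a sketch, not the paper's construction) counts controller steps for a Sum-of-List program whose loop iterates at most B times:

```python
# Toy bounded-imperative execution: one controller step per primitive.
# Illustrates why depth is linear in the primitive count K and loop bound B.

def run_sum_of_list(xs, B):
    """Sum xs (len(xs) <= B) using only ASSIGN/ADD/COMPARE/BRANCH primitives."""
    assert len(xs) <= B
    acc, i = 0, 0
    steps = 2                      # ASSIGN acc <- 0; ASSIGN i <- 0
    while True:
        flag = i < len(xs)         # COMPARE i, len(xs)
        steps += 1
        steps += 1                 # BRANCH on flag (taken or fall-through)
        if not flag:
            break
        acc = acc + xs[i]          # ADD acc, xs[i]
        i = i + 1                  # ADD i, 1
        steps += 2
    return acc, steps

xs = [3, 1, 4, 1, 5]
acc, steps = run_sum_of_list(xs, B=8)
print(acc, steps)                  # 14 24
# Four primitives per iteration plus four for setup/exit: steps = 4n + 4,
# so the controller depth needed is O(K + B) for this program class.
assert steps == 4 * len(xs) + 4
```

In the paper's terms, each counted step would occupy one controller invocation, so a program with at most K primitive steps and loop bound B fits in O(K + B) depth.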
Tight integration of controllers with attention and feed-forward layers

The authors design a method to integrate the discrete controller directly into the Transformer architecture through state-to-token and token-to-state projections, allowing the program state to interact with the token stream via residual connections without requiring external memory modules or post-processing steps.

7 retrieved papers
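The read/write pathway this contribution describes can be sketched as two linear projections with residual updates: the program state is broadcast into every token position, and a pooled token summary updates the state. The matrix names, pooling choice, and dimensions below are illustrative assumptions, not the paper's parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_state, seq_len = 16, 8, 10

# Hypothetical projections between the program state and the token stream.
W_s2t = rng.normal(scale=0.02, size=(d_state, d_model))  # state -> token write
W_t2s = rng.normal(scale=0.02, size=(d_model, d_state))  # token -> state read

def controller_io(h, s):
    """Residual exchange inside one Transformer block.

    h: (seq_len, d_model) token activations
    s: (d_state,) program state (registers/flags packed into one vector)
    """
    h_out = h + s @ W_s2t                # broadcast state into all positions
    s_out = s + h.mean(axis=0) @ W_t2s   # mean-pooled token summary updates state
    return h_out, s_out

h = rng.normal(size=(seq_len, d_model))
s = np.zeros(d_state)
h, s = controller_io(h, s)
print(h.shape, s.shape)  # (10, 16) (8,)
```

Because both updates are residual, the exchange composes with the existing attention and feed-forward sublayers without external memory modules or post-processing, matching the integration style the contribution claims.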

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution 1: Discrete controller modules with program primitives for Transformers

Contribution 2: Formal expressivity result for bounded imperative programs

Contribution 3: Tight integration of controllers with attention and feed-forward layers