Temporal superposition and feature geometry of RNNs under memory demands

ICLR 2026 Conference Submission, Anonymous Authors
Keywords: RNNs, superposition, representational geometry, features, capacity, memory demands
Abstract:

Understanding how populations of neurons represent information is a central challenge across machine learning and neuroscience. Recent work in both fields has begun to characterize the representational geometry and functionality underlying complex distributed activity. For example, artificial neural networks trained on data with more features than neurons compress those features by representing them non-orthogonally, in so-called superposition. However, the effect of time (or memory), an additional capacity-constraining pressure, on underlying representational geometry in recurrent models is not well understood. Here, we study how memory demands affect representational geometry in recurrent neural networks (RNNs), introducing the concept of temporal superposition. We develop a theoretical framework in RNNs with linear recurrence trained on a delayed serial recall task to better understand how properties of the data, task demands, and network dimensionality lead to different representational strategies, and show that these insights generalize to nonlinear RNNs. Through this, we identify an effectively linear, dense regime and a sparse regime where RNNs utilize an interference-free space, characterized by a phase transition in the angular distribution of features and a decrease in spectral radius. Finally, we analyze the interaction of spatial and temporal superposition to observe how RNNs mediate different representational tradeoffs. Overall, our work offers a mechanistic, geometric explanation of representational strategies RNNs learn, how they depend on capacity and task demands, and why.
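To make the setup concrete, the following is a minimal sketch of the kind of model and task the abstract describes: a linear-recurrence RNN trained on a k-delay serial recall task. It is illustrative only and not the authors' code; the sizes, one-hot features, optimizer, class name, and function name are all assumptions made for this example.

```python
# Minimal illustrative sketch (assumed configuration, not the paper's code):
# a linear-recurrence RNN trained on a k-delay serial recall task.
import torch

torch.manual_seed(0)
N_FEAT, N_HIDDEN, DELAY, SEQ_LEN = 12, 8, 3, 20   # assumed toy sizes

class LinearRecurrenceRNN(torch.nn.Module):
    """h_t = W h_{t-1} + U x_t, with linear readout y_t = R h_t."""
    def __init__(self):
        super().__init__()
        self.U = torch.nn.Linear(N_FEAT, N_HIDDEN, bias=False)    # input embedding
        self.W = torch.nn.Linear(N_HIDDEN, N_HIDDEN, bias=False)  # linear recurrence
        self.R = torch.nn.Linear(N_HIDDEN, N_FEAT, bias=False)    # readout

    def forward(self, x):                          # x: (batch, time, N_FEAT)
        h = torch.zeros(x.shape[0], N_HIDDEN)
        outputs = []
        for t in range(x.shape[1]):
            h = self.W(h) + self.U(x[:, t])        # linear recurrence step
            outputs.append(self.R(h))
        return torch.stack(outputs, dim=1)

def delayed_recall_batch(batch_size=64):
    """Random one-hot feature at each step; target is the feature shown DELAY steps earlier."""
    idx = torch.randint(N_FEAT, (batch_size, SEQ_LEN))
    x = torch.nn.functional.one_hot(idx, N_FEAT).float()
    y = torch.roll(x, shifts=DELAY, dims=1)
    y[:, :DELAY] = 0.0                             # no valid target before the first delay
    return x, y

model = LinearRecurrenceRNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(2000):
    x, y = delayed_recall_batch()
    loss = ((model(x) - y) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print("final MSE:", float(loss))
```

A linear recurrence keeps the hidden state an explicit sum of delayed, linearly transformed inputs, which is what makes the geometry of stored feature directions analytically tractable before generalizing to nonlinear RNNs.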

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces temporal superposition as a framework for understanding how recurrent neural networks represent multiple features under memory constraints, focusing on delayed serial recall tasks. It resides in the 'Geometric Properties of Memory Representations' leaf, which contains six papers examining manifold structure and spatial organization of memory codes. This leaf sits within the broader 'Representational Geometry and Memory Organization' branch, indicating a moderately populated research direction. The sibling papers address related geometric questions—balanced memory structures, naturalistic object geometry, and manifold connectivity—suggesting the paper enters an active but not overcrowded subfield where geometric analysis of memory is an established concern.

The taxonomy reveals neighboring research directions that contextualize this work. The adjacent 'Memory Capacity and Information Storage' leaf (six papers) focuses on theoretical bounds rather than geometric structure, while 'Learning Dynamics and Representational Development' (six papers across two leaves) examines how representations evolve during training rather than their static properties. The 'Task-Specific Dynamics' branch includes a 'Working Memory Tasks' leaf (five papers) studying similar cognitive paradigms but emphasizing task performance over geometric principles. The paper's focus on geometric organization under temporal constraints bridges these areas, connecting capacity theory with representational structure in a way that distinguishes it from purely capacity-focused or purely task-driven analyses.

Across the seventeen candidate papers examined, each of the three claimed contributions faces at least one potentially refuting match. The temporal superposition concept (ten candidates examined, one refutable) appears to have some precedent within the limited search scope, though the remaining nine candidates did not clearly refute it. The theoretical framework with loss decomposition (five candidates, one refutable) and the identification of interference-free regimes (two candidates, one refutable) each face one potentially overlapping prior work among the small candidate pools examined. These statistics suggest that while the core ideas have some grounding in existing literature, the specific formulation and integration may offer incremental advances. The limited search scope (seventeen candidates in total) means these assessments reflect top semantic matches rather than exhaustive coverage.

Based on the constrained literature search, the work appears to synthesize existing geometric and capacity concerns into a unified temporal framework. The taxonomy position indicates a moderately active research area with clear boundaries separating geometric analysis from learning dynamics and architectural design. The contribution-level statistics suggest partial novelty: each major claim encounters at least one potentially overlapping candidate among the limited pool examined, but substantial portions of the candidate sets do not clearly refute the contributions. This pattern is consistent with incremental theoretical refinement rather than a foundational shift, though the restricted search scope limits definitive conclusions about the work's broader originality.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 17
Refutable Papers: 3

Research Landscape Overview

Core task: representational geometry in recurrent neural networks under memory constraints. This field examines how RNNs organize and structure internal representations when faced with limited memory resources, spanning questions about geometric properties of neural codes, learning dynamics that shape these representations, architectural innovations for memory management, task-specific adaptations, and underlying theoretical principles.

The taxonomy reflects five major branches: Representational Geometry and Memory Organization explores how memory states are geometrically arranged and what structural properties emerge (e.g., Naturalistic Object Geometry[2], Balanced Memory Geometry[23]); Learning Dynamics and Representational Development investigates how training shapes these geometries over time (e.g., Learning Dynamics Geometry[3]); Architecture Design and Memory Mechanisms focuses on novel gating structures and memory augmentation strategies (e.g., Memory Gated Recurrent[47], Hebbian Memory Augmented[25]); Task-Specific Dynamics examines how different cognitive demands—navigation, sequence processing, working memory—drive distinct representational solutions (e.g., Prefrontal Working Memory[4], Unified Working Memory[7]); and Theoretical Foundations provides capacity analyses and computational principles (e.g., Matrix Representation Capacity[22], Capacity and Trainability[15]).

Several active lines of work reveal key trade-offs and open questions. One cluster investigates how networks balance memory capacity with geometric structure: some studies emphasize controlling solution degeneracy and maintaining balanced representations (Solution Degeneracy Control[1], Memory Geometry Control[5]), while others explore how oscillatory dynamics or phase-based codes can expand representational capacity (Oscillatory Representational Geometry[19]). Another thread examines the interplay between task demands and emergent geometry, asking whether memory constraints force networks into low-dimensional manifolds or enable richer, task-relevant structures (Task Relevant Manifolds[12], Activity Manifolds Connectivity[24]).

Temporal Superposition RNNs[0] sits within the Geometric Properties of Memory Representations cluster, closely aligned with works like Balanced Memory Geometry[23] that study how networks organize overlapping temporal information in constrained state spaces. Compared to Memory Geometry Control[5], which focuses on explicit geometric constraints during training, Temporal Superposition RNNs[0] emphasizes the natural emergence of superposed codes under memory pressure, offering a complementary perspective on how recurrent architectures handle multiple simultaneous memory demands.

Claimed Contributions

Concept of temporal superposition in RNNs

The authors introduce temporal superposition as a novel form of representational compression in recurrent neural networks that arises from memory demands. Unlike spatial superposition (compressing more input features than neurons), temporal superposition occurs when features must be maintained across more delay steps than the hidden-state dimensionality can accommodate orthogonally, forcing the network to represent features non-orthogonally across temporal positions.

10 retrieved papers
Can Refute
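One way to make this notion measurable, under our reading of the contribution above rather than as the authors' own metric, is to examine the directions along which a feature is held after successive recurrence steps: for a linear-recurrence RNN with recurrence W and input embedding U, feature i delayed by j steps lives along (W^j U) e_i. When the number of feature-delay slots exceeds the hidden dimension, these directions cannot all be orthogonal. The probe below is a hypothetical sketch; the function name, sizes, and random matrices are assumptions.

```python
# Hypothetical probe for quantifying temporal superposition: mean off-diagonal
# |cosine similarity| among delayed feature directions (W^j U) e_i.
import numpy as np

def temporal_superposition_score(W, U, max_delay):
    """Return (mean off-diagonal |cosine|, number of feature-delay slots)."""
    dirs = np.concatenate(
        [np.linalg.matrix_power(W, j) @ U for j in range(max_delay + 1)], axis=1
    )
    dirs = dirs / np.linalg.norm(dirs, axis=0, keepdims=True)   # unit columns
    gram = np.abs(dirs.T @ dirs)
    np.fill_diagonal(gram, 0.0)
    n = gram.shape[0]
    return gram.sum() / (n * (n - 1)), n

# Example with random (untrained) matrices; in practice W and U would come from
# a trained RNN, e.g., the sketch given after the abstract.
rng = np.random.default_rng(0)
n_hidden, n_feat, max_delay = 8, 12, 3
W = 0.5 * rng.standard_normal((n_hidden, n_hidden)) / np.sqrt(n_hidden)
U = rng.standard_normal((n_hidden, n_feat)) / np.sqrt(n_hidden)
score, n_slots = temporal_superposition_score(W, U, max_delay)
print(f"{n_slots} feature-delay slots in {n_hidden} dimensions; mean |cos| = {score:.3f}")
```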
Theoretical framework with loss decomposition

The authors derive an analytical expression for the loss on a k-delay task that decomposes into four interpretable terms: task benefit, mean correction, projection interference cost, and composition interference. This decomposition explains the geometric strategies employed by RNNs and how data properties and network dimensionality interact with memory demands.

5 retrieved papers
Can Refute
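The decomposition itself is not reproduced in this report. As a hedged illustration of where such terms can arise, the expansion below takes a generic k-delay mean-squared loss for a linear-recurrence RNN (state h_t = W h_{t-1} + U x_t, linear readout R, zero initial state, inputs assumed zero-mean and uncorrelated across time so that cross terms vanish in expectation) and separates a term for recovering the k-delayed feature from interference contributed by other delays. The paper's four-term split (task benefit, mean correction, projection interference cost, composition interference) would refine this grouping; for instance, a mean-correction term appears once inputs are not zero-mean.

```latex
% Generic illustration, not the paper's exact expression. Assumptions: zero
% initial state, inputs x_t zero-mean and uncorrelated across time.
\begin{align*}
h_t &= \sum_{j \ge 0} W^{j} U\, x_{t-j}, \qquad \hat{y}_t = R\, h_t, \qquad y_t = x_{t-k},\\
\mathcal{L}
  &= \mathbb{E}\,\big\|\hat{y}_t - y_t\big\|^2
   = \underbrace{\mathbb{E}\,\big\|(R W^{k} U - I)\, x_{t-k}\big\|^2}_{\text{recovering the $k$-delayed feature}}
   \;+\; \underbrace{\sum_{j \ne k} \mathbb{E}\,\big\|R W^{j} U\, x_{t-j}\big\|^2}_{\text{interference from other delays}}.
\end{align*}
```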
Identification of interference-free space and phase transition

The authors identify that RNNs with nonlinear readouts can exploit an interference-free space (the half-space opposite the readout direction) to pack intermediate feature directions without projection interference. They characterize a phase transition between dense and sparse regimes marked by changes in angular distribution of features and spectral radius, with nonlinear RNNs implementing sharp forgetting by fully exploiting this space.

2 retrieved papers
Can Refute
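A small numerical illustration of the half-space idea described above, assuming a scalar ReLU readout along a single direction r (the setup, names, and constants are invented for this sketch and are not the paper's): directions with negative projection onto r read out as exactly zero, so features parked there contribute no projection interference at time steps where the correct output is zero, whereas a purely linear readout would leak them into the output.

```python
# Illustrative sketch (assumed setup): features parked in the half-space opposite
# the readout direction are silent under a ReLU readout, but not under a linear one.
import numpy as np

rng = np.random.default_rng(0)
dim = 16
r = rng.standard_normal(dim)
r /= np.linalg.norm(r)                              # readout direction

def linear_readout(h):
    return float(r @ h)

def relu_readout(h):
    return float(max(r @ h, 0.0))

# Two stored-but-not-yet-due features: one parked opposite r, one with positive projection.
parked = rng.standard_normal(dim)
parked -= (max(float(r @ parked), 0.0) + 0.3) * r   # force r @ parked < 0
leaky = rng.standard_normal(dim)
leaky += (abs(float(r @ leaky)) + 0.3) * r          # force r @ leaky > 0

for name, h in [("parked (opposite half-space)", parked),
                ("leaky (positive projection)", leaky)]:
    # At a step where the correct output is zero, any nonzero readout is interference.
    print(f"{name}: linear readout = {linear_readout(h):+.3f}, "
          f"ReLU readout = {relu_readout(h):+.3f}")
```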

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

The three claimed contributions analyzed, each described in the Claimed Contributions section above, are:

Contribution 1: Concept of temporal superposition in RNNs
Contribution 2: Theoretical framework with loss decomposition
Contribution 3: Identification of interference-free space and phase transition
