Temporal superposition and feature geometry of RNNs under memory demands
Overview
Overall Novelty Assessment
The paper introduces temporal superposition as a framework for understanding how recurrent neural networks represent multiple features under memory constraints, focusing on delayed serial recall tasks. It resides in the 'Geometric Properties of Memory Representations' leaf, which contains six papers examining manifold structure and spatial organization of memory codes. This leaf sits within the broader 'Representational Geometry and Memory Organization' branch, indicating a moderately populated research direction. The sibling papers address related geometric questions—balanced memory structures, naturalistic object geometry, and manifold connectivity—suggesting the paper enters an active but not overcrowded subfield where geometric analysis of memory is an established concern.
The taxonomy reveals neighboring research directions that contextualize this work. The adjacent 'Memory Capacity and Information Storage' leaf (six papers) focuses on theoretical bounds rather than geometric structure, while 'Learning Dynamics and Representational Development' (six papers across two leaves) examines how representations evolve during training rather than their static properties. The 'Task-Specific Dynamics' branch includes a 'Working Memory Tasks' leaf (five papers) studying similar cognitive paradigms but emphasizing task performance over geometric principles. The paper's focus on geometric organization under temporal constraints bridges these areas, connecting capacity theory with representational structure in a way that distinguishes it from purely capacity-focused or purely task-driven analyses.
Across the seventeen candidates examined, each of the three contributions faces some evidence of prior overlap. The temporal superposition concept (ten candidates examined, one potentially refuting) appears to have some precedent within the limited search scope, though the other nine candidates did not clearly refute it. The theoretical framework with its loss decomposition (five candidates, one potentially refuting) and the identification of interference-free regimes (two candidates, one potentially refuting) each face a single potentially overlapping prior work among the small candidate pools examined. These statistics suggest that while the core ideas have some grounding in existing literature, the specific formulation and integration may offer incremental advances. The limited search scope of seventeen total candidates means these assessments reflect top semantic matches rather than exhaustive coverage.
Based on the constrained literature search, the work appears to synthesize existing geometric and capacity concerns into a unified temporal framework. The taxonomy position indicates a moderately active research area with clear boundaries separating geometric analysis from learning dynamics and architectural design. The contribution-level statistics suggest partial novelty: each major claim encounters at least one potentially overlapping candidate among the limited pool examined, but substantial portions of the candidate sets do not clearly refute the contributions. This pattern is consistent with incremental theoretical refinement rather than a foundational shift, though the restricted search scope limits definitive conclusions about the work's broader originality.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce temporal superposition as a novel form of representational compression in recurrent neural networks that arises from memory demands. Unlike spatial superposition (representing more input features than there are neurons), temporal superposition occurs when features must be maintained over time for longer than the hidden-state dimensionality allows, forcing the network to represent features non-orthogonally across temporal positions.
The authors derive an analytical expression for the loss on a k-delay task that decomposes into four interpretable terms: task benefit, mean correction, projection interference cost, and composition interference. This decomposition explains the geometric strategies employed by RNNs and how data properties and network dimensionality interact with memory demands.
The authors identify that RNNs with nonlinear readouts can exploit an interference-free space (the half-space opposite the readout direction) to pack intermediate feature directions without projection interference. They characterize a phase transition between dense and sparse regimes marked by changes in angular distribution of features and spectral radius, with nonlinear RNNs implementing sharp forgetting by fully exploiting this space.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[1] Measuring and controlling solution degeneracy across task-trained recurrent neural networks
[2] Geometry of naturalistic object representations in recurrent neural network models of working memory
[7] Geometry of neural computation unifies working memory and planning
[23] Geometry and dynamics of representations in a precisely balanced memory network related to olfactory cortex
[24] Recurrent neural network models for working memory of continuous variables: activity manifolds, connectivity patterns, and dynamic codes
Contribution Analysis
Detailed comparisons for each claimed contribution
Concept of temporal superposition in RNNs
The authors introduce temporal superposition as a novel form of representational compression in recurrent neural networks that arises from memory demands. Unlike spatial superposition (representing more input features than there are neurons), temporal superposition occurs when features must be maintained over time for longer than the hidden-state dimensionality allows, forcing the network to represent features non-orthogonally across temporal positions.
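To make the counting intuition behind this claim concrete before turning to the compared papers, a minimal numerical sketch is given below. The dimensions, the use of NumPy, and the Welch-bound argument are illustrative assumptions rather than the paper's construction; the sketch only shows that when the memory demand of a k-delay task (k temporal positions times d features) exceeds the hidden dimensionality n, the directions used to store individual feature-position pairs cannot all be orthogonal, which is the sense in which features superpose across temporal positions.

```python
import numpy as np

# Minimal illustration (assumed setup, not the paper's code): a k-delay recall
# task in which the last k inputs, each with d features, must be held in an
# n-dimensional hidden state.
rng = np.random.default_rng(0)
n, d, k = 16, 8, 4            # hidden size, feature dim, delay (assumed values)
slots = k * d                 # feature-position pairs that must coexist in memory

# One storage direction per (temporal position, feature). With slots > n these
# unit vectors cannot be mutually orthogonal, so stored features overlap.
W = rng.standard_normal((n, slots))
W /= np.linalg.norm(W, axis=0)          # unit-norm storage directions
G = W.T @ W                             # pairwise cosines between stored directions
overlap = np.abs(G - np.eye(slots)).max()

# Welch bound: any `slots` unit vectors in R^n have a pairwise |cosine| of at
# least sqrt((slots - n) / (n * (slots - 1))) whenever slots > n.
welch = np.sqrt((slots - n) / (n * (slots - 1)))
print(f"memory demand k*d = {slots} exceeds hidden size n = {n}")
print(f"max pairwise overlap of stored directions: {overlap:.3f} (Welch lower bound {welch:.3f})")
```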
[55] The Computational Role of Complex Representations in RNNs
[57] A temporal convolutional recurrent autoencoder based framework for compressing time series data
[58] Comparative study of state-based neural networks for virtual analog audio effects modeling
[59] HT-STNet: a hierarchical Tucker decomposition and spatio-temporal LSTM network for accurate and efficient shared mobility demand forecasting on sparse data
[60] Temporal superimposed crossover module for effective continuous sign language
[61] Recurrent neural networks for edge intelligence: a survey
[62] Recurrent neural networks with explicit representation of dynamic latent variables can mimic behavioral patterns in a physical inference task
[63] An extended echo state network using Volterra filtering and principal component analysis
[64] BBS-RNN: block-based structure compression with ADMM for RNN on temporal sequence applications
[65] Fostering Event-Predictive Encodings in Recurrent Neural Networks
Theoretical framework with loss decomposition
The authors derive an analytical expression for the loss on a k-delay task that decomposes into four interpretable terms: task benefit, mean correction, projection interference cost, and composition interference. This decomposition explains the geometric strategies employed by RNNs and how data properties and network dimensionality interact with memory demands.
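The paper's analytical expression is not reproduced here; purely as a notational schematic of the four named terms (the placeholder symbols and the negative sign on the benefit term are assumptions, on the reading that a benefit reduces the loss while the interference terms increase it), the claimed decomposition has the form:

```latex
% Schematic only: placeholder symbols, not the paper's derived expressions.
\mathcal{L}_{k} \;=\;
    \underbrace{-\,B_{\mathrm{task}}}_{\text{task benefit}}
  \;+\; \underbrace{C_{\mathrm{mean}}}_{\text{mean correction}}
  \;+\; \underbrace{I_{\mathrm{proj}}}_{\text{projection interference cost}}
  \;+\; \underbrace{I_{\mathrm{comp}}}_{\text{composition interference}}
```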
[55] The Computational Role of Complex Representations in RNNs
[51] PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning
[52] Resource-Efficient Acoustic Full-Waveform Inversion via Dual-Branch Physics-Informed RNN with Scale Decomposition
[53] Predicting Wave Dynamics using Deep Learning with Multistep Integration Inspired Attention and Physics-Based Loss Decomposition
[54] Two-shot learning of continuous interpolation using a conceptor-aided recurrent autoencoder
Identification of interference-free space and phase transition
The authors identify that RNNs with nonlinear readouts can exploit an interference-free space (the half-space opposite the readout direction) to pack intermediate feature directions without projection interference. They characterize a phase transition between dense and sparse regimes marked by changes in angular distribution of features and spectral radius, with nonlinear RNNs implementing sharp forgetting by fully exploiting this space.
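A self-contained sketch of the basic mechanism behind this claim is given below, assuming for illustration a scalar ReLU readout y = relu(w . h); the readout form, dimensions, and margins are assumptions, not the paper's architecture. Any hidden-state content whose projection onto the readout direction w is negative reads out as exactly zero under the ReLU, so intermediate features can be packed into that half-space without leaking into the output during the delay, whereas a linear readout leaks every stored direction in proportion to its projection.

```python
import numpy as np

# Toy sketch of the claimed mechanism under an assumed scalar ReLU readout.
rng = np.random.default_rng(0)
n = 32
w = rng.standard_normal(n)
w /= np.linalg.norm(w)                         # readout direction (unit norm)

def relu_readout(h):
    """Scalar ReLU readout y = relu(w . h) (assumed form, not the paper's model)."""
    return max(float(w @ h), 0.0)

def linear_readout(h):
    """Linear readout for comparison."""
    return float(w @ h)

# Draw intermediate features and push each into the half-space where w . h < 0.
feats = rng.standard_normal((5, n))
feats -= np.outer(np.clip(feats @ w, 0.0, None) + 0.1, w)   # now (feats @ w) <= -0.1

h_delay = feats.sum(axis=0)       # state holding all intermediate features at once
print("ReLU readout during the delay:", relu_readout(h_delay))          # exactly 0.0
print("linear readout leaks instead :", round(linear_readout(h_delay), 3))
```

With a purely linear readout every stored direction contributes to the output in proportion to its projection onto w, which is why the interference-free half-space described in this contribution is specific to nonlinear readouts.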