The Expressive Limits of Diagonal SSMs for State-Tracking

ICLR 2026 Conference Submission

Anonymous Authors

OpenReview Score: 5.5

Keywords: state-space model, SSM, LRNN, linear RNN, expressivity, complex, dynamical system, state-tracking, semigroup, group, automata, Krohn-Rhodes

Abstract:

State-Space Models (SSMs) have recently been shown to achieve strong empirical performance on a variety of long-range sequence modeling tasks while remaining efficient and highly-parallelizable. However, the theoretical understanding of their expressive power remains limited. In this work, we study the expressivity of input-Dependent Complex-valued Diagonal (DCD) State-Space Models (SSMs) on sequential state-tracking tasks for abstract groups. It is easy to show that a single DCD SSM layer with a universal decoder can track any Abelian group at finite precision by decomposing it into a product of cyclic groups. We show that this is tight by proving that such a model cannot track any non-Abelian group at finite precision. We further establish the expressivity of multi-layer DCD SSMs. We show that a $k$-layer DCD SSM tracks a group if and only if that group has a subnormal series of length at most $k$, with Abelian factor groups. Empirically, while multi-layer models are theoretically expressive enough for solvable non-Abelian groups, we find they often fail to learn such solutions in practice, highlighting a gap between expressivity and learnability.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper characterizes the expressivity of diagonal complex-valued state-space models (DCD SSMs) for group state-tracking tasks, proving that single-layer models can track Abelian groups but not non-Abelian groups at finite precision. It resides in the 'Diagonal SSM Expressivity Bounds' leaf under 'Theoretical Expressivity Analysis', sharing this leaf with one sibling paper. This represents a relatively sparse research direction within the taxonomy, which contains only seven total papers across six leaf nodes, suggesting the paper addresses a focused theoretical question in an emerging subfield.

The taxonomy reveals that neighboring work diverges into architectural enhancements (dense parameterizations, structured sparsity) and domain-specific applications (geometric SSMs, temporal graphs). The sibling paper in the same leaf likely explores related diagonal expressivity questions, while the adjacent 'Eigenvalue-Based Expressivity Mechanisms' leaf examines spectral properties as a complementary theoretical lens. The paper's focus on group-theoretic characterizations distinguishes it from architectural modifications that relax diagonal constraints, positioning it as foundational theory rather than applied methodology.

Among the fourteen candidates examined, none of the four compared against the first contribution (single-layer expressivity) appeared to refute it, suggesting novelty in the Abelian/non-Abelian dichotomy result. No candidates were retrieved for the second contribution (multi-layer subnormal series characterization), which points to limited direct precedent within the search scope. For the third contribution (learnability gap), three of the ten candidates examined appear to report overlapping empirical observations, suggesting this aspect has more substantial prior exploration. Overall, the theoretical contributions appear more distinctive than the empirical learnability findings.

Based on the limited search of fourteen semantically similar papers, the theoretical characterizations appear relatively novel within the examined scope, particularly the group-theoretic framework for multi-layer models. However, the learnability gap observation aligns with existing work on expressivity-trainability mismatches. This analysis is not an exhaustive literature review and does not cover broader SSM theory beyond the top-K semantic matches and their citations.

Taxonomy

Research Landscape Overview

Core task: Expressivity of diagonal state-space models on group state-tracking tasks.

The field examines how state-space models (SSMs), particularly those with diagonal structure, can represent and track sequential dependencies. The taxonomy reveals three main branches: Theoretical Expressivity Analysis investigates fundamental capacity limits and mathematical bounds of diagonal SSMs, often through formal analysis of what these architectures can and cannot represent; Architectural Enhancements for State-Tracking explores modifications and extensions that improve tracking capabilities, such as selective mechanisms or alternative parameterizations; and Domain-Specific SSM Applications adapts these models to particular problem settings like temporal graphs or geometric dynamics. Works like Selective SSM Foundations[1] illustrate how architectural choices affect expressivity, while studies such as GeoDynamics[3] and SSM Temporal Graphs[4] demonstrate domain-tailored approaches. The interplay between these branches reflects a tension between maintaining computational efficiency through diagonal constraints and achieving sufficient representational power for complex sequential tasks.

Several active lines of work highlight key trade-offs in this landscape. One thread examines fixed-point characterizations and interpolation properties, as seen in Fixed-Point RNNs Diagonal[2] and Fixed-Point RNNs Interpolating[5], which probe how recurrent dynamics relate to SSM expressivity. Another explores the role of eigenvalue structure, with Negative Eigenvalues State-Tracking[6] investigating how spectral properties influence tracking performance, and Sparse Transition Matrices[7] considering sparsity as an alternative constraint.

Diagonal SSM Limits[0] sits squarely within the Theoretical Expressivity Analysis branch, focusing on rigorous bounds for diagonal architectures on group state-tracking tasks. Compared to neighboring work like Selective SSM Foundations[1], which emphasizes architectural mechanisms to enhance capacity, Diagonal SSM Limits[0] takes a more foundational stance by characterizing inherent limitations, helping clarify when diagonal constraints become bottlenecks versus when they suffice for particular tracking problems.

Claimed Contributions

Expressivity characterization of single-layer diagonal SSMs for group state-tracking

4 retrieved papers

The authors prove that a single-layer input-dependent complex-valued diagonal SSM can track a group at finite precision if and only if that group is Abelian. This establishes a fundamental limitation of single-layer diagonal SSMs on non-commutative group operations.
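To make the Abelian half of this claim concrete, the following is a minimal sketch of how a diagonal complex recurrence can track a cyclic group, and a product of cyclic groups, using roots of unity. It is an illustration only, not the authors' construction: the recurrence form `h = a(x) * h`, the nearest-root decoder, and the names `track_cyclic` and `track_abelian` are all assumptions introduced here.

```python
import cmath

def track_cyclic(inputs, n):
    """Track the running sum of `inputs` modulo n with one complex channel.

    Assumed recurrence (illustrative): h_t = a(x_t) * h_{t-1}, where
    a(x) = exp(2*pi*i*x/n) is an input-dependent diagonal entry.
    """
    h = 1 + 0j  # initial state encodes the identity element 0
    states = []
    for x in inputs:
        a = cmath.exp(2j * cmath.pi * x / n)  # rotate by x steps of 2*pi/n
        h = a * h
        # Decode by snapping the phase to the nearest n-th root of unity.
        angle = cmath.phase(h) % (2 * cmath.pi)
        states.append(round(angle * n / (2 * cmath.pi)) % n)
    return states

def track_abelian(inputs, orders):
    """Track a product of cyclic groups Z_{n_1} x ... x Z_{n_m}.

    Each component gets its own independent diagonal channel, so the
    channels never interact: this is what "diagonal" buys us here.
    """
    per_channel = [
        track_cyclic([x[i] for x in inputs], n) for i, n in enumerate(orders)
    ]
    return list(zip(*per_channel))

# Running sums modulo 5 of the inputs 1, 2, 3, 4.
print(track_cyclic([1, 2, 3, 4], 5))
# Running state in Z_2 x Z_3, one channel per factor.
print(track_abelian([(1, 2), (1, 2), (0, 1)], (2, 3)))
```

Because the state always stays on the unit circle and the decoder only needs to separate n evenly spaced phases, small floating-point errors do not change the decoded element in this sketch, which is one informal way to read "at finite precision".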

Expressivity characterization of multi-layer diagonal SSMs via subnormal series

0 retrieved papers

The authors establish that k-layer diagonal SSMs can track exactly those groups admitting a subnormal series with Abelian factors of length at most k. This precisely identifies the expressive capacity of multi-layer diagonal SSMs within the class of solvable groups.
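As a hedged illustration of the group-theoretic quantity involved: by a standard fact, the shortest subnormal series with Abelian factors of a solvable group has length equal to its derived length, so, read against the characterization as stated here, the derived length would be the minimum number of layers. The sketch below computes it for small permutation groups in plain Python. The helper names (`compose`, `generated_subgroup`, `derived_length`) are hypothetical and introduced only for this example; nothing here is taken from the paper.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def generated_subgroup(gens, n):
    """Closure of `gens` under multiplication inside the symmetric group S_n.

    In a finite group, closing under products is enough to obtain inverses.
    """
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)

def derived_subgroup(group, n):
    """Subgroup generated by all commutators a b a^-1 b^-1."""
    comms = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group
        for b in group
    }
    return generated_subgroup(comms, n)

def derived_length(group, n):
    """Length of the derived series, or None if the group is not solvable."""
    length = 0
    while len(group) > 1:
        nxt = derived_subgroup(group, n)
        if nxt == group:  # series stalls at a perfect subgroup
            return None
        group = nxt
        length += 1
    return length

S3 = frozenset(permutations(range(3)))
S4 = frozenset(permutations(range(4)))
print(derived_length(S3, 3))  # 2, via S3 > A3 > 1
print(derived_length(S4, 4))  # 3, via S4 > A4 > V4 > 1
```

Under this reading, an Abelian group has derived length 1 (matching the single-layer result), S3 would need two layers, S4 three, and a non-solvable group such as S5 has no finite Abelian-factor series at all.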

Demonstration of learnability gap between expressivity and trainability

Can Refute

10 retrieved papers

The authors empirically show that even when multi-layer diagonal SSMs are theoretically capable of tracking non-Abelian groups, standard gradient-based training often fails to discover generalizable solutions. This reveals a practical limitation beyond theoretical expressivity.

Core Task Comparisons

Comparisons with papers in the same taxonomy category

[1] Theoretical foundations of deep selective state-space models

Terry Lyons, Nicola Muca Cirone, Antonio Orvieto, Cristopher Salvi, Benjamin Walker (2024)

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Expressivity characterization of single-layer diagonal SSMs for group state-tracking

[2] Fixed-point RNNs: From diagonal to dense in a few iterations

Cannot Refute

[5] Fixed-Point RNNs: Interpolating from Diagonal to Dense

Cannot Refute

[7] Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models

Cannot Refute

[8] State space grids

Cannot Refute

Contribution

Expressivity characterization of multi-layer diagonal SSMs via subnormal series

Contribution

Demonstration of learnability gap between expressivity and trainability

[10] Capacity and Trainability in Recurrent Neural Networks

Can Refute

[12] Saturation in Recurrent Neural Networks: Expressivity, Learnability, and Generalization

Can Refute

[14] The gap between theory and practice in function approximation with deep neural networks

Can Refute

[9] Theoretical limitations of self-attention in neural sequence models

Cannot Refute

[11] A little depth goes a long way: The expressive power of log-depth transformers

Cannot Refute

[13] Computational Expressivity of Neural Language Models

Cannot Refute

[15] In-context language learning: Architectures and algorithms

Cannot Refute

[16] TallFormer: Temporal Action Localization with a Long-Memory Transformer

Cannot Refute

[17] The expressibility of polynomial based attention scheme

Cannot Refute

[18] A quantum neural network for sequential data analysis in machine learning

Cannot Refute
