Learning linear state-space models with sparse system matrices

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: linear state-space models, expectation-maximization algorithm, system identification, state estimation
Abstract:

Owing to their tractable analysis and control, linear state-space models (LSSMs) are a fundamental mathematical tool for time-series modeling across many disciplines. In particular, many LSSMs have sparse system matrices because interactions among variables are limited or only a few significant relationships exist. However, because current learning algorithms identify LSSMs only up to a similarity transformation, they cannot learn system matrices under a sparsity constraint. To address this issue, we impose sparsity-promoting priors on the system matrices to balance modeling error against model complexity. Treating the hidden states of the LSSM as latent variables, we then employ the expectation-maximization (EM) algorithm to derive a maximum a posteriori (MAP) estimate of both the hidden states and the system matrices from noisy observations. Based on the Global Convergence Theorem, we further show that the proposed learning algorithm yields a sequence converging to a local maximum or a saddle point of the joint posterior distribution. Finally, experimental results on simulated and real-world problems illustrate that the proposed algorithm preserves the inherent topological structure among variables and significantly improves prediction accuracy over classical learning algorithms.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes an EM-based MAP estimation framework for learning linear state-space models with sparse system matrices by imposing sparsity-promoting priors. It resides in the 'Constrained Optimization for Sparse System Identification' leaf, which contains five papers addressing parameter estimation through regularization and convex constraints. This leaf sits within the broader 'Sparse Parameter Estimation and System Identification' branch, indicating a moderately populated research direction focused on optimization-based approaches to sparse system recovery. The taxonomy reveals this is an established but not overcrowded area, with neighboring leaves covering Bayesian methods, adversarial robustness, and high-dimensional structures.

The paper's leaf neighbors include Bayesian frameworks using MCMC for sparse inference and robust identification under adversarial disturbances, both of which offer alternative perspectives on handling sparsity constraints. The taxonomy structure shows clear boundaries: the paper's optimization-based approach contrasts with purely Bayesian methods in the sibling leaf and differs from state estimation techniques in parallel branches. Related directions include graphical state-space models emphasizing network topology and data-driven discovery methods for nonlinear systems, suggesting the paper occupies a niche balancing classical linear modeling with modern sparsity-inducing techniques. The scope notes clarify that the paper focuses on parameter estimation rather than state estimation or domain-specific applications.

Among thirty candidates examined, the EM algorithm contribution shows substantial prior work, with four of ten candidates providing overlapping methods. The convergence guarantee contribution similarly faces three refutable candidates from ten examined, indicating established theoretical frameworks in this space. The topological structure preservation contribution appears more distinctive, with zero refutable candidates among ten examined. These statistics suggest the core algorithmic and theoretical contributions build upon a well-developed foundation, while the structural preservation aspect may offer more novel insights. The limited search scope means these findings reflect top-semantic-match results rather than exhaustive coverage.

Given the search examined thirty candidates across three contributions, the paper appears to integrate established EM and convergence techniques with potentially novel topological preservation objectives. The taxonomy placement in a five-paper leaf within a fifty-paper field suggests moderate competition in this specific optimization-based sparse identification niche. The analysis captures semantic neighbors but does not cover all citation networks or recent preprints, leaving open questions about incremental versus transformative contributions relative to the full literature landscape.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 7

Research Landscape Overview

Core task: Learning linear state-space models with sparse system matrices. The field addresses how to identify dynamical systems when the underlying transition or observation matrices contain only a small number of nonzero entries, a structure that arises naturally in many physical, biological, and engineered systems. The taxonomy organizes research into several major branches:

- Sparse Parameter Estimation and System Identification: algorithms that directly recover sparse system matrices through regularized optimization or Bayesian inference, often employing convex relaxations or greedy methods.
- State and Input Estimation with Sparsity Constraints: filtering and smoothing problems where the states or driving inputs are themselves sparse.
- Sparse Graphical and Structured State-Space Models: emphasis on network topology and conditional independence structures.
- Data-Driven Discovery and Nonlinear System Identification: methods such as SINDy that learn governing equations from data.
- State Space Models with Neural and Deep Learning Components: integration of sparsity with modern neural architectures.
- Applications of Sparse State-Space Models: domain-specific uses in neuroscience, control, and biology.
- Theoretical Foundations and General Frameworks: sample complexity bounds and convergence guarantees.

Representative works such as PySINDy[14] and Sparse Bayesian Estimation[5] illustrate the diversity of algorithmic approaches. A particularly active line of work centers on constrained optimization techniques that balance model fidelity against sparsity-inducing penalties, trading off computational tractability against statistical efficiency. Convex Constrained Learning[1] and Low-Rank Approximation[32] exemplify methods that impose structural constraints during parameter estimation, while Data-Driven Sparse[41] and Sample Complexity Sparse[49] explore the interplay between data requirements and model complexity.
Within this landscape, Sparse System Matrices[0] sits squarely in the Constrained Optimization for Sparse System Identification cluster, emphasizing principled optimization frameworks for recovering sparse transition matrices. Compared to neighbors like Convex Constrained Learning[1], which may prioritize convexity and scalability, Sparse System Matrices[0] likely addresses the specific challenges of enforcing exact or approximate sparsity patterns in linear dynamical systems, potentially offering tighter theoretical guarantees or more flexible constraint handling than earlier approaches such as Data-Driven Sparse[41].

Claimed Contributions

EM algorithm for learning LSSMs with sparse system matrices

The authors propose an expectation-maximization algorithm that imposes sparsity-promoting priors (Student's t-distribution) on system matrices to learn linear state-space models with sparse structure. The method alternates between estimating hidden states using the RTS smoother and updating system matrices via block coordinate descent to maximize the joint posterior distribution.

10 retrieved papers
Can Refute
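The paper's own update equations are not reproduced in this report. As a hedged illustration of the claimed M-step, an entrywise Student's t prior can be handled with iteratively reweighted ridge updates (a standard majorize-minimize surrogate for the t log-density), applied one row of the transition matrix at a time as a form of block coordinate descent. The function name `m_step_sparse_A`, the isotropic noise variance `q`, and the MM scheme itself are illustrative assumptions, not the authors' exact algorithm; the sufficient statistics `S0` and `S1` stand in for what an RTS-smoother E-step would produce.

```python
import numpy as np

def m_step_sparse_A(S0, S1, q, nu=2.0, sigma2=1e-4, n_iters=20):
    """Sketch of an M-step for the transition matrix A under an entrywise
    Student's t prior, via iteratively reweighted ridge updates (an MM
    surrogate for the t log-density), solved row by row.

    S0 : sum_t E[x_t x_t^T]      -- from the RTS-smoother E-step
    S1 : sum_t E[x_{t+1} x_t^T]  -- cross moments from the E-step
    q  : state-noise variance (assumed isotropic here for brevity)
    """
    A = S1 @ np.linalg.inv(S0)              # start from the unpenalized MLE
    for _ in range(n_iters):
        # MM weights: large for near-zero entries -> stronger shrinkage
        W = (nu + 1.0) / (nu * sigma2 + A ** 2)
        for i in range(A.shape[0]):         # block coordinate descent on rows
            A[i] = np.linalg.solve(S0 / q + np.diag(W[i]), S1[i] / q)
    return A

# Toy check with exactly observed states standing in for the E-step output.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.0, 0.0],
                   [0.5, 0.8, 0.0],
                   [0.0, 0.0, 0.7]])
T_len, q = 5000, 0.1
X = np.zeros((T_len, 3))
for t in range(T_len - 1):
    X[t + 1] = A_true @ X[t] + rng.normal(0.0, np.sqrt(q), 3)
S0 = X[:-1].T @ X[:-1]
S1 = X[1:].T @ X[:-1]
A_hat = m_step_sparse_A(S0, S1, q)          # zero entries shrink toward zero
```

In a full EM loop this update would alternate with an RTS smoothing pass that recomputes `S0` and `S1` under the current parameters.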
Global convergence guarantee for the proposed algorithm

The authors provide a theoretical convergence analysis demonstrating that their algorithm is guaranteed to converge to a local maximum or saddle point of the posterior distribution, leveraging the Global Convergence Theorem.

10 retrieved papers
Can Refute
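The proof itself is not reproduced in this report; for orientation, the standard MAP-EM ascent argument that such convergence analyses rest on can be sketched as follows, with notation assumed here rather than taken from the paper (Θ the system matrices, X the hidden states, Y the observations):

```latex
\begin{aligned}
Q(\Theta \mid \Theta^{(k)}) &= \mathbb{E}_{X \sim p(X \mid Y, \Theta^{(k)})}\bigl[\log p(Y, X \mid \Theta)\bigr] + \log p(\Theta),\\
\Theta^{(k+1)} &= \arg\max_{\Theta}\, Q(\Theta \mid \Theta^{(k)}),\\
\log p(\Theta^{(k+1)} \mid Y) - \log p(\Theta^{(k)} \mid Y)
  &\ge Q(\Theta^{(k+1)} \mid \Theta^{(k)}) - Q(\Theta^{(k)} \mid \Theta^{(k)}) \ge 0.
\end{aligned}
```

The posterior values therefore form a nondecreasing sequence, and Zangwill's Global Convergence Theorem upgrades this ascent property to convergence of the iterates' limit points to stationary points, i.e. local maxima or saddle points, matching the claim above.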
Preservation of topological structure in learned system matrices

Unlike classical learning algorithms that only learn system matrices up to a similarity transformation, the proposed algorithm preserves the inherent topological structure among variables by restricting the similarity transformation to generalized permutation matrices through sparsity constraints, making the learned models more interpretable.

10 retrieved papers
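The similarity-transformation ambiguity behind this claim is easy to demonstrate numerically: for any invertible T, the transformed pair (T A T^{-1}, C T^{-1}) reproduces the same input-output behavior, so an unconstrained learner may return a dense transform of a sparse A. A minimal sketch (the matrices are illustrative, not from the paper) shows that a generic T destroys the zero pattern, while a generalized permutation matrix, with one nonzero per row and column, only relabels and rescales states:

```python
import numpy as np

rng = np.random.default_rng(1)

# A sparse transition matrix whose zero pattern encodes the topology.
A = np.array([[0.9, 0.0, 0.0],
              [0.5, 0.8, 0.0],
              [0.0, 0.0, 0.7]])

# Any invertible T yields an equivalent model T A T^{-1},
# but a generic T fills in the zeros.
T = rng.normal(size=(3, 3))
A_dense = T @ A @ np.linalg.inv(T)

# A generalized permutation matrix (one nonzero per row and column)
# only relabels and rescales states, so the zero pattern survives
# up to that relabeling.
P = np.array([[0.0, 2.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, -3.0]])
A_perm = P @ A @ np.linalg.inv(P)

n_dense = np.count_nonzero(np.abs(A_dense) > 1e-9)  # generically 9
n_perm = np.count_nonzero(np.abs(A_perm) > 1e-9)    # 4, same as A
```

Restricting the residual ambiguity to such generalized permutations is what would let a sparsity prior pin down the topology up to state relabeling and scaling, which is the interpretability argument made in this contribution.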

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: EM algorithm for learning LSSMs with sparse system matrices

Contribution: Global convergence guarantee for the proposed algorithm

Contribution: Preservation of topological structure in learned system matrices