Learning linear state-space models with sparse system matrices

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: linear state-space models, expectation-maximization algorithm, system identification, state estimation
Abstract:

Owing to their tractable analysis and control, linear state-space models (LSSMs) are a fundamental mathematical tool for time-series modeling across many disciplines. In particular, many LSSMs have sparse system matrices because interactions among variables are limited or only a few significant relationships exist. However, because current learning algorithms identify LSSMs only up to a similarity transformation, they cannot learn system matrices under a sparsity constraint. To address this issue, we impose sparsity-promoting priors on the system matrices to balance modeling error against model complexity. Treating the hidden states of the LSSM as latent variables, we then employ the expectation-maximization (EM) algorithm to derive a maximum a posteriori (MAP) estimate of both the hidden states and the system matrices from noisy observations. Based on the Global Convergence Theorem, we further show that the proposed learning algorithm yields a sequence converging to a local maximum or a saddle point of the joint posterior distribution. Finally, experimental results on simulated and real-world problems illustrate that the proposed algorithm preserves the inherent topological structure among variables and significantly improves prediction accuracy over classical learning algorithms.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes an EM-based MAP estimation framework for learning linear state-space models with sparse system matrices by imposing sparsity-promoting priors. It resides in the 'Constrained Optimization for Sparse System Identification' leaf, which contains five papers addressing parameter estimation through regularization and convex constraints. This leaf sits within the broader 'Sparse Parameter Estimation and System Identification' branch, indicating a moderately populated research direction focused on optimization-based approaches to sparse system recovery. The taxonomy reveals this is an established but not overcrowded area, with neighboring leaves covering Bayesian methods, adversarial robustness, and high-dimensional structures.

The paper's leaf neighbors include Bayesian frameworks using MCMC for sparse inference and robust identification under adversarial disturbances, both of which offer alternative perspectives on handling sparsity constraints. The taxonomy structure shows clear boundaries: the paper's optimization-based approach contrasts with purely Bayesian methods in the sibling leaf and differs from state estimation techniques in parallel branches. Related directions include graphical state-space models emphasizing network topology and data-driven discovery methods for nonlinear systems, suggesting the paper occupies a niche balancing classical linear modeling with modern sparsity-inducing techniques. The scope notes clarify that the paper focuses on parameter estimation rather than state estimation or domain-specific applications.

Among thirty candidates examined, the EM algorithm contribution shows substantial prior work, with four of ten candidates providing overlapping methods. The convergence guarantee contribution similarly faces three refutable candidates from ten examined, indicating established theoretical frameworks in this space. The topological structure preservation contribution appears more distinctive, with zero refutable candidates among ten examined. These statistics suggest the core algorithmic and theoretical contributions build upon a well-developed foundation, while the structural preservation aspect may offer more novel insights. The limited search scope means these findings reflect top-semantic-match results rather than exhaustive coverage.

Given the search examined thirty candidates across three contributions, the paper appears to integrate established EM and convergence techniques with potentially novel topological preservation objectives. The taxonomy placement in a five-paper leaf within a fifty-paper field suggests moderate competition in this specific optimization-based sparse identification niche. The analysis captures semantic neighbors but does not cover all citation networks or recent preprints, leaving open questions about incremental versus transformative contributions relative to the full literature landscape.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 7

Research Landscape Overview

Core task: Learning linear state-space models with sparse system matrices. The field addresses how to identify dynamical systems when the underlying transition or observation matrices contain only a small number of nonzero entries, a structure that arises naturally in many physical, biological, and engineered systems. The taxonomy organizes research into several major branches:

- Sparse Parameter Estimation and System Identification: algorithms that directly recover sparse system matrices through regularized optimization or Bayesian inference, often employing convex relaxations or greedy methods.
- State and Input Estimation with Sparsity Constraints: filtering and smoothing problems where the states or driving inputs are themselves sparse.
- Sparse Graphical and Structured State-Space Models: emphasis on network topology and conditional independence structures.
- Data-Driven Discovery and Nonlinear System Identification: methods such as SINDy that learn governing equations from data.
- State Space Models with Neural and Deep Learning Components: integration of sparsity with modern neural architectures.
- Applications of Sparse State-Space Models: domain-specific uses in neuroscience, control, and biology.
- Theoretical Foundations and General Frameworks: sample complexity bounds and convergence guarantees.

Representative works such as PySINDy[14] and Sparse Bayesian Estimation[5] illustrate the diversity of algorithmic approaches. A particularly active line of work centers on constrained optimization techniques that balance model fidelity against sparsity-inducing penalties, trading off computational tractability against statistical efficiency. Convex Constrained Learning[1] and Low-Rank Approximation[32] exemplify methods that impose structural constraints during parameter estimation, while Data-Driven Sparse[41] and Sample Complexity Sparse[49] explore the interplay between data requirements and model complexity.
Within this landscape, Sparse System Matrices[0] sits squarely in the Constrained Optimization for Sparse System Identification cluster, emphasizing principled optimization frameworks for recovering sparse transition matrices. Compared to neighbors like Convex Constrained Learning[1], which may prioritize convexity and scalability, Sparse System Matrices[0] likely addresses the specific challenges of enforcing exact or approximate sparsity patterns in linear dynamical systems, potentially offering tighter theoretical guarantees or more flexible constraint handling than earlier approaches such as Data-Driven Sparse[41].

Claimed Contributions

EM algorithm for learning LSSMs with sparse system matrices

The authors propose an expectation-maximization algorithm that imposes sparsity-promoting priors (Student's t-distribution) on system matrices to learn linear state-space models with sparse structure. The method alternates between estimating hidden states using the RTS smoother and updating system matrices via block coordinate descent to maximize the joint posterior distribution.

10 retrieved papers
Can Refute
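The paper's own update equations are not reproduced in this report. As a hedged illustration of the claimed M-step, an entrywise Student's t prior can be handled with iteratively reweighted ridge updates (a standard majorize-minimize surrogate for the t log-density), applied one row of the transition matrix at a time as a form of block coordinate descent. The function name `m_step_sparse_A`, the isotropic noise variance `q`, and the MM scheme itself are illustrative assumptions, not the authors' exact algorithm; the sufficient statistics `S0` and `S1` stand in for what an RTS-smoother E-step would produce.

```python
import numpy as np

def m_step_sparse_A(S0, S1, q, nu=2.0, sigma2=1e-4, n_iters=20):
    """Sketch of an M-step for the transition matrix A under an entrywise
    Student's t prior, via iteratively reweighted ridge updates (an MM
    surrogate for the t log-density), solved row by row.

    S0 : sum_t E[x_t x_t^T]      -- from the RTS-smoother E-step
    S1 : sum_t E[x_{t+1} x_t^T]  -- cross moments from the E-step
    q  : state-noise variance (assumed isotropic here for brevity)
    """
    A = S1 @ np.linalg.inv(S0)              # start from the unpenalized MLE
    for _ in range(n_iters):
        # MM weights: large for near-zero entries -> stronger shrinkage
        W = (nu + 1.0) / (nu * sigma2 + A ** 2)
        for i in range(A.shape[0]):         # block coordinate descent on rows
            A[i] = np.linalg.solve(S0 / q + np.diag(W[i]), S1[i] / q)
    return A

# Toy check with exactly observed states standing in for the E-step output.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.0, 0.0],
                   [0.5, 0.8, 0.0],
                   [0.0, 0.0, 0.7]])
T_len, q = 5000, 0.1
X = np.zeros((T_len, 3))
for t in range(T_len - 1):
    X[t + 1] = A_true @ X[t] + rng.normal(0.0, np.sqrt(q), 3)
S0 = X[:-1].T @ X[:-1]
S1 = X[1:].T @ X[:-1]
A_hat = m_step_sparse_A(S0, S1, q)          # zero entries shrink toward zero
```

In a full EM loop this update would alternate with an RTS smoothing pass that recomputes `S0` and `S1` under the current parameters.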
Global convergence guarantee for the proposed algorithm

The authors provide a theoretical convergence analysis demonstrating that their algorithm is guaranteed to converge to a local maximum or saddle point of the posterior distribution, leveraging the Global Convergence Theorem.

10 retrieved papers
Can Refute
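The proof itself is not reproduced in this report; for orientation, the standard MAP-EM ascent argument that such convergence analyses rest on can be sketched as follows, with notation assumed here rather than taken from the paper (Θ the system matrices, X the hidden states, Y the observations):

```latex
\begin{aligned}
Q(\Theta \mid \Theta^{(k)}) &= \mathbb{E}_{X \sim p(X \mid Y, \Theta^{(k)})}\bigl[\log p(Y, X \mid \Theta)\bigr] + \log p(\Theta),\\
\Theta^{(k+1)} &= \arg\max_{\Theta}\, Q(\Theta \mid \Theta^{(k)}),\\
\log p(\Theta^{(k+1)} \mid Y) - \log p(\Theta^{(k)} \mid Y)
  &\ge Q(\Theta^{(k+1)} \mid \Theta^{(k)}) - Q(\Theta^{(k)} \mid \Theta^{(k)}) \ge 0.
\end{aligned}
```

The posterior values therefore form a nondecreasing sequence, and Zangwill's Global Convergence Theorem upgrades this ascent property to convergence of the iterates' limit points to stationary points, i.e. local maxima or saddle points, matching the claim above.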
Preservation of topological structure in learned system matrices

Unlike classical learning algorithms that only learn system matrices up to a similarity transformation, the proposed algorithm preserves the inherent topological structure among variables by restricting the similarity transformation to generalized permutation matrices through sparsity constraints, making the learned models more interpretable.

10 retrieved papers
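The similarity-transformation ambiguity behind this claim is easy to demonstrate numerically: for any invertible T, the transformed pair (T A T^{-1}, C T^{-1}) reproduces the same input-output behavior, so an unconstrained learner may return a dense transform of a sparse A. A minimal sketch (the matrices are illustrative, not from the paper) shows that a generic T destroys the zero pattern, while a generalized permutation matrix, with one nonzero per row and column, only relabels and rescales states:

```python
import numpy as np

rng = np.random.default_rng(1)

# A sparse transition matrix whose zero pattern encodes the topology.
A = np.array([[0.9, 0.0, 0.0],
              [0.5, 0.8, 0.0],
              [0.0, 0.0, 0.7]])

# Any invertible T yields an equivalent model T A T^{-1},
# but a generic T fills in the zeros.
T = rng.normal(size=(3, 3))
A_dense = T @ A @ np.linalg.inv(T)

# A generalized permutation matrix (one nonzero per row and column)
# only relabels and rescales states, so the zero pattern survives
# up to that relabeling.
P = np.array([[0.0, 2.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, -3.0]])
A_perm = P @ A @ np.linalg.inv(P)

n_dense = np.count_nonzero(np.abs(A_dense) > 1e-9)  # generically 9
n_perm = np.count_nonzero(np.abs(A_perm) > 1e-9)    # 4, same as A
```

Restricting the residual ambiguity to such generalized permutations is what would let a sparsity prior pin down the topology up to state relabeling and scaling, which is the interpretability argument made in this contribution.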

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: EM algorithm for learning LSSMs with sparse system matrices

Contribution: Global convergence guarantee for the proposed algorithm

Contribution: Preservation of topological structure in learned system matrices