A Recovery Guarantee for Sparse Neural Networks

ICLR 2026 Conference Submission, Anonymous Authors
Keywords: compressed sensing, neural networks, model pruning, sparse weight recovery
Abstract:

We prove the first sparse-recovery guarantees for ReLU neural networks, where the sparse network weights constitute the signal to be recovered. Specifically, we identify structural properties of the sparse weights of two-layer, scalar-output networks under which a simple iterative hard thresholding algorithm recovers these weights exactly, using memory that grows linearly in the number of nonzero weights. We validate this theoretical result with simple experiments on recovery of planted sparse MLPs, MNIST classification, and implicit neural representations. Experimentally, we find performance that is competitive with, and often exceeds, that of a high-performing but memory-inefficient baseline based on iterative magnitude pruning.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes a paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs), and the system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper establishes the first provable guarantees for recovering sparse weights in two-layer ReLU networks using iterative hard thresholding. It resides in the 'Theoretical Recovery Guarantees' leaf, which contains only four papers total, indicating a relatively sparse research direction within the broader taxonomy of 50 papers. This leaf focuses specifically on provable algorithms for exact or approximate weight recovery, distinguishing it from empirical pruning methods and training-time sparsification techniques that dominate other branches of the field.

The taxonomy reveals that neighboring work clusters around compressed sensing with generative models, sparse signal recovery applications, and training-induced sparsity methods. The paper's theoretical recovery focus diverges from the more crowded pruning and compression branches, which emphasize post-training interventions without recovery guarantees. Its sibling papers in the same leaf address one-bit compressed sensing variants and learning guarantees for shallow networks, suggesting the paper extends classical compressed sensing ideas to ReLU network weight recovery rather than exploring activation sparsity or architectural design.

Among the 30 candidates examined (10 per contribution), the contribution on structural properties enabling IHT recovery has one candidate that could refute it, while the other two contributions (the first recovery guarantee and the memory-efficient IHT algorithm) face no clear refutations. Given this limited search scope, the core recovery guarantee appears relatively novel within the examined literature, though the structural conditions may overlap with existing compressed-sensing or shallow-network learning theory. The memory-efficiency claim faces no direct prior work among the candidates reviewed.

Based on the top-30 semantic matches and citation expansion, the work appears to occupy a sparsely populated niche at the intersection of compressed sensing theory and neural network weight recovery. The analysis does not cover exhaustive literature on general sparse optimization or broader pruning methods, which may contain relevant but less directly comparable prior work. The taxonomy structure confirms this sits in a less crowded theoretical corner compared to empirical pruning or training-time regularization branches.

Taxonomy

Core-task taxonomy papers: 50
Claimed contributions: 3
Contribution candidate papers compared: 30
Refutable papers: 1

Research Landscape Overview

Core task: sparse recovery for ReLU neural network weights.

The field encompasses a diverse set of approaches to understanding and exploiting sparsity in neural networks with ReLU activations. At the highest level, the taxonomy divides into branches addressing sparse weight recovery and reconstruction (often with theoretical guarantees), training-induced sparsity and regularization methods, pruning and compression techniques for model efficiency, activation sparsity exploitation during inference, specialized sparse network architectures, theoretical analysis of generalization and expressiveness, optimization algorithms that encourage or preserve sparsity, and applications in specialized domains.

Some branches, such as pruning and compression, focus on post-training or training-time interventions to reduce model size (e.g., Extreme Pruning Tricks[15]), while others like sparse weight recovery emphasize the mathematical problem of reconstructing network parameters from limited observations (e.g., Robust One-bit Recovery[21], Improved One-bit Recovery[35]). Training-induced sparsity methods (e.g., Adam Implicit Sparsity[25], Hidden Synergy L1[12]) explore how optimization dynamics naturally lead to sparse solutions, whereas activation sparsity exploitation (e.g., Activation Sparsity Inference[7], SparseInfer Training-free[46]) leverages the zero outputs of ReLU units to accelerate inference.

A particularly active line of work centers on theoretical recovery guarantees, examining when and how sparse ReLU network weights can be provably reconstructed from data or measurements. This includes studies on one-bit compressed sensing variants (Robust One-bit Recovery[21], Improved One-bit Recovery[35]) and learning guarantees for shallow networks (Learning ReLU Model[40]). Sparse Neural Recovery[0] sits squarely within this theoretical recovery branch, contributing rigorous analysis of conditions under which sparse weight vectors can be identified. Its emphasis on provable reconstruction contrasts with more heuristic pruning approaches (Extreme Pruning Tricks[15]) and complements work on implicit regularization during training (Adam Implicit Sparsity[25]). Compared to nearby theoretical studies like Robust One-bit Recovery[21] and Learning ReLU Model[40], Sparse Neural Recovery[0] appears to focus on extending recovery guarantees to more general measurement models or network configurations, bridging classical compressed sensing ideas with the unique structure imposed by ReLU nonlinearities.

Claimed Contributions

First sparse recovery guarantee for ReLU neural network weights

The authors establish theoretical conditions under which sparse weights of two-layer scalar-output ReLU networks can be uniquely identified and efficiently recovered using iterative hard thresholding (IHT). This includes both identifiability guarantees and convergence guarantees for the recovery algorithm under random Gaussian data.

10 retrieved papers
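For reference, the recovery problem this contribution addresses can be written out for the two-layer, scalar-output ReLU architecture the abstract specifies. The exact parameterization below (hidden width $m$, no biases, a separate second-layer vector $v$) is an illustrative assumption, not taken from the paper:

```latex
% Two-layer, scalar-output ReLU network; the sparse weights
% \theta = (w_1, \dots, w_m, v) are the signal to be recovered
% from samples (x_i, f_\theta(x_i)).
f_{\theta}(x) = \sum_{j=1}^{m} v_j \,\mathrm{ReLU}\!\left(\langle w_j, x \rangle\right),
\qquad x_i \overset{\text{iid}}{\sim} \mathcal{N}(0, I_d),
\qquad \|\theta\|_0 \le s .
```

Identifiability then asks when $\theta$ is the unique $s$-sparse parameter consistent with the samples, and the convergence guarantee asks when IHT finds it.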
Structural properties enabling sparse weight recovery via IHT

The authors identify and formalize structural properties (restricted strong convexity and restricted smoothness) of sparse MLP weights that enable exact recovery. They prove these properties hold with high probability for networks satisfying certain sparsity and weight constraints when trained on Gaussian data.

10 retrieved papers
Can Refute
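Restricted strong convexity (RSC) and restricted smoothness (RSS) are standard notions in the sparse-optimization literature. A common formulation, which may differ in detail from the paper's, sandwiches the loss $f$ between two quadratics along sparse directions:

```latex
% For all \theta, \theta' with \|\theta\|_0 \le s and \|\theta'\|_0 \le s:
\frac{\alpha_s}{2}\,\|\theta' - \theta\|_2^2
\;\le\;
f(\theta') - f(\theta) - \langle \nabla f(\theta),\, \theta' - \theta \rangle
\;\le\;
\frac{\beta_s}{2}\,\|\theta' - \theta\|_2^2 .
```

When the restricted condition number $\beta_s/\alpha_s$ is small enough, IHT with a suitable step size contracts the distance to the sparse target geometrically, which is the usual route to exact-recovery proofs of this kind.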
Memory-efficient IHT algorithm for sparse MLP training

The authors develop an iterative hard thresholding algorithm that recovers sparse network weights with memory requirements that scale linearly with sparsity rather than with the full network size. This contrasts with existing methods like iterative magnitude pruning that require training dense networks first.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution
