A Recovery Guarantee for Sparse Neural Networks
Overview
Overall Novelty Assessment
The paper establishes the first provable guarantees for recovering sparse weights in two-layer ReLU networks using iterative hard thresholding. It resides in the 'Theoretical Recovery Guarantees' leaf, which contains only four papers total, indicating a relatively sparse research direction within the broader taxonomy of 50 papers. This leaf focuses specifically on provable algorithms for exact or approximate weight recovery, distinguishing it from empirical pruning methods and training-time sparsification techniques that dominate other branches of the field.
The taxonomy reveals that neighboring work clusters around compressed sensing with generative models, sparse signal recovery applications, and training-induced sparsity methods. The paper's theoretical recovery focus diverges from the more crowded pruning and compression branches, which emphasize post-training interventions without recovery guarantees. Its sibling papers in the same leaf address one-bit compressed sensing variants and learning guarantees for shallow networks, suggesting the paper extends classical compressed sensing ideas to ReLU network weight recovery rather than exploring activation sparsity or architectural design.
Among the 30 candidates examined (10 per contribution), only the contribution on structural properties enabling IHT recovery faces a potentially refuting candidate; the other two contributions, the first recovery guarantee and the memory-efficient IHT algorithm, show no clear refutations among their candidates. This limited search scope suggests the core recovery guarantee is relatively novel within the examined literature, though the structural conditions may overlap with existing compressed sensing or shallow-network learning theory. The memory-efficiency claim faces no direct prior work among the candidates reviewed.
Based on the top-30 semantic matches and citation expansion, the work appears to occupy a sparsely populated niche at the intersection of compressed sensing theory and neural network weight recovery. The analysis does not cover exhaustive literature on general sparse optimization or broader pruning methods, which may contain relevant but less directly comparable prior work. The taxonomy structure confirms this sits in a less crowded theoretical corner compared to empirical pruning or training-time regularization branches.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors establish theoretical conditions under which sparse weights of two-layer scalar-output ReLU networks can be uniquely identified and efficiently recovered using iterative hard thresholding (IHT). This includes both identifiability guarantees and convergence guarantees for the recovery algorithm under random Gaussian data.
The authors identify and formalize structural properties (restricted strong convexity and restricted smoothness) of sparse MLP weights that enable exact recovery. They prove these properties hold with high probability for networks satisfying certain sparsity and weight constraints when trained on Gaussian data.
The authors develop an iterative hard thresholding algorithm that recovers sparse network weights with memory requirements that scale linearly with the sparsity level rather than with the full network size. This contrasts with existing methods such as iterative magnitude pruning, which require training a dense network before sparsifying it.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[21] Robust one-bit recovery via ReLU generative networks: Near-optimal statistical rate and global landscape analysis
[35] Robust one-bit recovery via ReLU generative networks: Improved statistical rates and global landscape analysis
[40] Learning and recovery in the ReLU model
Contribution Analysis
Detailed comparisons for each claimed contribution
First sparse recovery guarantee for ReLU neural network weights
The authors establish theoretical conditions under which sparse weights of two-layer scalar-output ReLU networks can be uniquely identified and efficiently recovered using iterative hard thresholding (IHT). This includes both identifiability guarantees and convergence guarantees for the recovery algorithm under random Gaussian data.
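The paper's guarantee concerns IHT applied to two-layer ReLU network weights; as a point of reference for the algorithmic template, here is a minimal sketch of classical IHT on a least-squares objective with Gaussian measurements. All names, the choice of loss, and the parameter values are illustrative, not taken from the paper:

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w, zero out the rest."""
    out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    out[idx] = w[idx]
    return out

def iht(X, y, k, n_iters=500):
    """Iterative hard thresholding for 0.5 * ||X w - y||^2
    subject to ||w||_0 <= k: a gradient step followed by
    projection onto the set of k-sparse vectors."""
    n, d = X.shape
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # conservative 1/L step size
    w = np.zeros(d)
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y)
        w = hard_threshold(w - step * grad, k)
    return w

# Demo: recover a 5-sparse vector from noiseless Gaussian measurements.
rng = np.random.default_rng(0)
d, n, k = 200, 100, 5
w_true = np.zeros(d)
w_true[rng.choice(d, k, replace=False)] = rng.standard_normal(k)
X = rng.standard_normal((n, d))
y = X @ w_true
w_hat = iht(X, y, k)
```

The same two-step structure (gradient update, then top-k projection) is what the paper's convergence analysis must control, with the least-squares loss replaced by the ReLU network loss.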
[4] On Sparsity in Overparametrised Shallow ReLU Networks
[5] ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models
[7] Inducing and exploiting activation sparsity for fast inference on deep neural networks
[15] Pushing the Limits of Sparsity: A Bag of Tricks for Extreme Pruning
[30] Overparameterized ReLU Neural Networks Learn the Simplest Model: Neural Isometry and Phase Transitions
[71] Algorithmic and theoretical aspects of sparse deep neural networks
[72] Automatic sparse connectivity learning for neural networks
[73] Efficient Sparse-Winograd Convolutional Neural Networks
[74] Trajectory growth lower bounds for random sparse deep ReLU networks
[75] Norm-based Generalization Bounds for Compositionally Sparse Neural Networks
Structural properties enabling sparse weight recovery via IHT
The authors identify and formalize structural properties (restricted strong convexity and restricted smoothness) of sparse MLP weights that enable exact recovery. They prove these properties hold with high probability for networks satisfying certain sparsity and weight constraints when trained on Gaussian data.
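For context, restricted strong convexity (RSC) and restricted smoothness (RSS) are standard conditions in the IHT literature (e.g., the M-estimation framework of [64]); a common formulation, with symbols chosen here for illustration rather than taken from the paper, is:

```latex
% For a loss f and sparsity level s, RSC with modulus \alpha_s and
% RSS with modulus L_s require, for all w, w' whose combined support
% has at most s nonzero coordinates,
\[
\frac{\alpha_s}{2}\,\lVert w - w' \rVert_2^2
\;\le\;
f(w) - f(w') - \langle \nabla f(w'),\, w - w' \rangle
\;\le\;
\frac{L_s}{2}\,\lVert w - w' \rVert_2^2 .
\]
```

IHT analyses of this type typically establish linear convergence once the restricted condition number $L_s/\alpha_s$ is suitably bounded, which matches the high-probability statement over Gaussian data described above.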
[64] On Iterative Hard Thresholding Methods for High-dimensional M-Estimation
[61] Structured Sparse Regression via Greedy Hard Thresholding
[62] Learning Sparse Distributions using Iterative Hard Thresholding
[63] Deep greedy unfolding: Sorting out argsorting in greedy sparse recovery algorithms
[65] Nonlinear structured signal estimation in high dimensions via iterative hard thresholding
[66] Robust 1-bit Compressed Sensing with Iterative Hard Thresholding
[67] Accelerated iterative hard thresholding
[68] FISTA-Net: Learning a fast iterative shrinkage thresholding network for inverse problems in imaging
[69] Sparse CCA via Precision Adjusted Iterative Thresholding
[70] Learning sparse generalized linear models with binary outcomes via iterative hard thresholding
Memory-efficient IHT algorithm for sparse MLP training
The authors develop an iterative hard thresholding algorithm that recovers sparse network weights with memory requirements that scale linearly with the sparsity level rather than with the full network size. This contrasts with existing methods such as iterative magnitude pruning, which require training a dense network before sparsifying it.
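The paper's algorithm is specific to two-layer ReLU networks; purely to illustrate how an IHT iterate can be stored in memory proportional to the sparsity k rather than the ambient dimension d, here is a hypothetical sketch. The function names and the quadratic demo loss are my own, not the paper's, and the dense gradient buffer inside the loop is transient, so only the persistent state between iterations is O(k):

```python
import numpy as np

def to_sparse(w, k):
    """Compress a dense vector to its top-k (index, value) representation."""
    idx = np.argsort(np.abs(w))[-k:]
    return idx, w[idx]

def to_dense(idx, val, d):
    """Expand an (index, value) pair back to a dense length-d vector."""
    w = np.zeros(d)
    w[idx] = val
    return w

def sparse_iht(grad_fn, d, k, step, n_iters):
    """IHT that keeps only the k-sparse iterate between iterations.
    grad_fn(w_dense) -> dense gradient; the dense buffer exists only
    inside one step, so persistent memory is O(k), not O(d)."""
    idx, val = np.array([], dtype=int), np.array([])
    for _ in range(n_iters):
        w = to_dense(idx, val, d)
        w -= step * grad_fn(w)
        idx, val = to_sparse(w, k)
    return idx, val

# Demo: quadratic loss 0.5 * ||w - w_true||^2 with a 3-sparse target.
d, k = 20, 3
w_true = np.zeros(d)
w_true[[1, 5, 9]] = [2.0, -1.0, 0.5]
idx, val = sparse_iht(lambda w: w - w_true, d, k, step=1.0, n_iters=5)
```

The contrast with iterative magnitude pruning noted above is exactly this storage pattern: IMP trains and stores a dense parameter vector before pruning, whereas the sparse iterate here never needs dense persistent storage.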