Learning on a Razor’s Edge: Identifiability and Singularity of Polynomial Neural Networks
Overview
Overall Novelty Assessment
The paper establishes generic finite identifiability for polynomial MLPs and characterizes the singularities arising from sparse subnetworks, connecting these geometric features to critical points of the mean-squared-error loss. Within the taxonomy, it resides in the 'Deep Multi-Layer Perceptron Identifiability' leaf alongside one sibling paper examining similar polynomial MLP structures. This leaf is part of a broader 'Identifiability Analysis' branch containing four leaves across different architectures, indicating moderate research activity in identifiability questions but relatively sparse focus on deep polynomial MLPs specifically.
The taxonomy reveals neighboring research directions that contextualize this work's scope. The sibling 'Convolutional and Attention Architecture Identifiability' leaf addresses CNNs with polynomial activations, while 'Non-Polynomial Activation Identifiability' examines ReLU networks through piecewise-affine methods rather than algebraic geometry. A parallel 'Singularity Structure and Geometric Properties' branch contains leaves for singularity characterization, dimension computation, and neuroalgebraic frameworks, suggesting the paper bridges identifiability analysis with geometric singularity theory. The taxonomy's scope notes clarify that polynomial activation methods are distinct from ReLU-based approaches and that singularity analysis is separated from pure identifiability proofs.
A limited semantic search surfaced five candidate prior works. For the 'Characterization of singularities via sparse subnetworks' contribution, one of the four candidates examined was judged potentially refuting, indicating that some prior work addresses singularity-sparsity connections. For the 'Generic finite identifiability for polynomial MLPs' contribution, the single candidate examined yielded no clear refutation, suggesting this result may occupy less-explored territory within the limited search scope. The 'Critically exposed parameter sets and sparsity bias explanation' contribution was not tested against any candidate, leaving its novelty assessment incomplete. These statistics reflect a constrained literature search rather than exhaustive coverage of the field.
Based on the limited examination of five candidates, the work appears to contribute novel connections between identifiability, singularity structure, and sparsity bias for polynomial MLPs, though one contribution shows overlap with existing singularity characterization research. The taxonomy structure suggests this sits at an intersection of identifiability analysis and geometric properties, a moderately active area with approximately fifteen papers across ten research directions. The analysis cannot assess novelty against the broader algebraic geometry or neural network theory literature beyond the examined candidates.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors prove that for MLPs with sufficiently generic polynomial activations of large degree, almost all functions correspond to only finitely many parameter choices. This resolves the dimension conjecture by showing the neuromanifold dimension equals the number of parameters.
The authors characterize singular points of neuromanifolds, proving that sparse subnetworks (where certain neurons are inactive) correspond to singularities under appropriate architectural assumptions. For CNNs, this characterization is complete; for MLPs, it is partial.
The authors introduce the concept of critically exposed parameter sets and prove that MLP subnetworks are critically exposed while CNN subnetworks are not. This provides a geometric explanation for the sparsity bias observed in MLPs but not in pure single-channel CNNs.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[1] Identifiability of Deep Polynomial Neural Networks
Contribution Analysis
Detailed comparisons for each claimed contribution
Generic finite identifiability for polynomial MLPs
The authors prove that for MLPs with sufficiently generic polynomial activations of large degree, almost all functions correspond to only finitely many parameter choices. This resolves the dimension conjecture by showing the neuromanifold dimension equals the number of parameters.
[18] The determination of multivariable nonlinear models for dynamic systems using neural networks
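The dimension claim can be probed numerically on a toy instance. The sketch below is our own illustration, not a construction from the paper: the architecture (2 inputs, 2 hidden neurons, 1 output, no biases), the activation sigma(t) = t^3 + t^2 + t, and the sample point are all assumptions chosen for concreteness. It checks that the Jacobian of the map from the 6 parameters to the 9 coefficients of the output polynomial attains rank 6, so for this tiny case the neuromanifold dimension equals the parameter count.

```python
# Toy check that a small polynomial MLP's neuromanifold dimension equals its
# parameter count. Architecture (2 -> 2 -> 1, no biases) and activation are
# illustrative assumptions, not taken from the paper.
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
params = sp.symbols("w11 w12 w21 w22 a1 a2")  # 6 parameters
w11, w12, w21, w22, a1, a2 = params

def sigma(t):
    return t**3 + t**2 + t  # a "generic" degree-3 polynomial activation

f = a1 * sigma(w11 * x1 + w12 * x2) + a2 * sigma(w21 * x1 + w22 * x2)

# f is a polynomial of degree <= 3 in (x1, x2) with no constant term:
# 9 possible monomials, so the coefficient map goes from R^6 to R^9.
monomials = [x1, x2, x1**2, x1*x2, x2**2, x1**3, x1**2*x2, x1*x2**2, x2**3]
poly = sp.Poly(sp.expand(f), x1, x2)
J = sp.Matrix([poly.coeff_monomial(m) for m in monomials]).jacobian(sp.Matrix(params))

# Exact rank at one sample point. The rank can only drop on a proper
# subvariety, so rank 6 here implies the generic rank is 6 = #parameters.
point = {w11: 1, w12: 0, w21: 0, w22: 1, a1: 1, a2: 1}
rank = J.subs(point).rank()
print("parameters:", len(params), "Jacobian rank:", rank)
```

Note that a full generic rank implies the generic fiber of the parametrization is finite for this toy architecture, which is the finite-identifiability statement in miniature.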
Characterization of singularities via sparse subnetworks
The authors characterize singular points of neuromanifolds, proving that sparse subnetworks (where certain neurons are inactive) correspond to singularities under appropriate architectural assumptions. For CNNs, this characterization is complete; for MLPs, it is partial.
[2] Learning on a Razor's Edge: the Singularity Bias of Polynomial Neural Networks
[13] Manifolds of Learning
[16] Stochastic collapse: How gradient noise attracts sgd dynamics towards simpler subnetworks
[17] Singular value representation: A new graph perspective on neural networks
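The singularity-sparsity link can be illustrated on a small example. The sketch below is our own construction, not the paper's: the 2-2-1 architecture and the activation are assumed for illustration, and a rank drop of the parametrization Jacobian is evidence consistent with a singular point of the neuromanifold rather than a proof of one. Zeroing out one hidden neuron kills the three Jacobian columns belonging to that neuron, dropping the rank from 6 at a dense point to 3 at the sparse point.

```python
# Rank of the coefficient-map Jacobian of a toy 2 -> 2 -> 1 polynomial MLP
# at a dense point versus a sparse point (hidden neuron 2 zeroed out).
# The setup is an illustrative assumption, not the paper's construction.
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
params = sp.symbols("w11 w12 w21 w22 a1 a2")
w11, w12, w21, w22, a1, a2 = params

def sigma(t):
    return t**3 + t**2 + t  # note sigma(0) = 0, so an inactive neuron outputs 0

f = a1 * sigma(w11 * x1 + w12 * x2) + a2 * sigma(w21 * x1 + w22 * x2)
monomials = [x1, x2, x1**2, x1*x2, x2**2, x1**3, x1**2*x2, x1*x2**2, x2**3]
poly = sp.Poly(sp.expand(f), x1, x2)
J = sp.Matrix([poly.coeff_monomial(m) for m in monomials]).jacobian(sp.Matrix(params))

dense = {w11: 1, w12: 0, w21: 0, w22: 1, a1: 1, a2: 1}
sparse = {w11: 1, w12: 0, w21: 0, w22: 0, a1: 1, a2: 0}  # neuron 2 inactive
rank_dense = J.subs(dense).rank()
rank_sparse = J.subs(sparse).rank()
print("dense rank:", rank_dense, "sparse rank:", rank_sparse)
```

The three columns for (w21, w22, a2) vanish at the sparse point because each carries a factor of a2 or of sigma(0), which is the mechanism by which sparse subnetworks become degenerate points of the parametrization.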
Notion of critically exposed parameter sets and sparsity bias explanation
The authors introduce the concept of critically exposed parameter sets and prove that MLP subnetworks are critically exposed while CNN subnetworks are not. This provides a geometric explanation for the sparsity bias observed in MLPs but not in pure single-channel CNNs.
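The paper's notion of critically exposed parameter sets is not reproduced here, but a simpler related fact can be checked numerically: for an MLP whose activation vanishes at 0, fully zeroing a hidden neuron makes all of that neuron's gradient components of the mean-squared-error loss vanish identically, so gradient flow never leaves the sparse slice. The sketch below (data, architecture, and the chosen point are our own assumptions) verifies this with central finite differences.

```python
# Hedged toy illustration: at a point where hidden neuron 2 is fully zeroed,
# the MSE gradient components belonging to that neuron vanish identically,
# regardless of the data. Setup is assumed for illustration only.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))  # random inputs
y = rng.normal(size=50)       # arbitrary targets

def sigma(t):
    return t**3 + t**2 + t    # polynomial activation with sigma(0) = 0

def mse(theta):
    W = theta[:4].reshape(2, 2)   # hidden weights, row j for neuron j
    a = theta[4:]                 # output weights
    pred = sigma(X @ W.T) @ a
    return np.mean((pred - y) ** 2)

# Point with neuron 2 switched off: its weight row and output weight are zero.
theta = np.array([0.7, -0.3, 0.0, 0.0, 1.2, 0.0])

# Central finite-difference gradient of the loss.
h = 1e-6
grad = np.array([(mse(theta + h * e) - mse(theta - h * e)) / (2 * h)
                 for e in np.eye(6)])
print(grad)  # components 2, 3 (neuron-2 weights) and 5 (its output weight) vanish
```

This invariance of the sparse slice under gradient flow is only a necessary ingredient of the sparsity-bias story; the paper's critically-exposed criterion, which distinguishes MLPs from single-channel CNNs, is a stronger geometric condition not captured by this sketch.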