Learning on a Razor’s Edge: Identifiability and Singularity of Polynomial Neural Networks
Overview
Overall Novelty Assessment
The paper establishes generic finite identifiability for polynomial MLPs and characterizes the singularities arising from sparse subnetworks, connecting these geometric features to critical points of the mean-squared-error loss. Within the taxonomy, it resides in the 'Deep Multi-Layer Perceptron Identifiability' leaf alongside one sibling paper examining similar polynomial MLP structures. This leaf is part of a broader 'Identifiability Analysis' branch containing four leaves across different architectures, indicating moderate research activity in identifiability questions but relatively sparse focus on deep polynomial MLPs specifically.
The taxonomy reveals neighboring research directions that contextualize this work's scope. The sibling 'Convolutional and Attention Architecture Identifiability' leaf addresses CNNs with polynomial activations, while 'Non-Polynomial Activation Identifiability' examines ReLU networks through piecewise-affine methods rather than algebraic geometry. A parallel 'Singularity Structure and Geometric Properties' branch contains leaves for singularity characterization, dimension computation, and neuroalgebraic frameworks, suggesting the paper bridges identifiability analysis with geometric singularity theory. The taxonomy's scope notes clarify that polynomial activation methods are distinct from ReLU-based approaches and that singularity analysis is separated from pure identifiability proofs.
A limited semantic search surfaced five candidate prior works. For the 'Characterization of singularities via sparse subnetworks' contribution, one of the four candidates examined was judged potentially refuting, indicating that some prior work addresses singularity-sparsity connections. For the 'Generic finite identifiability for polynomial MLPs' contribution, the single candidate examined yielded no clear refutation, suggesting this result may occupy less-explored territory within the limited search scope. The 'Critically exposed parameter sets and sparsity bias explanation' contribution was not tested against any candidate, leaving its novelty assessment incomplete. These statistics reflect a constrained literature search rather than exhaustive coverage of the field.
Based on the limited examination of five candidates, the work appears to contribute novel connections between identifiability, singularity structure, and sparsity bias for polynomial MLPs, though one contribution shows overlap with existing singularity characterization research. The taxonomy structure suggests this sits at an intersection of identifiability analysis and geometric properties, a moderately active area with approximately fifteen papers across ten research directions. The analysis cannot assess novelty against the broader algebraic geometry or neural network theory literature beyond the examined candidates.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors prove that for MLPs with sufficiently generic polynomial activations of large degree, almost all functions correspond to only finitely many parameter choices. This resolves the dimension conjecture by showing the neuromanifold dimension equals the number of parameters.
The authors characterize singular points of neuromanifolds, proving that sparse subnetworks (where certain neurons are inactive) correspond to singularities under appropriate architectural assumptions. For CNNs, this characterization is complete; for MLPs, it is partial.
The authors introduce the concept of critically exposed parameter sets and prove that MLP subnetworks are critically exposed while CNN subnetworks are not. This provides a geometric explanation for the sparsity bias observed in MLPs but not in pure single-channel CNNs.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[1] Identifiability of Deep Polynomial Neural Networks
Contribution Analysis
Detailed comparisons for each claimed contribution
Generic finite identifiability for polynomial MLPs
The authors prove that for MLPs with sufficiently generic polynomial activations of large degree, almost all functions correspond to only finitely many parameter choices. This resolves the dimension conjecture by showing the neuromanifold dimension equals the number of parameters.
[18] The determination of multivariable nonlinear models for dynamic systems using neural networks
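The dimension claim can be probed numerically on a toy instance. The sketch below is our own illustration, not a construction from the paper: the architecture (2 inputs, 2 hidden neurons, 1 output, no biases), the activation sigma(t) = t^3 + t^2 + t, and the sample point are all assumptions chosen for concreteness. It checks that the Jacobian of the map from the 6 parameters to the 9 coefficients of the output polynomial attains rank 6, so for this tiny case the neuromanifold dimension equals the parameter count.

```python
# Toy check that a small polynomial MLP's neuromanifold dimension equals its
# parameter count. Architecture (2 -> 2 -> 1, no biases) and activation are
# illustrative assumptions, not taken from the paper.
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
params = sp.symbols("w11 w12 w21 w22 a1 a2")  # 6 parameters
w11, w12, w21, w22, a1, a2 = params

def sigma(t):
    return t**3 + t**2 + t  # a "generic" degree-3 polynomial activation

f = a1 * sigma(w11 * x1 + w12 * x2) + a2 * sigma(w21 * x1 + w22 * x2)

# f is a polynomial of degree <= 3 in (x1, x2) with no constant term:
# 9 possible monomials, so the coefficient map goes from R^6 to R^9.
monomials = [x1, x2, x1**2, x1*x2, x2**2, x1**3, x1**2*x2, x1*x2**2, x2**3]
poly = sp.Poly(sp.expand(f), x1, x2)
J = sp.Matrix([poly.coeff_monomial(m) for m in monomials]).jacobian(sp.Matrix(params))

# Exact rank at one sample point. The rank can only drop on a proper
# subvariety, so rank 6 here implies the generic rank is 6 = #parameters.
point = {w11: 1, w12: 0, w21: 0, w22: 1, a1: 1, a2: 1}
rank = J.subs(point).rank()
print("parameters:", len(params), "Jacobian rank:", rank)
```

Note that a full generic rank implies the generic fiber of the parametrization is finite for this toy architecture, which is the finite-identifiability statement in miniature.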
Characterization of singularities via sparse subnetworks
The authors characterize singular points of neuromanifolds, proving that sparse subnetworks (where certain neurons are inactive) correspond to singularities under appropriate architectural assumptions. For CNNs, this characterization is complete; for MLPs, it is partial.
[2] Learning on a Razor's Edge: the Singularity Bias of Polynomial Neural Networks
[13] Manifolds of Learning
[16] Stochastic collapse: How gradient noise attracts sgd dynamics towards simpler subnetworks
[17] Singular value representation: A new graph perspective on neural networks
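The singularity-sparsity link can be illustrated on a small example. The sketch below is our own construction, not the paper's: the 2-2-1 architecture and the activation are assumed for illustration, and a rank drop of the parametrization Jacobian is evidence consistent with a singular point of the neuromanifold rather than a proof of one. Zeroing out one hidden neuron kills the three Jacobian columns belonging to that neuron, dropping the rank from 6 at a dense point to 3 at the sparse point.

```python
# Rank of the coefficient-map Jacobian of a toy 2 -> 2 -> 1 polynomial MLP
# at a dense point versus a sparse point (hidden neuron 2 zeroed out).
# The setup is an illustrative assumption, not the paper's construction.
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
params = sp.symbols("w11 w12 w21 w22 a1 a2")
w11, w12, w21, w22, a1, a2 = params

def sigma(t):
    return t**3 + t**2 + t  # note sigma(0) = 0, so an inactive neuron outputs 0

f = a1 * sigma(w11 * x1 + w12 * x2) + a2 * sigma(w21 * x1 + w22 * x2)
monomials = [x1, x2, x1**2, x1*x2, x2**2, x1**3, x1**2*x2, x1*x2**2, x2**3]
poly = sp.Poly(sp.expand(f), x1, x2)
J = sp.Matrix([poly.coeff_monomial(m) for m in monomials]).jacobian(sp.Matrix(params))

dense = {w11: 1, w12: 0, w21: 0, w22: 1, a1: 1, a2: 1}
sparse = {w11: 1, w12: 0, w21: 0, w22: 0, a1: 1, a2: 0}  # neuron 2 inactive
rank_dense = J.subs(dense).rank()
rank_sparse = J.subs(sparse).rank()
print("dense rank:", rank_dense, "sparse rank:", rank_sparse)
```

The three columns for (w21, w22, a2) vanish at the sparse point because each carries a factor of a2 or of sigma(0), which is the mechanism by which sparse subnetworks become degenerate points of the parametrization.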
Notion of critically exposed parameter sets and sparsity bias explanation
The authors introduce the concept of critically exposed parameter sets and prove that MLP subnetworks are critically exposed while CNN subnetworks are not. This provides a geometric explanation for the sparsity bias observed in MLPs but not in pure single-channel CNNs.
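The paper's notion of critically exposed parameter sets is not reproduced here, but a simpler related fact can be checked numerically: for an MLP whose activation vanishes at 0, fully zeroing a hidden neuron makes all of that neuron's gradient components of the mean-squared-error loss vanish identically, so gradient flow never leaves the sparse slice. The sketch below (data, architecture, and the chosen point are our own assumptions) verifies this with central finite differences.

```python
# Hedged toy illustration: at a point where hidden neuron 2 is fully zeroed,
# the MSE gradient components belonging to that neuron vanish identically,
# regardless of the data. Setup is assumed for illustration only.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))  # random inputs
y = rng.normal(size=50)       # arbitrary targets

def sigma(t):
    return t**3 + t**2 + t    # polynomial activation with sigma(0) = 0

def mse(theta):
    W = theta[:4].reshape(2, 2)   # hidden weights, row j for neuron j
    a = theta[4:]                 # output weights
    pred = sigma(X @ W.T) @ a
    return np.mean((pred - y) ** 2)

# Point with neuron 2 switched off: its weight row and output weight are zero.
theta = np.array([0.7, -0.3, 0.0, 0.0, 1.2, 0.0])

# Central finite-difference gradient of the loss.
h = 1e-6
grad = np.array([(mse(theta + h * e) - mse(theta - h * e)) / (2 * h)
                 for e in np.eye(6)])
print(grad)  # components 2, 3 (neuron-2 weights) and 5 (its output weight) vanish
```

This invariance of the sparse slice under gradient flow is only a necessary ingredient of the sparsity-bias story; the paper's critically-exposed criterion, which distinguishes MLPs from single-channel CNNs, is a stronger geometric condition not captured by this sketch.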