Overparametrization bends the landscape: BBP transitions at initialization in simple Neural Networks
Overview
Overall Novelty Assessment
The paper contributes a field-theoretic analysis of Hessian spectra at initialization in overparametrized neural networks, identifying Baik–Ben Arous–Péché (BBP) transitions that separate informative from uninformative initialization regimes. It resides in the 'Phase Transitions and Critical Phenomena at Initialization' leaf, which contains only two papers total. This sparse population suggests the paper addresses a relatively specialized research direction within the broader Hessian analysis landscape, focusing on critical phenomena rather than general spectral characterization or empirical measurement.
The taxonomy tree reveals that the paper's immediate parent branch, 'Theoretical Characterization of Hessian Spectral Properties', contains a sibling leaf on 'Asymptotic Spectral Analysis and Random Matrix Theory' with three papers. Neighboring branches include 'Empirical Analysis of Hessian Structure' (three papers across two leaves) and 'Initialization Schemes and Their Impact' (four papers). The paper's use of field theory and random matrix techniques connects it to asymptotic spectral work, while its focus on overparameterization and information-theoretic thresholds distinguishes it from purely empirical eigenvalue distribution studies or initialization scheme proposals.
Among the thirty candidates examined, none clearly refuted any of the three main contributions. For Contribution A (BBP transitions in overparametrized networks), ten candidates were examined and none provided a refuting match; Contributions B (continuous versus discontinuous transitions) and C (weak-recovery threshold via infinite overparametrization) were each checked against ten candidates with the same outcome. This suggests that, within the limited search scope, the specific combination of BBP transition analysis, overparametrization effects, and information-theoretic threshold characterization is relatively unexplored, though the absence of refutations does not guarantee exhaustive novelty.
Given the sparse taxonomy leaf (two papers) and zero refutations across thirty candidates, the work appears to occupy a distinct niche within Hessian initialization theory. However, the limited search scope means that potentially relevant work in statistical physics, phase retrieval, or teacher-student frameworks outside the top thirty semantic matches may not have been captured. The analysis reflects novelty within the examined literature but cannot rule out overlooked connections in adjacent fields.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors apply field-theoretic techniques to study the Hessian spectrum at initialization in a teacher-student setup with two-layer networks. They characterize the BBP transition that determines when random initialization contains information about the teacher signal, extending this analysis to overparametrized settings beyond standard phase retrieval.
The authors identify and distinguish two qualitatively different types of BBP transitions (continuous and discontinuous) that arise depending on overparametrization level and loss normalization. They show that higher overparametrization systematically leads to discontinuous transitions with strong finite-size effects.
The authors prove that in the limit of infinite overparametrization, the BBP transition threshold converges to the information-theoretic weak-recovery threshold. This shows that, through overparametrization alone, spectral analysis of the Hessian at initialization can detect the teacher signal down to the information-theoretically optimal (weak-recovery) threshold.
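For orientation, the display below summarizes the canonical BBP picture for a rank-one spiked GOE matrix. This is a generic reference model stated as background, with a normalization (bulk edge at 2, unit signal direction, signal strength theta) chosen here for illustration; it is not the paper's specific Hessian ensemble or field-theoretic computation.

```latex
% Canonical rank-one spiked GOE reference model (background only, not the
% paper's Hessian ensemble): M = W + \theta\, v v^{\top}, where W is a GOE
% matrix whose bulk spectrum fills [-2, 2] and v is a fixed unit vector
% playing the role of the signal (teacher) direction.
\[
\lambda_{\max}(M) \;\xrightarrow[N\to\infty]{}\;
\begin{cases}
2, & \theta \le 1 \quad \text{(no outlier: the top eigenvalue sticks to the bulk edge)},\\[4pt]
\theta + \dfrac{1}{\theta}, & \theta > 1 \quad \text{(an outlier detaches from the bulk: the BBP transition)},
\end{cases}
\]
\[
\langle v_{\max},\, v\rangle^{2} \;\xrightarrow[N\to\infty]{}\; \max\!\bigl(0,\; 1 - \theta^{-2}\bigr),
\]
% so the extreme eigenvector carries information about the signal direction
% exactly when the signal-to-noise ratio exceeds the threshold \theta = 1.
```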
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[11] Deconstructing the Goldilocks Zone of Neural Network Initialization
Contribution Analysis
Detailed comparisons for each claimed contribution
Analysis of BBP transitions in overparametrized neural networks at initialization
The authors apply field-theoretic techniques to study the Hessian spectrum at initialization in a teacher-student setup with two-layer networks. They characterize the BBP transition that determines when random initialization contains information about the teacher signal, extending this analysis to overparametrized settings beyond standard phase retrieval.
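Before the candidate comparisons, a minimal numerical sketch of the generic BBP phenomenon is included below. It uses a rank-one spiked GOE matrix as a stand-in; the model, normalization, and signal strengths scanned are assumptions made for illustration and do not reproduce the paper's field-theoretic Hessian analysis.

```python
# Minimal numerical sketch of the generic BBP phenomenon in a rank-one spiked
# GOE matrix (illustrative stand-in, not the paper's Hessian computation).
import numpy as np

rng = np.random.default_rng(0)
N = 1000

# Fixed unit "signal" direction, playing the role of the teacher.
v = rng.standard_normal(N)
v /= np.linalg.norm(v)

# GOE noise matrix normalized so that the bulk spectrum fills [-2, 2].
A = rng.standard_normal((N, N))
W = (A + A.T) / np.sqrt(2 * N)

for theta in (0.5, 1.5, 3.0):            # signal strengths below / above the threshold theta = 1
    M = W + theta * np.outer(v, v)       # rank-one spike added to the noise
    eigvals, eigvecs = np.linalg.eigh(M)
    lam_max = eigvals[-1]                # top eigenvalue
    overlap2 = (eigvecs[:, -1] @ v) ** 2 # squared alignment of top eigenvector with the signal
    # Asymptotic predictions: lambda_max -> theta + 1/theta and
    # overlap^2 -> 1 - 1/theta^2 for theta > 1; otherwise the top eigenvalue
    # sticks to the bulk edge (~2) and the overlap vanishes.
    print(f"theta={theta:3.1f}  lambda_max={lam_max:5.2f}  overlap^2={overlap2:5.3f}")
```

An extreme eigenvalue detaching from the bulk, together with a macroscopic eigenvector overlap, is the signature of an informative initialization; the candidates examined for this contribution are listed below.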
[3] Towards quantifying the Hessian structure of neural networks
[4] Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond
[5] Revisiting Initialization of Neural Networks
[9] Shallow univariate ReLU networks as splines: initialization, loss surface, Hessian, and gradient flow dynamics
[11] Deconstructing the Goldilocks Zone of Neural Network Initialization
[12] Shallow Univariate ReLU Networks as Splines: Initialization, Loss Surface, Hessian, & Gradient Flow Dynamics
[13] The asymptotic spectrum of the Hessian of DNN throughout training
[17] The Challenges of the Nonlinear Regime for Physics-Informed Neural Networks
[18] Vanishing Curvature and the Power of Adaptive Methods in Randomly Initialized Deep Networks
[19] Fishing For Cheap And Efficient Pruners At Initialization
Characterization of continuous versus discontinuous BBP transitions under overparametrization
The authors identify and distinguish two qualitatively different types of BBP transitions (continuous and discontinuous) that arise depending on overparametrization level and loss normalization. They show that higher overparametrization systematically leads to discontinuous transitions with strong finite-size effects.
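As background for this distinction (a generic order-parameter formulation, not necessarily the paper's precise criterion), the two transition types can be phrased through the overlap between the extreme Hessian eigenvector and the teacher direction:

```latex
% Generic order-parameter picture (background; the symbols u_1, v^*, \theta_c
% are notation introduced here, not taken from the paper).
\[
q(\theta) \;=\; \langle u_{1},\, v^{*}\rangle^{2},
\qquad
\text{continuous transition: } q(\theta)\to 0^{+} \text{ as } \theta \downarrow \theta_{c},
\qquad
\text{discontinuous transition: } q(\theta_{c}^{+}) > 0.
\]
% A jump of q at the threshold is the hallmark of a discontinuous transition
% and typically comes with strong finite-size effects, consistent with the
% behavior described for the overparametrized regime above.
```

The candidates examined for this contribution are listed below.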
[25] Overparameterized ReLU neural networks learn the simplest model: Neural isometry and phase transitions
[30] Memorizing without overfitting: Bias, variance, and interpolation in overparameterized models
[31] Learning through atypical phase transitions in overparameterized neural networks
[32] Theory of overparametrization in quantum neural networks
[33] Neural models for prediction of spatially patterned phase transitions: methods and challenges
[34] Optimal generalisation and learning transition in extensive-width shallow neural networks near interpolation
[35] Hidden progress in deep learning: SGD learns parities near the computational limit
[36] A jamming transition from under- to over-parametrization affects generalization in deep learning
[37] Understanding pathologies of deep heteroskedastic regression
[38] Bias-variance decomposition of overparameterized regression with random linear features
Demonstration that infinite overparametrization achieves information-theoretic weak-recovery threshold
The authors prove that in the limit of infinite overparametrization, the BBP transition threshold converges to the information-theoretic weak-recovery threshold. This shows that, through overparametrization alone, spectral analysis of the Hessian at initialization can detect the teacher signal down to the information-theoretically optimal (weak-recovery) threshold.
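To make the weak-recovery question concrete, the sketch below computes the Hessian of a toy single-neuron, phase-retrieval-style teacher-student loss at a random initialization and checks whether its extreme eigenvector aligns with the teacher. This is a deliberately simplified stand-in (the loss, the sample ratios, and the single-neuron student are assumptions made for illustration); the paper's analysis concerns overparametrized two-layer networks and the limit of infinite overparametrization, which this sketch does not reproduce.

```python
# Toy check of "is the Hessian at initialization informative about the teacher?"
# for a single-neuron phase-retrieval-style loss with a closed-form Hessian.
# Illustrative stand-in only; not the paper's overparametrized two-layer setup.
import numpy as np

rng = np.random.default_rng(1)
d = 400

# Unit-norm teacher; per-sample loss 0.25 * ((x.w)^2 - y)^2 with y = (x.w*)^2,
# whose Hessian at a point w is (1/n) * sum_mu (3 (x_mu.w)^2 - y_mu) x_mu x_mu^T.
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)

def hessian_at(w, X, y):
    coeff = 3.0 * (X @ w) ** 2 - y            # per-sample curvature weights
    return (X.T * coeff) @ X / len(y)

for alpha in (0.5, 2.0, 8.0):                 # sample ratio n / d (arbitrary illustrative values)
    n = int(alpha * d)
    X = rng.standard_normal((n, d))
    y = (X @ w_star) ** 2                     # noiseless teacher labels
    w0 = rng.standard_normal(d) / np.sqrt(d)  # random initialization, uncorrelated with the teacher
    H = hessian_at(w0, X, y)
    eigvals, eigvecs = np.linalg.eigh(H)
    # If the initialization "knows" about the teacher, that information shows up
    # as an extreme (here: most negative) eigenvalue whose eigenvector has a
    # macroscopic overlap with w_star; otherwise the overlap is O(1/d).
    overlap2 = (eigvecs[:, 0] @ w_star) ** 2
    print(f"alpha={alpha:3.1f}  lambda_min={eigvals[0]:7.2f}  overlap^2={overlap2:6.3f}")
```

The sketch only illustrates the qualitative phenomenon; the sample ratios at which the overlap becomes macroscopic in this toy model should not be read as the paper's thresholds.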