Abstract:

For overparameterized linear regression with isotropic Gaussian design and the minimum-ℓp interpolator ŵp, p ∈ (1, 2], we give a unified, high-probability characterization of how the family of parameter norms {‖ŵp‖r : r ∈ [1, p]} scales with sample size.

We solve this basic but previously unresolved question through a simple dual-ray analysis, which reveals a competition between a signal spike and a bulk of null coordinates in X⊤Y. The analysis yields closed-form predictions for (i) a data-dependent transition size n⋆ (the "elbow"), and (ii) a universal threshold r⋆ = 2(p−1) that separates the norms ‖ŵp‖r that plateau from those that continue to grow with an explicit exponent.
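As a quick worked check of where this threshold sits (our arithmetic, not a claim from the paper): since the family ranges over r ∈ [1, p], the threshold r⋆ = 2(p−1) lies inside the family exactly when p ∈ [3/2, 2].

```latex
% Where does r_\star = 2(p-1) fall relative to the family r \in [1, p]?
r_\star \ge 1 \iff 2(p-1) \ge 1 \iff p \ge \tfrac{3}{2}, \qquad
r_\star \le p \iff 2(p-1) \le p \iff p \le 2.
% Examples: p = 3/2 \Rightarrow r_\star = 1; \quad
%           p = 7/4 \Rightarrow r_\star = 3/2; \quad
%           p = 2   \Rightarrow r_\star = 2.
% For p < 3/2 the threshold sits below the whole family, so every
% \ell_r norm with r \in [1, p] falls on the same side of r_\star.
```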

This unified solution resolves the scaling of all ℓr norms in the family r ∈ [1, p] under ℓp-biased interpolation, and explains in one picture which norms saturate and which increase as n grows.
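To make the quantities concrete, here is a minimal simulation sketch (our illustration; the planted 1-sparse signal w_star, noiseless labels, problem sizes, and the cvxpy solver are all assumptions, not details from the paper). It computes the minimum-ℓp interpolator for an isotropic Gaussian design at several sample sizes and prints the ℓr norms in the family r ∈ [1, p]:

```python
# Minimal sketch: minimum-l_p interpolation under isotropic Gaussian design.
import numpy as np
import cvxpy as cp

def min_lp_interpolator(X, y, p):
    """Solve argmin_w ||w||_p subject to X w = y (convex for p >= 1)."""
    w = cp.Variable(X.shape[1])
    cp.Problem(cp.Minimize(cp.pnorm(w, p)), [X @ w == y]).solve()
    return w.value

rng = np.random.default_rng(0)
d, p = 1000, 1.5                     # ambient dimension, l_p geometry
w_star = np.zeros(d)
w_star[0] = 1.0                      # planted spike (assumption)

for n in [25, 50, 100, 200]:         # overparameterized regime: n << d
    X = rng.standard_normal((n, d))  # isotropic Gaussian design
    y = X @ w_star                   # noiseless responses (assumption)
    w_hat = min_lp_interpolator(X, y, p)
    norms = {r: np.linalg.norm(w_hat, r) for r in (1.0, 1.25, 1.5)}
    print(n, norms)
```

Plotting each norm against n on log-log axes is where the elbow at n⋆ and the plateau-versus-growth split around r⋆ = 2(p−1) would be expected to appear.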

We then study diagonal linear networks (DLNs) trained by gradient descent. By calibrating the initialization scale α to an effective geometry peff(α) via the DLN separable potential, we show empirically that DLNs inherit the same elbow/threshold laws, providing a predictive bridge between explicit and implicit bias.
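For readers who want to reproduce the qualitative behavior, here is a minimal DLN training sketch under one standard parameterization, w = u⊙u − v⊙v with initialization u = v = α·1 (our assumption; the paper's exact parameterization, step sizes, and its peff(α) calibration map are not reproduced here):

```python
# Minimal sketch (not the authors' code): a diagonal linear network
# w = u*u - v*v trained by full-batch gradient descent from u = v = alpha*1.
# The parameterization, step size, and step count are illustrative choices.
import numpy as np

def train_dln(X, y, alpha, lr=1e-3, steps=200_000):
    n, d = X.shape
    u = alpha * np.ones(d)
    v = alpha * np.ones(d)
    for _ in range(steps):
        w = u * u - v * v
        g = X.T @ (X @ w - y) / n  # gradient of (1/2n)||Xw - y||^2 w.r.t. w
        u -= lr * (2 * u * g)      # chain rule: dw/du = 2u
        v -= lr * (-2 * v * g)     # chain rule: dw/dv = -2v
    return u * u - v * v

# Usage idea: sweep alpha, measure each run's norm-vs-n curves, and compare
# against minimum-l_p interpolators with p = p_eff(alpha).
```

The known qualitative picture for this parameterization is that small α biases gradient descent toward ℓ1-like solutions and large α toward ℓ2-like ones, which is what makes a calibration α ↦ peff(α) meaningful.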

Given that many generalization proxies depend on ‖ŵp‖r, our results suggest that their predictive power will depend sensitively on which ℓr norm is used.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper provides a unified closed-form characterization of parameter norm scaling for minimum-ℓp interpolators in overparameterized linear regression with isotropic Gaussian design. It resides in the 'Closed-Form Characterizations for Isotropic Gaussian Design' leaf, which contains only three papers total (including this one). This represents a sparse, highly specialized research direction within the broader study of explicit bias via minimum-norm interpolation, suggesting the work addresses a focused theoretical gap in understanding how different ℓr norms scale across the family r∈[1,p].

The taxonomy reveals a single main branch ('Explicit Bias via Minimum-Norm Interpolation') with one active leaf, indicating limited diversification in this research area. The scope explicitly excludes implicit bias from optimization dynamics, positioning this work within a purely regularization-theoretic framework. The two sibling papers in the same leaf likely address related norm-scaling questions under similar Gaussian assumptions, but the taxonomy structure suggests neighboring directions (e.g., non-Gaussian designs, implicit bias from gradient descent) remain largely unexplored in the current literature base.

Among the 25 candidates examined across the three contributions, no refuting prior work was identified: the first contribution (unified scaling laws) was checked against 8 candidates, the dual-ray analysis against 7, and the diagonal linear network extension against 10, with zero refutations in each case. This suggests that, within the limited search scope (focused on top semantic matches and citations), the specific combination of unified ℓr-norm families, spike-bulk competition analysis, and the data-dependent transition n⋆ appears to have no direct precedent in the examined literature.

The analysis covers a narrow semantic neighborhood (25 papers) rather than an exhaustive survey of overparameterized regression. The absence of refutations reflects the search scope and the paper's technical specificity (e.g., the threshold r⋆ = 2(p−1), calibration via the DLN separable potential) rather than a definitive claim of field-wide novelty. Broader connections to implicit bias, non-Gaussian settings, or empirical deep learning remain outside this assessment's purview.

Taxonomy

Core-task Taxonomy Papers: 2
Claimed Contributions: 3
Contribution Candidate Papers Compared: 25
Refutable Papers: 0

Research Landscape Overview

Core task: Scaling of parameter norms with sample size in overparameterized linear regression. This field examines how the magnitude of learned parameters behaves as training data grows in settings where the number of parameters exceeds the number of samples.

The taxonomy centers on a single main branch, Explicit Bias via Minimum-Norm Interpolation, which focuses on characterizing the implicit regularization that arises when fitting overparameterized models by selecting the minimum-norm solution among all interpolating predictors. Within this branch, researchers derive closed-form expressions and asymptotic scaling laws, often under idealized design assumptions such as isotropic Gaussian features, to understand how parameter norms evolve with sample size and to connect these norms to generalization performance. A particularly active line of work within this branch investigates closed-form characterizations for isotropic Gaussian design matrices, where the statistical structure of the data enables precise mathematical analysis.

Norm Scaling Overparameterized[0] sits squarely in this cluster, providing explicit scaling results that complement closely related studies such as Norm Scaling Overparameterized[1] and Norm Scaling Overparameterized[2], which also explore norm behavior under similar design conditions. The main themes across these works involve trade-offs between model complexity, sample efficiency, and the stability of minimum-norm solutions, with open questions remaining about how these insights extend to more realistic, non-Gaussian or anisotropic settings. By focusing on tractable Gaussian scenarios, Norm Scaling Overparameterized[0] contributes foundational understanding of parameter scaling that may inform broader theories of implicit bias in overparameterized learning.

Claimed Contributions

Unified closed-form scaling laws for parameter norm families under ℓp bias

The authors derive the first unified closed-form scaling laws characterizing how the entire family of ℓr norms scales with sample size for minimum-ℓp interpolators in overparameterized linear regression. They identify a universal threshold r⋆ = 2(p−1) separating norms that plateau from those that grow, and provide explicit expressions for the transition size n⋆ and growth exponents in both spike- and bulk-dominated regimes.

8 retrieved papers
Dual-ray analysis revealing spike-bulk competition

The authors introduce a one-dimensional dual-ray analysis technique that exposes the competition between the signal spike and the bulk of null coordinates in X⊤Y. This analysis yields closed-form predictions for both a data-dependent transition point n⋆ and the universal threshold r⋆ that determines which norms plateau and which continue to grow.

7 retrieved papers
Extension to diagonal linear networks via initialization-to-geometry calibration

The authors extend their theoretical framework to diagonal linear networks trained by gradient descent by developing a calibration map from initialization scale α to an effective geometry parameter peff(α). This calibration demonstrates that DLNs exhibit the same elbow and threshold behavior as explicit minimum-ℓp interpolation, providing a predictive bridge between explicit and implicit bias.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Unified closed-form scaling laws for parameter norm families under ℓp bias

The authors derive the first unified closed-form scaling laws characterizing how the entire family of ℓr norms scales with sample size for minimum-ℓp interpolators in overparameterized linear regression. They identify a universal threshold r⋆ = 2(p−1) separating norms that plateau from those that grow, and provide explicit expressions for the transition size n⋆ and growth exponents in both spike- and bulk-dominated regimes.

Contribution

Dual-ray analysis revealing spike-bulk competition

The authors introduce a one-dimensional dual-ray analysis technique that exposes the competition between the signal spike and the bulk of null coordinates in X⊤Y. This analysis yields closed-form predictions for both a data-dependent transition point n⋆ and the universal threshold r⋆ that determines which norms plateau and which continue to grow.

Contribution

Extension to diagonal linear networks via initialization-to-geometry calibration

The authors extend their theoretical framework to diagonal linear networks trained by gradient descent by developing a calibration map from initialization scale α to an effective geometry parameter peff(α). This calibration demonstrates that DLNs exhibit the same elbow and threshold behavior as explicit minimum-lp interpolation, providing a predictive bridge between explicit and implicit bias.