Predicting Kernel Regression Learning Curves from Only Raw Data Statistics

ICLR 2026 Conference Submission · Anonymous Authors
Keywords: kernels, kernel regression, neural tangent kernel, eigenstructure, learning curves, natural data, MLPs
Abstract:

We study kernel regression with common rotation-invariant kernels on real datasets including CIFAR-5m, SVHN, and ImageNet. We give a theoretical framework that predicts learning curves (test risk vs. sample size) from only two measurements: the empirical data covariance matrix and an empirical polynomial decomposition of the target function f_*. The key new idea is an analytical approximation of a kernel's eigenvalues and eigenfunctions with respect to an anisotropic data distribution. The eigenfunctions resemble Hermite polynomials of the data, so we call this approximation the Hermite eigenstructure ansatz (HEA). We prove the HEA for Gaussian data, but we find that real image data is often "Gaussian enough" for the HEA to hold well in practice, enabling us to predict learning curves by applying prior results relating kernel eigenstructure to test risk. Extending beyond kernel regression, we empirically find that MLPs in the feature-learning regime learn Hermite polynomials in the order predicted by the HEA. Our HEA framework is a proof of concept that an end-to-end theory of learning which maps dataset structure all the way to model performance is possible for nontrivial learning algorithms on real datasets.
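
Read literally, the abstract's pipeline starts from just two measurements. The sketch below is our illustration of one plausible way to compute them (the PCA truncation, degree cap, and least-squares estimator are our choices, not the paper's procedure): the empirical covariance directly, and the target's polynomial decomposition by regressing labels onto low-degree Hermite features of whitened PCA coordinates.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander

def raw_statistics(X, y, max_degree=2):
    """Sketch of the two measurements named in the abstract:
    (1) the empirical data covariance matrix, and
    (2) a polynomial decomposition of the target f_*, here estimated by
        least-squares regression of y onto low-degree Hermite features of
        the whitened top PCA coordinates.
    All concrete choices here are illustrative assumptions."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(Xc)                 # (1) empirical covariance
    lam, U = np.linalg.eigh(cov)              # eigendecomposition (ascending)
    lam, U = lam[::-1], U[:, ::-1]            # sort descending
    lam = np.clip(lam, 1e-12, None)           # guard tiny/negative eigenvalues
    k = min(20, X.shape[1])                   # keep the top-k PCA directions
    Z = (Xc @ U[:, :k]) / np.sqrt(lam[:k])    # whitened PCA coordinates
    # Probabilists' Hermite features He_1..He_max_degree, one coordinate
    # at a time (cross terms omitted to keep the sketch small).
    feats = [np.ones(len(Z))]
    for j in range(k):
        V = hermevander(Z[:, j], max_degree)  # columns: He_0..He_max_degree
        feats.extend(V[:, 1:].T)              # drop the constant column
    Phi = np.stack(feats, axis=1)
    coeffs, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # (2) target decomposition
    return cov, coeffs
```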

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces the Hermite eigenstructure ansatz (HEA) to predict kernel regression learning curves from the empirical data covariance and a polynomial decomposition of the target function. It resides in the 'Kernel Eigenstructure and Data Distribution Modeling' leaf, which contains only two papers in total. This leaf sits within the broader 'Theoretical Foundations of Kernel Regression Learning Curves' branch, indicating a relatively sparse research direction focused on mechanistic prediction frameworks rather than purely asymptotic or statistical mechanics approaches. The small sibling count suggests that this specific angle, deriving eigenstructure approximations for anisotropic real-world data, is not yet crowded.

The taxonomy reveals neighboring leaves in 'Spectral and Statistical Mechanics Approaches' (four papers) and 'Asymptotic and Power-Law Analysis' (four papers), which address generalization error through replica methods or power-law spectral decay assumptions. The original work diverges by proposing an analytical approximation (HEA) tailored to rotation-invariant kernels and anisotropic distributions, rather than relying on asymptotic limits or generic spectral decompositions. The 'Empirical Analysis and Validation' branch (four papers across two leaves) focuses on measuring exponents on benchmarks, whereas this paper aims to predict curves from raw statistics, bridging theory and empirical structure more directly.

Among the 27 candidates examined, no contribution was clearly refuted. The HEA for rotation-invariant kernels (7 candidates, 0 refutable) and the theoretical proofs for Gaussian data (10 candidates, 0 refutable) appear novel within the limited search scope. The learning curve prediction framework (10 candidates, 0 refutable) likewise shows no substantial prior overlap. The analysis does not claim exhaustive coverage, only that top-K semantic matches and citation expansion yielded no direct precedents. The sparse taxonomy leaf and zero refutations suggest that the HEA concept and its application to real image data are relatively unexplored in the examined literature.

Given the limited search scale (27 candidates) and the paper's placement in a two-paper leaf, the work appears to occupy a distinct niche within kernel regression theory. The taxonomy structure indicates that while spectral and asymptotic methods are established, mechanistic prediction from data statistics via Hermite approximations is less developed. Acknowledging the search scope, the analysis suggests the HEA framework and its empirical validation on real datasets represent a substantive contribution, though a broader literature review might reveal related ideas in adjacent fields not captured by the current taxonomy.

Taxonomy

Core-task Taxonomy Papers: 33
Claimed Contributions: 3
Contribution Candidate Papers Compared: 27
Refutable Papers: 0

Research Landscape Overview

Core task: predicting kernel regression learning curves from data statistics. The field organizes around several complementary perspectives. Theoretical Foundations examines how kernel eigenstructure and data distribution shape asymptotic behavior, often drawing on statistical mechanics and spectral analysis to characterize generalization as sample size grows. Empirical Analysis and Validation tests these predictions against real datasets, documenting power-law decay and other scaling phenomena. Algorithmic Extensions explores optimization strategies and adaptive methods that exploit learning curve structure, while Kernel Design and Selection addresses how kernel choice influences curve shape. Applied Forecasting and Regression demonstrates these ideas in domains ranging from energy prediction to load forecasting, and Related Statistical and Machine Learning Methods connects kernel regression to broader themes in nonparametric estimation and neural scaling laws.

Recent work highlights a tension between universal scaling principles and task-specific structure. Studies such as Functional Scaling Laws[3] and Scaling Laws Redundancy[6] investigate how data redundancy and functional form govern asymptotic rates, while Spectral Bias Task Alignment[1] and Spectrum Dependent Curves[14] emphasize that eigenspectrum alignment between kernel and target determines convergence speed.

Predicting Kernel Learning Curves[0] sits within the theoretical branch focused on kernel eigenstructure and data distribution modeling, closely aligned with Comprehensive Learning Curve Analysis[12], which also examines how distributional properties drive predictive accuracy. Compared to purely empirical approaches such as Power Law Decay[5], the original work emphasizes deriving curve predictions directly from statistical summaries of the data, offering a more mechanistic view of how sample complexity unfolds in kernel methods.

Claimed Contributions

Hermite eigenstructure ansatz (HEA) for rotation-invariant kernels

The authors introduce an analytical approximation that expresses kernel eigenvalues and eigenfunctions in terms of Hermite polynomials of the data. This ansatz depends only on the empirical data covariance matrix and the kernel's level coefficients, enabling prediction of kernel eigenstructure without constructing or diagonalizing kernel matrices (a schematic form is sketched below).

7 retrieved papers
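
To fix ideas, the following schematic is our hedged reconstruction of the ansatz from the description above; the precise normalization and eigenvalue ordering are assumptions on our part, not statements quoted from the paper.

```latex
% Schematic HEA (our hedged reconstruction; normalization is assumed).
% Let \Sigma = \sum_i \lambda_i u_i u_i^\top be the empirical covariance,
% c_0, c_1, \ldots the kernel's level coefficients, and \sigma a
% multi-index with total degree |\sigma| = \sum_i \sigma_i. Then
\[
\phi_\sigma(x) \;\approx\; \prod_i \mathrm{He}_{\sigma_i}\!\left(\frac{u_i^\top x}{\sqrt{\lambda_i}}\right),
\qquad
\lambda_\sigma \;\approx\; c_{|\sigma|} \prod_i \lambda_i^{\sigma_i}.
\]
% Only the covariance spectrum and the level coefficients enter, so no
% kernel matrix is ever built or diagonalized.
```
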
Theoretical proofs of HEA for Gaussian data

The authors formally prove that the Hermite eigenstructure ansatz holds exactly for Gaussian data distributions in two limiting regimes: for wide Gaussian kernels and for dot-product kernels with fast-decaying level coefficients. These theorems provide rigorous justification for when the ansatz is valid (a standard example of such a kernel appears below).

10 retrieved papers
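
For concreteness, here is a standard example (ours, not the paper's) of a dot-product kernel whose level coefficients decay fast in the sense of the second regime:

```latex
% A dot-product kernel and its level decomposition (standard example).
\[
k(x, x') \;=\; \kappa\!\left(\frac{x^\top x'}{d}\right)
\;=\; \sum_{k \ge 0} c_k \left(\frac{x^\top x'}{d}\right)^{k},
\qquad
\kappa(t) = e^{t} \;\Longrightarrow\; c_k = \frac{1}{k!},
\]
% so the level coefficients decay factorially, i.e. "fast-decaying" in
% the sense the theorem requires.
```
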
Learning curve prediction framework from raw data statistics

The authors develop an end-to-end framework that maps minimal dataset statistics directly to kernel regression performance predictions. By combining the HEA with existing kernel eigenframework results, they predict test error curves using only the data covariance and a decomposition of the target function, without requiring kernel matrix construction (a minimal sketch of this stage follows below).

10 retrieved papers
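
To make the end-to-end claim concrete, the sketch below implements the second stage under our reading: it takes approximate eigenvalues and target coefficients as given (for example, HEA outputs) and evaluates the previously published omniscient risk estimate for kernel ridge regression (Bordelon et al. 2020; Canatar et al. 2021; Simon et al. 2021); no kernel matrix is formed. The function name, bisection scheme, and defaults are our illustration, not the paper's code.

```python
import numpy as np

def predict_learning_curve(eigvals, target_coeffs, ns, ridge=1e-8, noise_var=0.0):
    """Hedged sketch: predicted test MSE at each sample size in `ns`,
    given approximate kernel eigenvalues and the target's coefficients
    in the eigenbasis, via the published omniscient risk estimate."""
    lam = np.asarray(eigvals, dtype=float)
    v2 = np.asarray(target_coeffs, dtype=float) ** 2
    risks = []
    for n in ns:
        # Effective regularization kappa solves
        #   n = sum_i lam_i / (lam_i + kappa) + ridge / kappa.
        # The right-hand side decreases in kappa, so bisect in log space.
        lo, hi = 1e-15, 1e15
        for _ in range(200):
            mid = np.sqrt(lo * hi)
            rhs = np.sum(lam / (lam + mid)) + ridge / mid
            if rhs > n:
                lo = mid  # need a larger kappa
            else:
                hi = mid
        kappa = np.sqrt(lo * hi)
        L = lam / (lam + kappa)           # modewise learnability in [0, 1]
        e0 = n / (n - np.sum(L ** 2))     # overfitting coefficient
        risks.append(e0 * (np.sum((1.0 - L) ** 2 * v2) + noise_var))
    return np.array(risks)

# Illustrative usage with a synthetic power-law eigenstructure:
lam = 1.0 / np.arange(1, 2001) ** 2.0
v = 1.0 / np.arange(1, 2001) ** 1.5
print(predict_learning_curve(lam, v, ns=[10, 100, 1000]))
```

Bisection is done in log space because the effective regularization kappa can span many orders of magnitude as the sample size grows.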

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Hermite eigenstructure ansatz (HEA) for rotation-invariant kernels

Contribution

Theoretical proofs of HEA for Gaussian data

Contribution

Learning curve prediction framework from raw data statistics