Beyond Spectra: Eigenvector Overlaps in Loss Geometry
Overview
Overall Novelty Assessment
The paper develops a two-loss framework for analyzing local loss geometry through spectral properties and eigenspace overlaps between training and test Hessians. It resides in the 'Universal Laws for Train-Test Loss Interaction' leaf, of which it is currently the sole member. This sparse positioning within the broader theoretical foundations branch suggests the paper addresses a relatively unexplored formalization of multi-operator loss geometry compared to adjacent areas such as nonlinear feature map theory or asymptotic learning under distributional mismatch.
The taxonomy reveals neighboring theoretical work in spiked covariance models and asymptotic learning theory, both of which examine spectral structure but through different lenses: nonlinear feature propagation and distributional assumptions, respectively. The paper's emphasis on universal fluctuation and transfer laws distinguishes it from these sibling branches by focusing on general operator-algebraic relationships rather than model-specific derivations. Parallel branches on optimization and empirical analysis explore eigenspace control and landscape visualization, providing complementary perspectives that manipulate or measure what this work characterizes theoretically.
Among eighteen candidates examined across three contributions, none were identified as clearly refuting the proposed framework. The two-loss geometry formulation was checked against ten candidates, the universal laws against one, and the scalable algorithms against seven, with zero refutations in each case. Within this limited search scope, the specific combination of spectral data with eigenspace-overlap quantification, formalized through universal laws, appears distinct, though the small candidate pool (in particular the single candidate for the core theoretical laws) limits confidence in comprehensiveness.
Based on the top eighteen semantic matches, the work appears to occupy a novel theoretical niche formalizing train-test eigenspace interactions through universal laws. The sparse taxonomy leaf and the absence of refuting candidates suggest originality, though the limited search scale, especially the single candidate examined for the central fluctuation and transfer laws, means potentially relevant prior work in operator theory or random matrix methods may exist beyond this scope.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors propose a framework that characterizes local loss geometry using both training and test losses, showing that geometry depends not only on Hessian spectra but critically on eigenvector overlaps between train and test Hessians. This corrects the common practice of treating spectra alone as sufficient for understanding loss geometry.
The authors establish two fundamental theoretical results: a fluctuation law (Theorem 1) expressing expected test loss increment as a trace combining train/test spectra with eigenvector overlaps, and a transfer law (Theorem 2) describing how overlaps transform under noise using free probability theory.
The authors introduce computational methods combining subspace iteration for outlier eigenspaces and a generalized kernel polynomial method for bulk eigenspaces, enabling efficient estimation of overlap functions between pairs of Hessians in networks with millions of parameters without forming matrices explicitly.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
Contribution Analysis
Detailed comparisons for each claimed contribution
Two-loss framework for local loss geometry incorporating spectra and overlaps
The authors propose a framework that characterizes local loss geometry using both training and test losses, showing that geometry depends not only on Hessian spectra but critically on eigenvector overlaps between train and test Hessians. This corrects the common practice of treating spectra alone as sufficient for understanding loss geometry. A toy numerical sketch of the overlap object follows the reference list below.
[8] An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
[9] Evaluating Loss Landscapes from a Topology Perspective
[10] DCReg: Decoupled Characterization for Efficient Degenerate LiDAR Registration
[11] Connecting Parameter Magnitudes and Hessian Eigenspaces at Scale using Sketched Methods
[12] PyHessian: Neural Networks Through the Lens of the Hessian
[13] Shaping the Learning Landscape in Neural Networks Around Wide Flat Minima
[14] What Makes Looped Transformers Perform Better Than Non-Recursive Ones
[15] Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks
[16] Gradient Descent Happens in a Tiny Subspace
[17] How Sparse Can We Prune a Deep Network: A Fundamental Limit Perspective
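To make the central object concrete, the sketch below computes the squared-overlap matrix between the eigenvectors of two toy symmetric matrices standing in for train and test Hessians. This is an illustration of the quantity the contribution is built around, not code from the paper; the dimensions, matrices, and perturbation strength are arbitrary choices for the example.

```python
import numpy as np

# Toy stand-ins for train/test Hessians: two correlated symmetric PSD
# matrices. In the paper's setting these would be the Hessians of the
# train and test losses at the same parameter vector; here they are
# random and purely illustrative.
rng = np.random.default_rng(0)
d = 50
A = rng.standard_normal((d, d))
B = A + 0.3 * rng.standard_normal((d, d))
H_train = A @ A.T / d
H_test = B @ B.T / d

# Spectral decompositions H = U diag(lam) U^T (eigh: ascending order).
lam_train, U = np.linalg.eigh(H_train)
lam_test, V = np.linalg.eigh(H_test)

# Squared-overlap matrix O[i, j] = <u_i, v_j>^2. It is doubly
# stochastic (every row and column sums to 1) and equals the identity
# exactly when the two eigenbases coincide.
O = (U.T @ V) ** 2

print("row sums:", O.sum(axis=1)[:3])        # ~1.0 each
print("top-eigenvector overlap:", O[-1, -1])  # in [0, 1]
```

Because the overlap matrix is doubly stochastic, two Hessians with identical spectra can still differ sharply in how their eigenspaces align; that alignment information is exactly what the spectra-only view discards.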
Universal local fluctuation law and overlap transfer law
The authors establish two fundamental theoretical results: a fluctuation law (Theorem 1) expressing expected test loss increment as a trace combining train/test spectra with eigenvector overlaps, and a transfer law (Theorem 2) describing how overlaps transform under noise using free probability theory.
[18] Integrable Structure of the Overlaps for Integrable Non-Hermitian Random Matrices and Zeros of Random Power Series with Finitely Dependent Gaussian …
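Since only one candidate was examined for this contribution, it is worth spelling out the flavor of the claimed trace formula. The display below is a schematic derivation consistent with the contribution's description, not the paper's Theorem 1: it assumes, purely for illustration, a mean-zero random perturbation whose covariance is shaped by the train Hessian.

```latex
% Schematic only: a trace identity in the spirit of the fluctuation
% law, not the paper's exact statement. Take spectral decompositions
%   H_train = \sum_i \lambda_i u_i u_i^\top ,
%   H_test  = \sum_j \mu_j  v_j v_j^\top ,
% and (an illustrative assumption) a random parameter perturbation
% \delta with mean zero and covariance \sigma^2 H_train. The
% second-order test-loss increment then satisfies
\[
  \mathbb{E}\!\left[\tfrac{1}{2}\,\delta^\top H_{\mathrm{test}}\,\delta\right]
  = \tfrac{\sigma^2}{2}\,
    \operatorname{tr}\!\left(H_{\mathrm{test}}\, H_{\mathrm{train}}\right)
  = \tfrac{\sigma^2}{2}
    \sum_{i,j} \lambda_i\, \mu_j\, \langle u_i, v_j \rangle^2 .
\]
% The two spectra enter only jointly with the squared eigenvector
% overlaps \langle u_i, v_j \rangle^2.
```

The point of such identities is that no statistic of the two spectra alone can determine the expected increment; the coupling runs entirely through the squared overlaps.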
Scalable algorithms for estimating Hessian eigenvector overlaps
The authors introduce computational methods combining subspace iteration for outlier eigenspaces and a generalized kernel polynomial method for bulk eigenspaces, enabling efficient estimation of overlap functions between pairs of Hessians in networks with millions of parameters without forming matrices explicitly.
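As a rough illustration of the outlier half of this pipeline, the sketch below runs matrix-free subspace iteration against a Hessian-vector-product closure and compares the resulting top eigenspaces. It is a simplified stand-in, not the paper's implementation: the generalized kernel polynomial method for bulk eigenspaces is not reproduced, the `hvp` closures wrap explicit toy matrices rather than network Hessians, and all names and sizes are hypothetical.

```python
import numpy as np

def top_eigenspace(hvp, dim, k=10, iters=50, seed=0):
    """Matrix-free subspace iteration for an approximate top-k eigenspace.

    hvp: callable v -> H @ v giving Hessian-vector products, so the
    Hessian itself is never materialized.
    """
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((dim, k)))
    for _ in range(iters):
        Z = np.column_stack([hvp(Q[:, j]) for j in range(k)])
        Q, _ = np.linalg.qr(Z)  # re-orthonormalize each sweep
    # Rayleigh-Ritz step on the small k x k projected problem.
    T = Q.T @ np.column_stack([hvp(Q[:, j]) for j in range(k)])
    evals, W = np.linalg.eigh((T + T.T) / 2)   # ascending order
    return evals[::-1], Q @ W[:, ::-1]         # descending order

# Hypothetical toy Hessians: correlated random symmetric matrices.
rng = np.random.default_rng(1)
d, k = 200, 5
A = rng.standard_normal((d, d))
H_train = A @ A.T / d
H_test = H_train + 0.1 * rng.standard_normal((d, d))
H_test = (H_test + H_test.T) / 2  # keep the perturbation symmetric

evals_tr, U = top_eigenspace(lambda v: H_train @ v, d, k)
evals_te, V = top_eigenspace(lambda v: H_test @ v, d, k)

# Normalized squared overlap of the two outlier eigenspaces: 1 when
# the subspaces coincide, about k/d for independent random subspaces.
print((np.linalg.norm(U.T @ V) ** 2) / k)
```

For a real network, `hvp` would be implemented via Pearlmutter's trick (a double backward pass), which is what makes such estimates feasible at millions of parameters without ever forming the Hessians explicitly.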