GradPCA: Leveraging NTK Alignment for Reliable Out-of-Distribution Detection

ICLR 2026 Conference Submission (Anonymous Authors)
Keywords: Out-of-Distribution (OOD) detection, Neural Tangent Kernel (NTK)
Abstract:

We introduce GradPCA, an Out-of-Distribution (OOD) detection method that exploits the low-rank structure of neural network gradients induced by Neural Tangent Kernel (NTK) alignment. GradPCA applies Principal Component Analysis (PCA) to gradient class-means, achieving more consistent performance than existing methods across standard image classification benchmarks. We provide a theoretical perspective on spectral OOD detection in neural networks to support GradPCA, highlighting feature-space properties that enable effective detection and naturally emerge from NTK alignment. Our analysis further reveals that feature quality—particularly the use of pretrained versus non-pretrained representations—plays a crucial role in determining which detectors will succeed. Extensive experiments validate the strong performance of GradPCA, and our theoretical framework offers guidance for designing more principled spectral OOD detectors.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces GradPCA, a method applying Principal Component Analysis to gradient class-means for OOD detection, and positions itself within the Low-Dimensional and Spectral Gradient Analysis leaf of the taxonomy. This leaf contains only three papers total, including the original work, indicating a relatively sparse research direction. The sibling papers explore alternative dimensionality-reduction schemes and orthogonality constraints, suggesting that spectral gradient methods remain an emerging area rather than a crowded subfield.

The taxonomy reveals that GradPCA sits within the broader Gradient-Based OOD Detection Methods branch, which encompasses five distinct leaves spanning gradient norms, spectral analysis, attribution methods, loss landscape geometry, and uncertainty estimation. Neighboring directions include Gradient Norm and Vector-Based Detection (three papers) and Gradient-Based Uncertainty and Confidence Estimation (four papers), both of which explore gradient statistics without dimensionality reduction. The taxonomy's scope and exclusion notes clarify that GradPCA's spectral approach differentiates it from full-vector methods, while its inference-time focus separates it from training-regularization techniques in sibling branches.

Among the thirty candidates examined, the GradPCA method contribution drew two refutable candidates out of the ten examined for it, suggesting that some prior work on spectral gradient techniques exists but is not extensive. The theoretical-framework contribution found no refutable candidates among its ten, indicating potential novelty in formalizing spectral OOD detection through NTK alignment. The feature-quality contribution identified three refutable candidates out of ten, reflecting existing awareness that pretrained representations influence OOD detector performance, though the specific analysis may still offer new insights within the limited search scope.

Based on the limited literature search covering thirty semantically similar candidates, GradPCA appears to occupy a moderately explored niche within spectral gradient methods. The sparse taxonomy leaf and modest refutation counts suggest incremental advancement over existing spectral approaches rather than a fundamentally new direction, though the theoretical framing and feature quality analysis may provide distinct contributions not fully captured by top-K semantic matching alone.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 5

Research Landscape Overview

Core task: Out-of-distribution detection using neural network gradients. The field organizes around several major branches that reflect different ways gradients can inform OOD detection and robustness. Gradient-Based OOD Detection Methods extract discriminative signals directly from gradient statistics or low-dimensional projections, with works like GradPCA[0] and Low-dimensional Gradient[5] exploring spectral and dimensionality-reduction techniques. Gradient-Regularized Training for OOD Robustness emphasizes modifying the training objective to encourage gradient properties that generalize better, as seen in Gradient Regularized OOD[3] and Fishr[11]. Meanwhile, Gradient-Based Meta-Learning and Task Generalization leverages gradient information to adapt quickly across tasks, and Gradient-Informed Adversarial and Robustness Analysis examines how gradient behavior relates to adversarial vulnerabilities. Additional branches cover model interpretability, gradient-agnostic baselines, domain adaptation, and specialized applications, forming a taxonomy that spans detection methods, training strategies, and analytical perspectives.

A particularly active line of work centers on low-dimensional and spectral gradient analysis, where researchers investigate whether projecting high-dimensional gradients onto principal subspaces or orthogonal directions can yield robust OOD scores. GradPCA[0] sits squarely in this cluster, proposing principal component analysis of gradients to capture distributional shifts. This approach contrasts with Gradorth[6], which emphasizes orthogonality constraints, and complements Low-dimensional Gradient[5], which explores alternative dimensionality-reduction schemes. Across these methods, a recurring theme is the trade-off between computational efficiency and the richness of gradient information retained.
Meanwhile, works like Gradient Vectors OOD[1] and Gradient Regularized OOD[3] highlight how gradient norms or variance can serve as uncertainty proxies, raising open questions about which gradient statistics are most informative and whether spectral methods offer advantages over simpler norm-based heuristics in diverse settings.

Claimed Contributions

GradPCA method for OOD detection

The authors introduce GradPCA, a novel OOD detection method that applies PCA to gradient class-means to exploit the low-dimensional subspace structure induced by NTK alignment. This is the first OOD detector to explicitly leverage NTK alignment, achieving robust performance across realistic detection scenarios.
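The mechanics described here (PCA on gradient class-means, then scoring a test sample by how far its gradient falls outside the resulting principal subspace) can be sketched as follows. This is a minimal NumPy illustration under our own assumptions, not the paper's implementation: the random `class_mean_grads` matrix stands in for per-class mean training gradients, and `gradpca_score` is a hypothetical residual-norm score.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: in the paper, each row would be the mean
# training gradient for one class (a high-dimensional vector).
num_classes, dim = 10, 256
class_mean_grads = rng.normal(size=(num_classes, dim))

# PCA of the class-means: centre them, then take the top-k right
# singular vectors as an orthonormal basis of the principal subspace.
centre = class_mean_grads.mean(axis=0)
_, _, vt = np.linalg.svd(class_mean_grads - centre, full_matrices=False)
k = num_classes - 1  # full span of the centred class-means; in practice k is a hyperparameter
subspace = vt[:k]    # shape (k, dim)

def gradpca_score(grad: np.ndarray) -> float:
    """Residual norm of a test gradient outside the principal subspace.

    A larger residual means the gradient lies outside the low-rank
    structure of the class-means, i.e. is more likely OOD.
    """
    g = grad - centre
    proj = subspace.T @ (subspace @ g)
    return float(np.linalg.norm(g - proj))

# A gradient inside the span of the class-means scores near zero;
# an arbitrary high-dimensional direction scores much higher.
in_dist = 0.5 * class_mean_grads[0] + 0.5 * class_mean_grads[1]
ood = rng.normal(size=dim)
print(gradpca_score(in_dist), gradpca_score(ood))
```

Residual (reconstruction-error) scoring is the standard way to turn a PCA subspace into a detector; whether GradPCA scores exactly this way, or uses a related spectral statistic, is not specified here.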

Retrieved papers: 10 (Can Refute)
Theoretical framework for spectral OOD detection in neural networks

The authors develop a theoretical framework extending classical and kernel PCA principles to neural networks, enabling the derivation of one-sided, per-sample OOD certificates for spectral detectors. This provides rare theoretical guarantees in the predominantly empirical OOD detection literature.
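One plausible shape for such a one-sided, per-sample certificate (notation ours; the paper's exact statement may differ) scores a test point by the residual of its gradient feature outside the principal subspace:

```latex
% Sketch of a one-sided spectral OOD score (our notation, not the paper's).
% P_k: orthogonal projector onto the top-k principal subspace of the
% in-distribution gradient class-means; g(x): gradient feature of input x.
s(x) \;=\; \bigl\| (I - P_k)\, g(x) \bigr\|_2,
\qquad s(x) > \tau \;\Longrightarrow\; \text{flag } x \text{ as OOD}.
```

The certificate is one-sided in that a large residual $s(x)$ witnesses departure from the training subspace, while a small residual does not certify that $x$ is in-distribution; the threshold $\tau$ would typically be calibrated from in-distribution scores.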

Retrieved papers: 10
Feature quality as critical factor for OOD detection performance

The authors demonstrate that feature quality—whether representations come from pretrained versus non-pretrained models—plays a crucial role in determining which OOD detectors succeed. They show that regularity-based methods improve with pretrained features while abnormality-based methods often worsen, offering guidance for detector selection.

Retrieved papers: 10 (Can Refute)

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: GradPCA method for OOD detection

Contribution: Theoretical framework for spectral OOD detection in neural networks

Contribution: Feature quality as critical factor for OOD detection performance