FACT: a first-principles alternative to the Neural Feature Ansatz for how networks learn representations

ICLR 2026 Conference Submission · Anonymous Authors
Keywords: feature learning, deep learning, neural feature ansatz, convergence, theory
Abstract:

Understanding how neural networks learn representations is a central challenge in deep learning. A leading approach is the Neural Feature Ansatz (NFA) (Radhakrishnan et al., 2024), a conjectured mechanism for how feature learning occurs. Although the NFA is empirically validated, it is an educated guess without a theoretical basis, so it is unclear when it might fail and how it could be improved. In this paper, we take a first-principles approach to understanding why this observation holds and when it does not. We use first-order optimality conditions to derive the Features at Convergence Theorem (FACT), an alternative to the NFA that (a) obtains greater agreement with learned features at convergence, (b) explains why the NFA holds in most settings, and (c) captures essential feature-learning phenomena in neural networks, such as grokking in modular arithmetic and phase transitions in learning sparse parities, similarly to the NFA. Thus, our results unify theoretical first-order optimality analyses of neural networks with the empirically driven NFA literature, and provide a principled alternative that provably and empirically holds at convergence.
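For reference, the NFA at the center of this abstract can be stated schematically. The following is a sketch in our own notation (not copied from the paper): Radhakrishnan et al. (2024) conjecture that, after training, each layer's weight Gram matrix is proportional, up to a matrix power $\alpha$ fit empirically, to the average gradient outer product (AGOP) of the network with respect to that layer's input.

```latex
% Neural Feature Ansatz (schematic sketch; notation ours).
% W_l: weights of layer l;  h_l(x): input to layer l;  f: the trained network.
W_\ell^\top W_\ell \;\propto\;
\left( \frac{1}{n} \sum_{i=1}^{n}
  \nabla_{h_\ell} f(x_i)\, \nabla_{h_\ell} f(x_i)^\top \right)^{\!\alpha}
```

FACT replaces this conjectured proportionality with a relation derived from first-order optimality conditions at convergence.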

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes the Features at Convergence Theorem (FACT) as a first-principles alternative to the empirically-driven Neural Feature Ansatz (NFA), deriving feature learning mechanisms from optimization theory and convergence conditions. It resides in the 'First-Principles and Optimization-Based Analysis' leaf, which contains only two papers total within the broader theoretical foundations branch. This represents a relatively sparse research direction within the taxonomy, suggesting that rigorous optimization-theoretic approaches to feature learning remain underexplored compared to empirical or application-driven methods. The sibling paper in this leaf appears to focus on related mechanistic questions, indicating a small but coherent cluster of work examining fundamental learning dynamics.

The taxonomy reveals that theoretical feature learning research is organized into three main directions: first-principles analysis (where this work sits), dynamics and evolution of representations, and transferability studies. Neighboring branches include unsupervised learning methods and interpretability techniques, which analyze learned features post-hoc rather than modeling their formation. The scope note for this leaf explicitly excludes 'empirical observations without theoretical derivation,' positioning FACT as complementary to the larger body of empirical NFA literature. The work bridges optimization theory with the empirically-validated NFA framework, potentially connecting formal convergence analysis to observed learning phenomena like grokking and phase transitions.

Among the twenty-eight candidates examined through semantic search and citation expansion, none were identified as clearly refuting any of the three main contributions. The FACT theorem itself was evaluated against ten candidates with zero refutable matches, as was the FACT-based Recursive Feature Machine algorithm. The theoretical explanation connecting NFA to first-order optimality examined eight candidates, also with no refutations found. This limited search scope suggests that within the top semantic matches, no prior work explicitly derives convergence-based feature learning mechanisms from first-order optimality conditions in the manner proposed. However, the modest search scale means potentially relevant optimization-theoretic analyses outside this candidate set remain unexamined.

Based on the available signals, the work appears to occupy a relatively novel position within the limited scope examined, particularly in formalizing the empirical NFA through optimization theory. The sparse population of the first-principles analysis leaf and absence of refuting candidates among twenty-eight examined papers suggest limited direct prior work on this specific approach. However, the analysis covers only top semantic matches and does not constitute an exhaustive survey of optimization theory applied to neural network feature learning, leaving open the possibility of related theoretical frameworks in adjacent mathematical or machine learning literature.

Taxonomy

- Core-task Taxonomy Papers: 50
- Claimed Contributions: 3
- Contribution Candidate Papers Compared: 28
- Refutable Papers: 0

Research Landscape Overview

Core task: understanding how neural networks learn representations through feature learning. The field encompasses a broad spectrum of research directions, organized into several major branches. Theoretical Foundations and Mechanisms of Feature Learning investigates the underlying principles and optimization dynamics that govern how networks discover useful features, often through first-principles analysis and mathematical frameworks. Unsupervised and Self-Supervised Representation Learning explores methods like Contrastive Predictive Coding[18] and Contrastive Self-Distillation[5] that extract structure from unlabeled data. Graph and Network Representation Learning, exemplified by Network Representation Learning[2] and Graph Representation Learning[11], focuses on encoding relational structures. Interpretability and Analysis branches, including Feature Visualization Survey[9], aim to decode what networks have learned. Architectural Innovations introduce novel designs such as Neural Discrete Representation[3] and Matryoshka Learning[7], while Multi-View and Multi-Modal approaches integrate diverse data sources. Application-Driven and Specialized Learning Paradigms address domain-specific challenges and alternative training objectives.

Within the theoretical landscape, a particularly active line of work examines optimization-based and mechanistic explanations of feature emergence, contrasting gradient-driven dynamics with structural inductive biases. FACT[0] situates itself in this first-principles analysis cluster, closely aligned with Feature Learning Mechanism[17], which similarly investigates the fundamental processes by which networks construct representations during training. While Feature Learning Mechanism[17] may emphasize empirical observations of learning trajectories, FACT[0] appears to adopt a more analytical stance, potentially leveraging optimization theory to explain when and why certain features emerge. This contrasts with interpretability-focused efforts like Feature Visualization Survey[9] or representation manipulation studies such as Representation Erasure[1], which analyze learned features post-hoc rather than modeling their formation. The central tension across these branches involves balancing mathematical rigor with empirical relevance, and understanding whether feature learning can be predicted from architectural and data properties alone.

Claimed Contributions

Features at Convergence Theorem (FACT)

The authors derive a first-principles relation based on first-order optimality conditions that neural networks must satisfy at convergence. This provides a theoretically grounded alternative to the empirically-observed Neural Feature Ansatz for understanding how networks learn representations.

10 retrieved papers

FACT-based Recursive Feature Machine algorithm

The authors develop a learning algorithm powered by FACT instead of NFA that reproduces key feature learning behaviors such as phase transitions in sparse parity learning and grokking in modular arithmetic, while achieving state-of-the-art performance on tabular data.

10 retrieved papers
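The RFM-style loop this contribution describes can be sketched as follows. This is a minimal illustrative implementation, not the paper's code: it assumes a Laplacian-kernel ridge regressor whose feature matrix `M` is updated each round with the predictor's average gradient outer product (AGOP), with gradients estimated by finite differences. Function names and hyperparameters are ours.

```python
import numpy as np

def laplace_kernel(X, Z, M, bandwidth=2.0):
    """Laplacian kernel exp(-||x - z||_M / h) under the metric induced by PSD M."""
    XM = X @ M
    d2 = (np.sum(XM * X, axis=1)[:, None]
          + np.sum((Z @ M) * Z, axis=1)[None, :]
          - 2.0 * XM @ Z.T)
    return np.exp(-np.sqrt(np.clip(d2, 0.0, None)) / bandwidth)

def rfm(X, y, n_iters=3, reg=1e-3, bandwidth=2.0, eps=1e-4):
    """Sketch of a Recursive Feature Machine: alternate a kernel ridge fit
    using feature matrix M with an AGOP update of M."""
    n, d = X.shape
    M = np.eye(d)
    for _ in range(n_iters):
        K = laplace_kernel(X, X, M, bandwidth)
        alpha = np.linalg.solve(K + reg * np.eye(n), y)  # kernel ridge fit
        G = np.zeros((d, d))
        for x in X:
            grad = np.zeros(d)
            for j in range(d):
                e = np.zeros(d)
                e[j] = eps
                # Central finite difference of the fitted predictor f(x) = K(x, X) @ alpha.
                f_plus = laplace_kernel((x + e)[None, :], X, M, bandwidth) @ alpha
                f_minus = laplace_kernel((x - e)[None, :], X, M, bandwidth) @ alpha
                grad[j] = (f_plus - f_minus).item() / (2.0 * eps)
            G += np.outer(grad, grad)  # accumulate gradient outer products
        M = G / n  # the AGOP becomes the new feature matrix
    return M, alpha
```

On a toy problem where the target depends on only one input coordinate, the learned `M` concentrates its mass on that coordinate's diagonal entry, which is the qualitative "feature selection" behavior that the sparse-parity and grokking experiments probe.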
Theoretical explanation connecting NFA to first-order optimality

The authors algebraically expand the FACT relation to show it is qualitatively similar to the NFA conjecture, providing theoretical foundation for why NFA typically holds by connecting it to provable first-order optimality conditions.

8 retrieved papers
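To illustrate the kind of first-order reasoning this contribution involves, here is a generic sketch of stationarity algebra in our own notation (not the paper's actual FACT relation): for a network with a linear layer $h_{\ell+1} = W_\ell h_\ell(x)$, any stationary point of the empirical loss satisfies

```latex
% Generic first-order optimality condition (sketch; notation ours).
% L(W) = (1/n) sum_i loss(f(x_i), y_i);  h_{l+1} = W_l h_l(x).
\nabla_{W_\ell} L
  \;=\; \frac{1}{n} \sum_{i=1}^{n} g_i \, h_\ell(x_i)^\top
  \;=\; 0,
\qquad
g_i \;:=\; \nabla_{h_{\ell+1}} \operatorname{loss}\big(f(x_i), y_i\big).
```

Contracting such identities with the weights yields relations between $W_\ell^\top W_\ell$ and averages of gradient and input outer products, which is the flavor of connection between first-order optimality and the NFA's gradient-outer-product structure that this contribution formalizes.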

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution
