Decoupling Dynamical Richness from Representation Learning: Towards Practical Measurement

ICLR 2026 Conference Submission. Anonymous Authors.
Keywords: training dynamics, representation learning, lazy/rich regime, neural collapse, grokking, kernel methods
Abstract:

Dynamic feature transformation (the rich regime) does not always align with predictive performance (better representations), yet accuracy is often used as a proxy for richness, limiting analysis of their relationship. We propose a computationally efficient, performance-independent metric of richness grounded in the low-rank bias of rich dynamics, which recovers neural collapse as a special case. The metric is empirically more stable than existing alternatives and captures known lazy-to-rich transitions (e.g., grokking) without relying on accuracy. We further use it to examine how training factors (e.g., learning rate) relate to richness, confirming recognized assumptions and highlighting new observations (e.g., that batch normalization promotes rich dynamics). An eigendecomposition-based visualization is also introduced to support interpretability; together these provide a diagnostic tool for studying the relationship between training factors, dynamics, and representations.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a computationally efficient metric for measuring dynamical richness in neural networks that does not rely on predictive accuracy. It sits within the Feature Learning Dynamics Metrics leaf of the taxonomy, which contains only two papers total. This is a notably sparse research direction, suggesting the problem of decoupling dynamics from performance remains underexplored. The sibling paper in this leaf also addresses disentangling rich dynamics from task outcomes, indicating a nascent but focused line of inquiry.

The taxonomy reveals that neighboring leaves pursue related but distinct goals. Complexity Quantification Methods (five papers) measure temporal or spatial complexity through entropy and information theory, while Diversity and Quality Trade-offs (two papers) balance exploration with optimization in evolutionary algorithms. The original paper bridges these areas by grounding its richness metric in low-rank bias rather than entropy or diversity maintenance, positioning it at the intersection of feature learning theory and dynamical systems characterization without direct overlap with prediction-oriented branches.

Among the thirty candidates examined, none clearly refute the three core contributions. For the Dynamical Low-Rank Measure, ten candidates were examined with zero refutable matches; the same held for the connection to neural collapse and for the eigendecomposition visualization method. This suggests that, within the limited search scope, the specific combination of low-rank bias as a richness proxy, its theoretical link to neural collapse, and the proposed visualization approach appears novel. The absence of refutations may reflect both the sparse literature in this exact niche and the limited scale of the search.

Based on the top-thirty semantic matches and taxonomy structure, the work appears to occupy a relatively unexplored corner of the field. The Feature Learning Dynamics Metrics leaf is small, and no examined candidates provide overlapping prior work for any contribution. However, the search scope is inherently limited, and a broader survey might reveal related metrics or visualizations in adjacent communities not captured here.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: measuring dynamical richness independent of predictive performance. This field addresses a fundamental tension in machine learning and complex systems: how to quantify the intrinsic complexity or diversity of a system's internal dynamics without conflating it with task-specific accuracy. The taxonomy reflects four main branches. Theoretical Frameworks for Dynamics-Performance Decoupling develop formal metrics and conceptual tools to separate dynamical properties from prediction quality, often drawing on information theory and feature learning perspectives (e.g., Decoupling Dynamical Richness[0], Disentangling Rich Dynamics[43]). Dynamical Systems Characterization focuses on classical and modern techniques for analyzing temporal evolution, attractors, and complexity measures in physical and biological systems (e.g., Quantum Reservoir Computing[10], EEG Energy Landscape[24]). Prediction Applications with Dynamical Complexity examines domains where rich dynamics coexist with forecasting tasks—such as traffic flow (YOLOv8 Traffic Flow[3], Visual Features Traffic[4]), climate (ENSO RNN LSTM[13]), and landslide prediction (CNN LSTM Landslide[9])—highlighting the interplay between model expressiveness and practical performance. Algorithmic and Computational Methods provide diversity-aware optimization and sampling strategies (Quality Diversity Without Maintenance[5], Determinantal Point Processes[21]) that maintain behavioral variety alongside objective optimization. A particularly active line of work explores quality-diversity algorithms and diversity metrics that preserve exploration in evolutionary and reinforcement learning settings, balancing novelty with performance (Dynamics Aware Quality Diversity[12], Likelihood Diverse Sampling[19]). 
Another contrasting theme emerges in reservoir computing and recurrent architectures, where internal state complexity is leveraged for temporal prediction (Time History Reservoir[17], Quantum Reservoir Computing[10]), yet the relationship between reservoir richness and generalization remains subtle. Decoupling Dynamical Richness[0] sits squarely within the Theoretical Frameworks branch, specifically addressing Feature Learning Dynamics Metrics. Its emphasis on decoupling aligns closely with Disentangling Rich Dynamics[43], which also seeks to isolate dynamical properties from task outcomes. Compared to diversity-focused works like Quality Diversity Without Maintenance[5] or Determinantal Point Processes[21], which operate in optimization contexts, Decoupling Dynamical Richness[0] appears more concerned with intrinsic characterization of learning trajectories, offering a complementary lens on how systems evolve internally regardless of their final predictive success.

Claimed Contributions

Dynamical Low-Rank Measure (DLR)

The authors propose a computationally efficient metric called DLR that quantifies dynamical richness in neural networks by comparing activations before and after the last layer. This metric is grounded in the low-rank bias of rich dynamics and operates independently of predictive performance, enabling direct evaluation of training dynamics without referencing accuracy.

10 retrieved papers
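The report does not reproduce the DLR formula itself, but a low-rank-bias measure of the kind described can be sketched generically. The snippet below is an illustrative assumption, not the authors' definition: it uses a standard spectral-entropy notion of effective rank and compares activations before and after the last layer, on the premise that rich dynamics compress the feature spectrum. The names `effective_rank` and `richness_proxy` are hypothetical.

```python
import numpy as np

def effective_rank(feats: np.ndarray, eps: float = 1e-12) -> float:
    """Effective rank of an (n_samples x dim) activation matrix, via the
    entropy of the normalized singular-value spectrum (a standard
    spectral-entropy definition; assumed here, not taken from the paper)."""
    s = np.linalg.svd(feats - feats.mean(axis=0), compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]  # drop numerically-zero modes
    return float(np.exp(-(p * np.log(p)).sum()))

def richness_proxy(pre_acts: np.ndarray, post_acts: np.ndarray) -> float:
    """Ratio of effective ranks before vs. after the last layer: rich
    (feature-learning) dynamics are expected to compress the spectrum,
    so larger values indicate a stronger low-rank bias downstream."""
    return effective_rank(pre_acts) / max(effective_rank(post_acts), 1e-12)
```

Note that this proxy is performance-independent in the same spirit as DLR: it touches only activation statistics, never labels or accuracy.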
Connection between DLR and neural collapse

The authors establish that their proposed metric recovers neural collapse conditions (NC1 and NC2) as a special case when the feature kernel operator is a minimum projection operator. This theoretical connection extends the applicability of their metric beyond labeled classification tasks to more general settings.

10 retrieved papers
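For reference, the neural collapse conditions named here can be checked empirically. NC1 asks that within-class variability vanish relative to between-class variability; NC2 asks that centered class means approach a simplex equiangular tight frame, i.e., equal norms with pairwise cosine -1/(K-1) for K classes. The sketch below uses these standard textbook definitions, not the paper's exact estimators or its minimum-projection-operator formulation:

```python
import numpy as np

def nc1_within_class_collapse(feats: np.ndarray, labels: np.ndarray) -> float:
    """NC1: ratio of within-class to between-class variability; -> 0 at collapse."""
    classes = np.unique(labels)
    global_mean = feats.mean(axis=0)
    within, between = 0.0, 0.0
    for c in classes:
        fc = feats[labels == c]
        mu = fc.mean(axis=0)
        within += ((fc - mu) ** 2).sum()
        between += len(fc) * ((mu - global_mean) ** 2).sum()
    return within / between

def nc2_simplex_etf_deviation(feats: np.ndarray, labels: np.ndarray) -> float:
    """NC2: centered, normalized class means should have pairwise cosine
    -1/(K-1); returns the maximum deviation from that target."""
    classes = np.unique(labels)
    M = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    M = M - M.mean(axis=0)
    M = M / np.linalg.norm(M, axis=1, keepdims=True)
    G = M @ M.T
    K = len(classes)
    off_diag = G[~np.eye(K, dtype=bool)]
    return float(np.abs(off_diag - (-1.0 / (K - 1))).max())
```

Both quantities tend to zero under collapse, which is the special case the DLR metric is said to recover.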
Eigendecomposition-based visualization method

The authors introduce a complementary visualization technique based on eigendecomposition of the feature kernel operator that quantifies cumulative feature quality, utilization, and relative eigenvalues. This visualization method aids in interpreting the richness metric and provides insights into how features align with tasks and are utilized by the final layer.

10 retrieved papers

