Intrinsic training dynamics of deep neural networks
Overview
Overall Novelty Assessment
The paper investigates when gradient flow on network parameters induces an intrinsic gradient flow on a lifted variable, introducing a criterion based on conservation laws and kernel inclusions. It resides in the 'Intrinsic Dynamics and Conservation Laws' leaf, which contains only this single paper within the 50-paper taxonomy. This isolation suggests the specific focus on intrinsic dynamics via conservation laws and lifted parameterizations is relatively unexplored in the surveyed literature, occupying a sparse niche within the broader Training Dynamics and Trajectory Analysis branch.
The taxonomy reveals that neighboring leaves address related but distinct aspects of training dynamics. The 'Neural Tangent Kernel and Linearization Regimes' category examines lazy training where features remain fixed, while 'Feature Learning and Adaptive Regimes' studies nonlinear feature evolution. The 'Stability and Dynamical Properties' leaf characterizes perturbation effects and stable minima. This paper's focus on conservation laws and intrinsic geometric structure bridges these areas by providing a framework to understand when parameter-space dynamics can be rewritten in lower-dimensional lifted coordinates, complementing but not directly overlapping with kernel-regime or feature-learning analyses.
Among the fourteen candidates examined, no contribution was clearly refuted by prior work. The first contribution, on the intrinsic dynamic property, was checked against one candidate with no refutation. The second, on ReLU networks via path-lifting, was checked against three candidates, none of which refuted it. The third, on relaxed balanced initializations for linear networks, was checked against ten candidates, again without refutation. This limited search scope (fourteen papers in total) suggests the analysis captures closely related work but cannot claim exhaustive coverage of all potentially relevant prior art in implicit-bias or conservation-law frameworks.
Based on the top-fourteen semantic matches and the taxonomy structure, the work appears to occupy a genuinely sparse research direction. The absence of sibling papers in its taxonomy leaf and the lack of refuting candidates among examined literature suggest novelty in its specific theoretical framework. However, the limited search scale means broader connections to implicit regularization, geometric optimization, or dynamical systems perspectives outside the examined set remain uncertain.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce the intrinsic dynamic property (Definition 2.6) and establish its relationship to conservation laws. They provide a simple criterion, stated as a kernel inclusion between linear maps (Theorem 2.14), that yields a necessary condition for the property to hold, connecting gradient flows in parameter space to intrinsic flows in the lifted variable space.
The authors prove that for general ReLU networks of arbitrary depth, using the path-lifting reparametrization, the gradient flow can be rewritten as an intrinsic dynamic for a dense set of initializations (Theorem 3.1 and Corollary 3.2). This extends previous results, which were limited to two-layer networks, to arbitrary DAG architectures.
The authors introduce relaxed balanced initializations (Definition 3.4) as a generalization of balanced conditions for linear networks. They prove these initializations satisfy the intrinsic metric property (Theorem 3.6, Theorem 3.9) and show that in certain configurations, these are necessary and sufficient conditions (Theorem 3.7). They also provide explicit intrinsic dynamics for linear neural ODEs under these conditions (Theorem 3.11).
Core Task Comparisons
Comparisons with papers in the same taxonomy category
Contribution Analysis
Detailed comparisons for each claimed contribution
Intrinsic dynamic property and its characterization via conservation laws
The authors introduce the intrinsic dynamic property (Definition 2.6) and establish its relationship to conservation laws. They provide a simple criterion, stated as a kernel inclusion between linear maps (Theorem 2.14), that yields a necessary condition for the property to hold, connecting gradient flows in parameter space to intrinsic flows in the lifted variable space.
[51] A unifying approach to self-organizing systems interacting via conservation laws
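To make the kernel-inclusion criterion concrete, here is a minimal numerical sketch (our own construction and notation, not code from the paper). In the reparametrization setting L(theta) = ell(phi(theta)), gradient flow on theta pushes forward to a flow on z = phi(theta) whose metric Dphi(theta) Dphi(theta)^T must be expressible as a function of z alone for the dynamics to be intrinsic, and conservation laws are what can supply this. The textbook two-layer example phi(u, v) = u*v, with conserved quantity u^2 - v^2, illustrates it:

```python
# Sketch (our notation and example). With z = phi(u, v) = u*v and loss
# L(u, v) = ell(u*v), gradient flow on (u, v) pushes forward to
#   dz/dt = -(u^2 + v^2) * ell'(z),
# which is a closed ODE in z only if u^2 + v^2 is a function of z.
# The conservation law u^2 - v^2 = const supplies this from balanced
# starts (u0 = v0), where u^2 + v^2 = 2*|z| and hence
#   dz/dt = -2*|z| * ell'(z).

def ell_grad(z, target=1.0):
    return z - target                  # ell(z) = 0.5 * (z - target)^2

def parameter_flow(u, v, lr=1e-3, steps=1000):
    for _ in range(steps):             # Euler step on (u, v)
        g = ell_grad(u * v)
        u, v = u - lr * g * v, v - lr * g * u
    return u * v

def intrinsic_flow(z, lr=1e-3, steps=1000):
    for _ in range(steps):             # Euler step on z alone
        z -= lr * 2.0 * abs(z) * ell_grad(z)
    return z

print(parameter_flow(0.5, 0.5), intrinsic_flow(0.25))   # balanced: agree
print(parameter_flow(1.0, 0.25), intrinsic_flow(0.25))  # unbalanced: differ
```

From a balanced start the parameter flow and the candidate intrinsic ODE agree up to discretization error; from an unbalanced start, which carries a different conserved value of u^2 - v^2, the same ODE no longer tracks the flow. This is the sense in which the intrinsic dynamic property hinges on the initialization.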
Intrinsic dynamics for general ReLU networks via path-lifting
The authors prove that for general ReLU networks of arbitrary depth, using the path-lifting reparametrization, the gradient flow can be rewritten as an intrinsic dynamic for a dense set of initializations (Theorem 3.1 and Corollary 3.2). This extends previous results, which were limited to two-layer networks, to arbitrary DAG architectures.
[60] Harnessing symmetries for modern deep learning challenges: a path-lifting perspective
[61] Convexity in ReLU Neural Networks: Beyond ICNNs?
[62] Rethinking Firm Behavior When Financial Markets Are Incomplete: A General Equilibrium Model Enhanced by Artificial …
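To ground the path-lifting construction, the following sketch (our own, for a bias-free feedforward ReLU network, a special case of the paper's DAG setting; the function name path_lifting is ours) builds the lifted vector with one coordinate per input-to-output path and checks that it is invariant under the neuron-wise rescaling symmetry of ReLU networks, the invariance that makes the lifted variable a natural coordinate for the dynamics:

```python
import numpy as np
from itertools import product

def path_lifting(weights):
    """weights: list of layer matrices [W1 (h x d), W2 (k x h), ...].
    Returns one weight product per input-to-output path."""
    dims = [weights[0].shape[1]] + [W.shape[0] for W in weights]
    paths = product(*[range(n) for n in dims])   # one neuron index per layer
    return np.array([
        np.prod([W[p[i + 1], p[i]] for i, W in enumerate(weights)])
        for p in paths
    ])

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))           # 2 inputs -> 3 hidden units
W2 = rng.normal(size=(1, 3))           # 3 hidden units -> 1 output

# Neuron-wise rescaling is the ReLU symmetry: scaling hidden unit j's
# incoming weights by lambda_j and its outgoing weights by 1/lambda_j
# changes the parameters but leaves every path product unchanged.
lam = np.diag([2.0, 0.5, 3.0])
phi_a = path_lifting([W1, W2])
phi_b = path_lifting([lam @ W1, W2 @ np.linalg.inv(lam)])
print(np.allclose(phi_a, phi_b))       # True: Phi is rescaling-invariant
```

Because the lifting quotients out exactly these symmetries, it is plausible for the gradient flow to close in the lifted coordinates even though the parameter trajectory itself depends on the symmetry orbit; the paper's Theorem 3.1 makes this precise for a dense set of initializations.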
Relaxed balanced initializations for linear networks
The authors introduce relaxed balanced initializations (Definition 3.4) as a generalization of balanced conditions for linear networks. They prove these initializations satisfy the intrinsic metric property (Theorem 3.6, Theorem 3.9) and show that in certain configurations, these are necessary and sufficient conditions (Theorem 3.7). They also provide explicit intrinsic dynamics for linear neural ODEs under these conditions (Theorem 3.11).
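As a hedged illustration of the conservation structure behind balancedness (this sketch demonstrates only the classical conserved quantity for a two-layer linear network, not the paper's relaxed Definition 3.4, and all names are ours): along gradient flow on ell(W2 @ W1), the balancedness matrix D = W2.T @ W2 - W1 @ W1.T is conserved, and balanced or relaxed balanced initializations are constraints on D at time zero.

```python
import numpy as np

# Sketch (our construction): check that D = W2.T @ W2 - W1 @ W1.T is
# conserved along (discretized) gradient flow for the two-layer linear
# loss ell(W2 @ W1) = 0.5 * ||W2 @ W1 - T||_F^2.

rng = np.random.default_rng(1)
d = 3
W1 = rng.normal(size=(d, d))
W2 = rng.normal(size=(d, d))
T = rng.normal(size=(d, d))            # hypothetical regression target

def balancedness(W1, W2):
    return W2.T @ W2 - W1 @ W1.T

D0 = balancedness(W1, W2)
lr, steps = 1e-4, 100_000              # Euler discretization of the flow
for _ in range(steps):
    G = W2 @ W1 - T                    # gradient of ell at the product
    W1, W2 = W1 - lr * (W2.T @ G), W2 - lr * (G @ W1.T)

print(np.linalg.norm(W2 @ W1 - T))                # small: product fits T
print(np.linalg.norm(balancedness(W1, W2) - D0))  # ~0 up to Euler error
```

The same calculation extends layer by layer in deeper linear networks, where each D_j = W_{j+1}.T @ W_{j+1} - W_j @ W_j.T is conserved; initial conditions on these invariants are what the balanced and relaxed balanced definitions constrain.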