FIRE: Frobenius-Isometry Reinitialization for Balancing the Stability–Plasticity Tradeoff
Overview
Overall Novelty Assessment
The paper proposes FIRE, a reinitialization method that balances stability and plasticity in continual learning by solving a constrained optimization problem. It sits within the Dual-Objective Optimization Methods leaf, which contains five papers including the original work. This leaf is part of the broader Algorithmic Approaches branch, indicating a moderately populated research direction focused on explicit multi-objective formulations. The taxonomy shows this is an active area with multiple competing approaches to balancing the stability-plasticity tradeoff.
The taxonomy reveals that Dual-Objective Optimization Methods neighbors several related algorithmic branches: Gradient Space Manipulation (five papers), Representation and Feature Adaptation (three papers), and Dynamic Training Strategies (two papers). These neighboring leaves explore alternative mechanisms for managing the tradeoff—gradient projection, feature diversity, and adaptive learning rates respectively. FIRE's constrained optimization formulation distinguishes it from gradient-based methods while sharing the dual-objective philosophy. The broader Algorithmic Approaches branch contains six subcategories, suggesting diverse technical strategies coexist in this space.
Among thirty candidates examined, the FIRE reinitialization method shows overlap with two prior works, while the Deviation from Isometry plasticity measure overlaps with one. The theoretical connection between Squared Frobenius Error and feature similarity appears more novel, with zero refutable candidates among ten examined. This suggests the core algorithmic contribution faces more substantial prior work than the theoretical analysis. However, the limited search scope means these findings reflect top-thirty semantic matches rather than exhaustive coverage of the continual learning literature.
Based on the constrained search, FIRE appears to occupy a moderately crowded research direction with established dual-objective methods. The theoretical contribution around SFE-feature similarity shows stronger novelty signals within the examined candidates. The analysis covers semantic neighbors and citation-expanded papers but does not claim comprehensive field coverage, leaving open the possibility of additional relevant work beyond the top-thirty matches.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce FIRE (Frobenius-Isometry REinitialization), a method that frames reinitialization as a constrained optimization problem: it minimizes Squared Frobenius Error (SFE) to preserve stability, subject to the constraint that Deviation from Isometry (DfI) equals zero to restore plasticity. The solution is approximated efficiently with Newton-Schulz iteration.
The authors propose DfI as a differentiable, data-independent metric for plasticity that simultaneously addresses loss curvature, neuron dormancy, and feature rank. They provide theoretical proofs showing that minimizing DfI improves these plasticity-related properties.
The authors prove in Theorem 1 that SFE upper-bounds the discrepancy between the normalized feature covariances of two neural networks, establishing that minimizing SFE preserves feature similarity and justifying its use as a stability metric.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[2] Flashbacks to Harmonize Stability and Plasticity in Continual Learning
[6] PromptFusion: Decoupling Stability and Plasticity for Continual Learning
[37] Primal dual continual learning: Balancing stability and plasticity through adaptive memory allocation
[40] Balancing the Stability-Plasticity Dilemma with Online Stability Tuning for Continual Learning
Contribution Analysis
Detailed comparisons for each claimed contribution
FIRE reinitialization method for stability-plasticity tradeoff
The authors introduce FIRE (Frobenius-Isometry REinitialization), a method that frames reinitialization as a constrained optimization problem: it minimizes Squared Frobenius Error (SFE) to preserve stability, subject to the constraint that Deviation from Isometry (DfI) equals zero to restore plasticity. The solution is approximated efficiently with Newton-Schulz iteration.
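The constrained formulation can be sketched numerically. The paper's exact objective, constraint handling, and per-layer details are not reproduced here; the sketch below relies on the standard fact that the Frobenius-nearest isometry to a weight matrix is its orthogonal polar factor, which Newton-Schulz iteration approximates without an explicit SVD. All function names are illustrative assumptions.

```python
import numpy as np

def newton_schulz(W, n_iter=40):
    """Approximate the orthogonal polar factor of W (its Frobenius-nearest
    isometry) via Newton-Schulz iteration, avoiding an explicit SVD."""
    # Scale so all singular values lie in (0, sqrt(3)), the convergence region.
    X = W / np.linalg.norm(W, ord=2)
    for _ in range(n_iter):
        # Each step pushes every singular value toward 1: s -> (3s - s^3) / 2.
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X

def fire_reinit_sketch(W_old, n_iter=40):
    # Hypothetical FIRE step: among all matrices with DfI = 0 (isometries),
    # the polar factor of W_old minimizes the Squared Frobenius Error
    # ||W - W_old||_F^2, so stability is preserved as far as the plasticity
    # constraint allows.
    return newton_schulz(W_old, n_iter)
```

For a square matrix the iterate converges to the orthogonal factor U V^T of the SVD W = U S V^T, which is exactly the SFE-minimizing isometry; the iteration uses only matrix products, which is what makes it cheap on accelerators.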
[1] Loss of plasticity in deep continual learning
[11] Maintaining Plasticity in Deep Continual Learning
[51] Self-normalized resets for plasticity in continual learning
[52] Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks
[53] Reinitializing weights vs units for maintaining plasticity in neural networks
[54] Physics-informed neural networks for solving moving interface flow problems using the level set approach
[55] Sample-efficient LLM Optimization with Reset Replay
[56] Stability of dynamics and memory in the balanced state
[57] Non-stationary learning of neural networks with automatic soft parameter reset
[58] Coordinated reset stimulation of plastic neural networks with spatially dependent synaptic connections
Deviation from Isometry (DfI) as a plasticity measure
The authors propose DfI as a differentiable, data-independent metric for plasticity that simultaneously addresses loss curvature, neuron dormancy, and feature rank. They provide theoretical proofs showing that minimizing DfI improves these plasticity-related properties.
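As a concrete (assumed) instantiation, a natural Deviation-from-Isometry measure is the squared Frobenius distance between a layer's Gram matrix and the identity. The definition below is an illustration of such a metric, not necessarily the paper's exact formula.

```python
import numpy as np

def dfi(W):
    """Illustrative Deviation from Isometry: ||W^T W - I||_F^2.

    Zero exactly when the columns of W are orthonormal (W is an isometry).
    It depends only on the weights, so it is data-independent, and it is a
    polynomial in the entries of W, so it is differentiable. The paper's
    exact definition may differ.
    """
    k = W.shape[1]
    return float(np.sum((W.T @ W - np.eye(k)) ** 2))
```

For example, any orthogonal matrix gives dfi of (numerically) zero, while a uniformly scaled matrix such as 2I has strictly positive DfI, since its singular values are 2 rather than 1.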
[70] Parseval regularization for continual reinforcement learning
[34] Achieving Plasticity-Stability Trade-off in Continual Learning Through Adaptive Orthogonal Projection
[69] Task-aware orthogonal sparse network for exploring shared knowledge in continual learning
[71] Low tensor rank learning of neural dynamics
[72] Optimal routing to cerebellum-like structures
[73] Oja's plasticity rule overcomes challenges of training neural networks under biological constraints
[74] Neural networks and statistical learning
[75] When, where, and how to add new neurons to ANNs
[76] Comparison of Input-Data Matrix Representations Used for Continual Learning with Orthogonal Weight Modification on Edge Devices
[77] Explorations in echo state networks
Theoretical connection between SFE and feature representation similarity
The authors prove in Theorem 1 that SFE upper-bounds the discrepancy between the normalized feature covariances of two neural networks, establishing that minimizing SFE preserves feature similarity and justifying its use as a stability metric.
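The theorem and its constants are not reproduced here, but the qualitative claim can be illustrated numerically for a single linear layer: as the SFE between two weight matrices shrinks, so does the Frobenius gap between their normalized feature covariances. The function names and the choice of Frobenius normalization below are assumptions for illustration only.

```python
import numpy as np

def normalized_feature_cov(W, X):
    # Feature covariance of a linear map, Frobenius-normalized.
    F = X @ W.T
    C = F.T @ F / X.shape[0]
    return C / np.linalg.norm(C)

def sfe_vs_cov_gap(W1, D, X, eps_values):
    # For W2 = W1 + eps * D, return (SFE, covariance-gap) pairs: as the
    # perturbation eps shrinks, both quantities shrink together, consistent
    # with SFE upper-bounding the covariance discrepancy.
    pairs = []
    for eps in eps_values:
        W2 = W1 + eps * D
        sfe = float(np.sum((W1 - W2) ** 2))
        gap = float(np.linalg.norm(
            normalized_feature_cov(W1, X) - normalized_feature_cov(W2, X)))
        pairs.append((sfe, gap))
    return pairs
```

This is only a sanity check of the qualitative relationship on random data, not a verification of the bound itself, whose exact constants depend on the paper's proof.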