IDER: IDEMPOTENT EXPERIENCE REPLAY FOR RELIABLE CONTINUAL LEARNING

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: continual learning, reliability, idempotence
Abstract:

Catastrophic forgetting, the tendency of neural networks to forget previously learned knowledge when learning new tasks, is a major challenge in continual learning (CL). To tackle it, numerous CL methods have been proposed and shown to reduce forgetting. Furthermore, CL models deployed in mission-critical settings benefit from uncertainty awareness: calibrating their predictions lets them reliably assess their own confidence. However, existing uncertainty-aware continual learning methods suffer from high computational overhead and incompatibility with mainstream replay methods. To address this, we propose idempotent experience replay (IDER), a novel approach based on the idempotent property, whereby repeated function applications yield the same output. Specifically, we first adapt the training loss to make the model idempotent on current data streams. In addition, we introduce an idempotence distillation loss: we feed the output of the current model back into the old checkpoint and minimize the distance between this reprocessed output and the original output of the current model. This yields a simple and effective baseline for building reliable continual learners that can be seamlessly integrated with other CL approaches. Extensive experiments on different CL benchmarks demonstrate that IDER consistently improves prediction reliability while simultaneously boosting accuracy and reducing forgetting. Our results suggest that idempotence is a promising principle for deploying efficient and trustworthy continual learning systems in real-world applications. Our code will be released upon publication.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes Idempotent Experience Replay (IDER), a framework that enforces idempotence—where repeated model applications yield consistent outputs—to achieve calibrated predictions in continual learning. It resides in the 'Calibration and Reliability Enhancement' leaf under 'Advanced Continual Learning Paradigms,' which contains only two papers total. This sparse leaf focuses on prediction calibration and confidence alignment beyond accuracy optimization, distinguishing it from the more crowded Bayesian and replay-based branches. The limited sibling count suggests this calibration-centric perspective is an emerging rather than saturated research direction.

The taxonomy tree reveals that most uncertainty-aware continual learning work clusters in Bayesian methods (seven papers across three leaves) and replay-based approaches (seven papers across three leaves). IDER's calibration focus diverges from these mainstream directions: Bayesian methods like Variational Continual Learning prioritize posterior approximation, while replay methods like Uncertainty Reservoir Sampling emphasize sample selection. The 'Uncertainty-Aware Regularization and Distillation' branch (five papers) shares IDER's distillation component but lacks explicit calibration objectives. IDER bridges replay mechanisms with calibration goals, occupying a relatively underexplored intersection in the field structure.

Among thirty candidates examined, none clearly refute any of IDER's three contributions: the overall framework, the standard idempotent module, or the idempotence distillation module. Each contribution was assessed against ten candidates with zero refutable overlaps identified. This suggests that within the limited search scope, the idempotence-based approach to calibration appears distinct from existing replay and distillation methods. The sibling paper on distance-aware temperature scaling addresses calibration through different mechanisms, reinforcing that idempotence as a design principle has not been extensively explored in this context.

Based on the top-thirty semantic matches and taxonomy structure, IDER appears to introduce a novel angle within calibration-focused continual learning. The analysis covers mainstream replay and Bayesian methods but may not capture all distillation variants or recent calibration techniques outside the search scope. The sparse calibration leaf and absence of refutable prior work suggest meaningful novelty, though the limited search scale means comprehensive field coverage cannot be guaranteed.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: uncertainty-aware continual learning with catastrophic forgetting mitigation. The field addresses how neural networks can learn sequentially from new tasks without catastrophically forgetting previously acquired knowledge, while explicitly modeling and leveraging uncertainty to guide this process.

The taxonomy reveals several complementary research directions: Bayesian and Probabilistic Approaches develop principled frameworks for uncertainty quantification through methods like variational inference and neural processes (e.g., Bayesian Continual Learning[1], Neural Processes Continual[20]); Uncertainty-Guided Replay and Memory Management uses uncertainty estimates to select which examples to store or replay (e.g., Uncertainty Reservoir Sampling[25], Uncertainty Dark Replay[26]); Uncertainty-Aware Regularization and Distillation incorporates uncertainty into parameter protection and knowledge transfer mechanisms (e.g., Uncertainty Distillation[5], Adaptive Uncertainty Regularization[9]); Domain-Specific Applications demonstrate these techniques in specialized contexts from medical imaging to robotics; Advanced Continual Learning Paradigms explore meta-learning, federated settings, and calibration challenges; and Theoretical Foundations provide surveys and formal analyses of forgetting phenomena (e.g., Catastrophic Forgetting Survey[15]).

A particularly active tension exists between computationally expensive Bayesian methods that provide rigorous uncertainty estimates and lightweight regularization approaches that approximate uncertainty for practical deployment. Recent work increasingly focuses on calibration and reliability—ensuring that uncertainty estimates themselves remain trustworthy as models encounter new tasks. IDER[0] sits within this calibration-focused branch alongside Distance-Aware Temperature Scaling[41], addressing how prediction confidence degrades during continual learning.
While Predictive Uncertainty Forgetting[2] examines how uncertainty estimates themselves are forgotten across tasks, IDER[0] emphasizes maintaining well-calibrated predictions through task sequences. This contrasts with replay-based methods like Adaptive Prototype Replay[17] that prioritize sample selection, and with purely Bayesian approaches like Bayesian Neural Networks[12] that focus on posterior approximation rather than explicit calibration mechanisms. The calibration perspective represents a growing recognition that uncertainty awareness must extend beyond forgetting mitigation to ensure reliable decision-making throughout the continual learning process.

Claimed Contributions

Idempotent Experience Replay (IDER) framework for continual learning

The authors introduce IDER, a new continual learning method that enforces the idempotent property (f(f(x)) = f(x)) to mitigate catastrophic forgetting and improve prediction reliability. The approach adapts the training loss to make the model idempotent on current data streams and introduces an idempotence distillation loss that feeds the current model's output back into the old checkpoint.

10 retrieved papers
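The framework described above can be sketched as a single training step. This is a speculative reconstruction, not the authors' code: the two-argument model `TwoArgNet` (whose second argument carries a fed-back signal), the zero vector as the "empty" signal, the MSE distance, and the weight `lam` are all assumptions made for illustration.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

class TwoArgNet(nn.Module):
    """Hypothetical model f(x, s): the second argument s carries a
    fed-back signal (previous logits, labels, or an empty vector)."""
    def __init__(self, dim=8, classes=4):
        super().__init__()
        self.fc = nn.Linear(dim + classes, classes)

    def forward(self, x, s):
        return self.fc(torch.cat([x, s], dim=-1))

model = TwoArgNet()
old_model = copy.deepcopy(model).eval()   # frozen previous-task checkpoint
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(16, 8)                    # current-task mini-batch
y = torch.randint(0, 4, (16,))
empty = torch.zeros(16, 4)                # neutral "empty" signal
lam = 0.5                                 # distillation weight (assumed)

out = model(x, empty)
# (1) Standard idempotent loss: supervise the first pass and the model
# applied to its own (detached) prediction, so reapplication is stable.
loss_idem = F.cross_entropy(out, y) + F.cross_entropy(model(x, out.detach()), y)
# (2) Idempotence distillation: reprocess the current output through the
# old checkpoint and pull the two outputs together.
with torch.no_grad():
    reprocessed = old_model(x, out)
loss_dist = F.mse_loss(out, reprocessed)

loss = loss_idem + lam * loss_dist
opt.zero_grad()
loss.backward()
opt.step()
```

In a real continual-learning loop, `old_model` would be the checkpoint saved at the end of the previous task, and `lam` would be tuned per benchmark.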
Standard Idempotent Module for training on current task data

The authors propose a training module that minimizes a loss consisting of two cross-entropy terms to train the model to be idempotent with respect to the second input argument, using either ground-truth labels or a neutral empty signal. This ensures the model maps data to a stable manifold.

10 retrieved papers
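One possible reading of the two cross-entropy terms, assuming the "second input argument" is a class-probability-shaped signal (a one-hot ground-truth vector or an all-zero "empty" vector); the actual signal encoding and architecture are not given in this report, and the linear model below is purely illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim, classes, batch = 8, 4, 16

# Hypothetical two-argument model f(x, s): a single linear map over the
# concatenation of the input and the signal, for illustration only.
W = torch.randn(dim + classes, classes, requires_grad=True)
f = lambda x, s: torch.cat([x, s], dim=-1) @ W

x = torch.randn(batch, dim)
y = torch.randint(0, classes, (batch,))
y_onehot = F.one_hot(y, classes).float()   # ground-truth label signal
empty = torch.zeros(batch, classes)        # neutral "empty" signal

# Two cross-entropy terms, one per choice of the second argument; both
# are supervised by y, so the output is pushed to be stable under either
# signal, i.e. idempotent with respect to the second input argument.
loss = F.cross_entropy(f(x, y_onehot), y) + F.cross_entropy(f(x, empty), y)
loss.backward()
```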
Idempotence Distillation Module for knowledge preservation

The authors develop a distillation mechanism that enforces idempotence between the current model and the previous task checkpoint by minimizing the distance between the current model's prediction and the reprocessed output through the old model. This prevents distribution drift and mitigates recency bias without additional parameters.

10 retrieved papers
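The feed-back-and-compare step can be sketched as follows; the MSE distance, the softmax-normalized outputs, and the linear two-argument models are assumptions made for illustration, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim, classes, batch = 8, 4, 16

# Hypothetical two-argument linear models: f_old is the frozen
# previous-task checkpoint, f_cur is the model being trained.
W_old = torch.randn(dim + classes, classes)
W_cur = W_old.clone().requires_grad_(True)
f_old = lambda x, s: torch.cat([x, s], dim=-1) @ W_old
f_cur = lambda x, s: torch.cat([x, s], dim=-1) @ W_cur

x = torch.randn(batch, dim)
empty = torch.zeros(batch, classes)

p_cur = f_cur(x, empty).softmax(dim=-1)     # current model's prediction
with torch.no_grad():                       # old checkpoint stays frozen
    p_re = f_old(x, p_cur).softmax(dim=-1)  # reprocessed by old model
# Pull the current prediction toward its reprocessed version; gradients
# flow only into the current model, so no extra parameters are added.
loss = F.mse_loss(p_cur, p_re)
loss.backward()
```

Wrapping the old checkpoint's forward pass in `torch.no_grad()` matches the claim that the mechanism adds no parameters: only the current model receives gradients.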

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Idempotent Experience Replay (IDER) framework for continual learning

Contribution

Standard Idempotent Module for training on current task data

Contribution

Idempotence Distillation Module for knowledge preservation
