FlyPrompt: Brain-Inspired Random-Expanded Routing with Temporal-Ensemble Experts for General Continual Learning

ICLR 2026 Conference Submission

Anonymous Authors

OpenReview Score: 5.6

Continual Learning, Life-long Learning, Brain-inspired AI, Catastrophic Forgetting, Prompt Tuning

Abstract:

General continual learning (GCL) challenges intelligent systems to learn from single-pass, non-stationary data streams without clear task boundaries. While recent advances in continual parameter-efficient tuning (PET) of pretrained models show promise, they typically rely on multiple training epochs and explicit task cues, limiting their effectiveness in GCL scenarios. Moreover, existing methods often lack targeted design and fail to address two fundamental challenges in continual PET: how to allocate expert parameters to evolving data distributions, and how to improve their representational capacity under limited supervision. Inspired by the fruit fly's hierarchical memory system characterized by sparse expansion and modular ensembles, we propose FlyPrompt, a brain-inspired framework that decomposes GCL into two subproblems: expert routing and expert competence improvement. FlyPrompt introduces a randomly expanded analytic router for instance-level expert activation and a temporal ensemble of output heads to dynamically adapt decision boundaries over time. Extensive theoretical and empirical evaluations demonstrate FlyPrompt's superior performance, achieving up to 11.23%, 12.43%, and 7.62% gains over state-of-the-art baselines on CIFAR-100, ImageNet-R, and CUB-200, respectively.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces FlyPrompt, a brain-inspired framework for general continual learning that decomposes the problem into expert routing and competence improvement. It resides in the 'Brain-Inspired Routing and Ensemble Mechanisms' leaf under 'Architectural Modularity and Expert Systems.' Notably, this leaf contains only the original paper itself—no sibling papers are present. This isolation suggests FlyPrompt occupies a relatively sparse research direction within the broader taxonomy, which encompasses 50 papers across approximately 36 topics, indicating that biologically-inspired routing mechanisms remain underexplored compared to memory-based or optimization-centric approaches.

The taxonomy reveals that FlyPrompt's immediate parent category, 'Architectural Modularity and Expert Systems,' also includes 'Dynamic Expert Allocation' methods, which dynamically create specialized experts without biological inspiration. Neighboring branches such as 'Memory-Based Continual Learning Mechanisms' (with prototype-based and exemplar replay strategies) and 'Optimization and Learning Rate Strategies' represent more crowded research directions. The scope note for the original leaf explicitly excludes non-biologically-inspired modular methods, positioning FlyPrompt at the intersection of neuroscience and continual learning—a niche that distinguishes it from memory-centric approaches like Continual Prototype Evolution and optimization-focused techniques like Gradient Equilibrium.

Among 21 candidates examined, no refutable prior work was identified for any of the three contributions. Ten candidates were examined for the 'FlyPrompt framework for general continual learning' and ten for 'Task-wise Experts with Temporal Ensemble (TE2),' with zero refutations in both cases. Only one candidate was examined for the 'Random Expanded Analytic Router (REAR),' also yielding no refutation. This limited search scope (21 candidates in total, not hundreds) means that, although no overlapping prior work surfaced in the top-K semantic matches, the analysis does not constitute an exhaustive literature review. Within this bounded search, the proposed mechanisms appear distinct from the examined alternatives.

Given the sparse population of the brain-inspired routing leaf and the lack of refutations among 21 examined candidates, FlyPrompt appears to introduce a relatively novel architectural direction within general continual learning. However, the small search scale and the single-paper leaf status underscore that this assessment reflects limited coverage rather than comprehensive field knowledge. The framework's biological inspiration and modular routing design differentiate it from memory and optimization paradigms, but broader validation would require examining additional candidates beyond the top-K semantic neighborhood.

Taxonomy

Research Landscape Overview

Core task: General continual learning from single-pass non-stationary data streams.

The field addresses how models can adapt to evolving data distributions without revisiting past examples, a challenge that arises in real-world deployments where storage and computational constraints prevent replay. The taxonomy reveals a rich landscape organized around complementary strategies: Memory-Based Continual Learning Mechanisms focus on selective retention and replay of representative samples, while Architectural Modularity and Expert Systems explore dynamic routing and ensemble approaches that allocate specialized subnetworks to different data regimes. Probabilistic and Statistical Adaptation Methods emphasize uncertainty quantification and Bayesian updates, whereas Meta-Learning and Transfer Approaches seek to learn how to learn across tasks. Optimization and Learning Rate Strategies tackle the stability-plasticity dilemma through adaptive step sizes, and Semi-Supervised and Label-Scarce Learning addresses scenarios with limited supervision. Application-Specific Continual Learning tailors methods to domains like robotics or smart cities, Distributed and Federated Continual Learning extends these ideas to decentralized settings, and Test-Time and Deployment Adaptation handles shifts encountered during inference. Theoretical Foundations and Analysis provide formal guarantees, while Survey and Benchmark Studies like Wild-time Benchmark[22] and Data Streams Overview[19] consolidate progress and establish evaluation protocols.

Within this ecosystem, a particularly active line of work explores brain-inspired routing and ensemble mechanisms that dynamically select or compose experts as the stream evolves. FlyPrompt[0] exemplifies this direction by drawing on biological principles to route inputs through modular components, aiming to balance specialization and generalization without catastrophic forgetting. This approach contrasts with memory-centric methods like Continual Prototype Evolution[1] and Online Prototype Learning[17], which maintain evolving prototypes to summarize past data, and with optimization-focused strategies such as Gradient Equilibrium[35] that adjust learning dynamics to stabilize updates. FlyPrompt[0] also differs from test-time adaptation techniques like TWIN-ADAPT[14] and Test-time Adaptation Buffering[46], which defer adaptation until deployment rather than continuously updating during training. By situating itself among architectural modularity solutions, FlyPrompt[0] highlights an ongoing debate: whether continual learning is best achieved through clever memory management, adaptive optimization, or intelligent routing of information through specialized pathways.

Claimed Contributions

FlyPrompt framework for general continual learning

10 retrieved papers

The authors introduce FlyPrompt, a biologically inspired framework that addresses general continual learning by decomposing it into expert routing (assigning inputs to appropriate experts) and expert competence improvement (enhancing expert representations under limited supervision). The framework is inspired by the fruit fly's hierarchical memory system.

Random Expanded Analytic Router (REAR)

1 retrieved paper

The authors propose REAR, a routing mechanism that uses fixed random projections and closed-form updates to assign inputs to experts without iterative training. This component mimics the sparse expansion circuits observed in fruit fly olfactory systems and enables efficient expert selection in single-pass learning scenarios.
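To make the idea of a fixed random projection combined with a closed-form update more concrete, the following is a minimal sketch of one way such a router could work. It is not the authors' implementation: the ridge-regression formulation, the ReLU expansion, the class name `RandomExpandedRidgeRouter`, and the parameters `hidden_dim`, `ridge`, and `seed` are all assumptions made for illustration.

```python
import numpy as np


class RandomExpandedRidgeRouter:
    """Hypothetical sketch: random expansion + ridge-regression routing.

    Assumption: the router is a ridge regressor from randomly expanded
    features to one-hot expert indicators. The source only states that
    REAR uses fixed random projections and closed-form updates.
    """

    def __init__(self, in_dim, hidden_dim, n_experts, ridge=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed (never trained) random projection into a wider space.
        self.W = rng.standard_normal((in_dim, hidden_dim)) / np.sqrt(in_dim)
        # Sufficient statistics of ridge regression, accumulated online.
        self.G = ridge * np.eye(hidden_dim)          # H^T H + ridge * I
        self.C = np.zeros((hidden_dim, n_experts))   # H^T Y

    def _expand(self, X):
        # ReLU after the random projection (an assumed nonlinearity).
        return np.maximum(np.asarray(X, dtype=float) @ self.W, 0.0)

    def update(self, X, expert_ids):
        """Single-pass update: add this batch's statistics, no gradients."""
        H = self._expand(X)
        ids = np.asarray(expert_ids, dtype=int)
        Y = np.eye(self.C.shape[1])[ids]             # one-hot expert targets
        self.G += H.T @ H
        self.C += H.T @ Y

    def route(self, X):
        """Pick, per input, the expert with the highest regression score."""
        B = np.linalg.solve(self.G, self.C)          # closed-form solution
        return np.argmax(self._expand(X) @ B, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    first = rng.normal(3.0, 0.1, size=(50, 4))    # data seen with expert 0
    second = rng.normal(-3.0, 0.1, size=(50, 4))  # data seen later, expert 1
    router = RandomExpandedRidgeRouter(in_dim=4, hidden_dim=64, n_experts=2)
    router.update(first, np.zeros(50, dtype=int))
    router.update(second, np.ones(50, dtype=int))
    print(router.route(first[:3]), router.route(second[:3]))
```

In this sketch only the accumulated statistics `G` and `C` are stored, so feeding the stream in one batch or in many small batches yields the same router. That order-independence is the property that makes a closed-form update attractive for single-pass learning.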

Task-wise Experts with Temporal Ensemble (TE2)

10 retrieved papers

The authors introduce TE2, which equips each expert with multiple exponential moving average (EMA) heads at different decay rates to capture knowledge across multiple time scales. This design mirrors the compartmental consolidation in the mushroom body and improves expert robustness under non-stationary data streams.
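As an illustration of what "multiple EMA heads at different decay rates" can look like in practice, here is a minimal sketch for a single expert. It is an assumption-laden toy, not the authors' method: the linear softmax head, the plain averaging of logits across heads, the class name `TemporalEnsembleHead`, and the decay values are all hypothetical choices.

```python
import numpy as np


class TemporalEnsembleHead:
    """Hypothetical sketch: one online linear head plus EMA copies.

    Assumption: each EMA head tracks the online head's weights with its
    own decay rate, and predictions average the logits of all heads.
    """

    def __init__(self, feat_dim, n_classes, decays=(0.9, 0.99)):
        self.online = np.zeros((feat_dim, n_classes))
        self.decays = tuple(decays)
        # One slow-moving copy per decay rate; larger decay = longer memory.
        self.emas = [np.zeros_like(self.online) for _ in self.decays]

    def train_step(self, feats, labels, lr=0.1):
        """One softmax-regression gradient step, then refresh every EMA head."""
        feats = np.asarray(feats, dtype=float)
        labels = np.asarray(labels, dtype=int)
        logits = feats @ self.online
        logits = logits - logits.max(axis=1, keepdims=True)  # stable softmax
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        onehot = np.eye(self.online.shape[1])[labels]
        grad = feats.T @ (probs - onehot) / len(feats)
        self.online -= lr * grad
        for k, decay in enumerate(self.decays):
            self.emas[k] = decay * self.emas[k] + (1.0 - decay) * self.online

    def predict(self, feats):
        """Average the logits of the online head and all EMA heads."""
        feats = np.asarray(feats, dtype=float)
        heads = [self.online, *self.emas]
        logits = np.mean([feats @ W for W in heads], axis=0)
        return logits.argmax(axis=1)


if __name__ == "__main__":
    head = TemporalEnsembleHead(feat_dim=2, n_classes=2)
    feats, labels = np.eye(2), np.array([0, 1])
    for _ in range(20):
        head.train_step(feats, labels)
    print(head.predict(feats))
```

Heads with a small decay follow recent data closely, while heads with a decay near 1 change slowly and retain older decision boundaries. Averaging them is one simple way to combine several time scales; the source does not specify how the heads are combined, so that choice is an assumption here.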

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current TopK core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape, it appears structurally isolated, which is one partial signal of novelty, but still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: FlyPrompt framework for general continual learning

[52] Achieving Deep Continual Learning via Evolution. Verdict: Cannot Refute

[53] Dual-Memory Multi-Modal Learning for Continual Spoken Keyword Spotting with Confidence Selection and Diversity Enhancement. Verdict: Cannot Refute

[54] Experts Collaboration Learning for Continual Multi-Modal Reasoning. Verdict: Cannot Refute

[55] Choose Your Expert: Uncertainty-Guided Expert Selection for Continual Deepfake Detection. Verdict: Cannot Refute

[56] LLM-Guided Decoupled Probabilistic Prompt for Continual Learning in Medical Image Diagnosis. Verdict: Cannot Refute

[57] SPECI: Skill Prompts based Hierarchical Continual Imitation Learning for Robot Manipulation. Verdict: Cannot Refute

[58] Adapt-∞: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection. Verdict: Cannot Refute

[59] Lifelong Learning with Behavior Consolidation for Vehicle Routing. Verdict: Cannot Refute

[60] The design of personal mobile technologies for lifelong learning. Verdict: Cannot Refute

[61] Building digital competence: advancing data-driven culture in the Malaysian construction sector. Verdict: Cannot Refute

Contribution: Random Expanded Analytic Router (REAR)

[51] AdaptForever: Elastic and Mutual Learning for Continuous NLP Task Mastery. Verdict: Cannot Refute

Contribution: Task-wise Experts with Temporal Ensemble (TE2)

[62] FedEMA: Federated Exponential Moving Averaging with Negative Entropy Regularizer in Autonomous Driving. Verdict: Cannot Refute

[63] Weighted ensemble models are strong continual learners. Verdict: Cannot Refute

[64] Improving Online Continual Learning Performance and Stability with Temporal Ensembles. Verdict: Cannot Refute

[65] SR-SAM: Subspace Regularization for Domain Generalization of Segment Anything Model. Verdict: Cannot Refute

[66] Efficient and versatile robust fine-tuning of zero-shot models. Verdict: Cannot Refute

[67] CLASS: Continual learning approach for speech super-resolution. Verdict: Cannot Refute

[68] Parameter-efficient fine-tuning for continual learning: A neural tangent kernel perspective. Verdict: Cannot Refute

[69] Bridging continual learning of motion and self-supervised representations. Verdict: Cannot Refute

[70] REMA: Reinforced Exponential Moving Average for Real-Time Anomaly Detection in Sensor Data. Verdict: Cannot Refute

[71] Fisher Flow: An Information-Geometric Framework for Sequential Estimation. Verdict: Cannot Refute
