Panda: A pretrained forecast model for chaotic dynamics

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: chaos, nonlinear dynamics, forecasting, physics, scientific machine learning, dynamical systems
Abstract:

Chaotic systems are intrinsically sensitive to small errors, challenging efforts to construct predictive data-driven models of real-world dynamical systems such as fluid flows or neuronal activity. Prior efforts comprise either specialized models trained separately on individual time series, or foundation models trained on vast time series databases with little underlying dynamical structure. Motivated by dynamical systems theory, we present Panda, Patched Attention for Nonlinear DynAmics. We train Panda on a novel synthetic, extensible dataset of 2 × 10^4 chaotic dynamical systems that we discover using an evolutionary algorithm. Trained purely on simulated data, Panda exhibits emergent properties: zero-shot forecasting of unseen chaotic systems preserving both short-term accuracy and long-term statistics. Despite having been trained only on low-dimensional ordinary differential equations, Panda spontaneously develops the ability to predict partial differential equations without retraining. We also demonstrate a neural scaling law for differential equations, underscoring the potential of pretrained models for probing abstract mathematical domains like nonlinear dynamics.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces Panda, a transformer-based foundation model pretrained on 20,000 chaotic dynamical systems generated via evolutionary algorithms, targeting zero-shot forecasting of unseen chaotic attractors. It resides in the 'Transformer-Based Foundation Models' leaf, which contains four papers total, indicating a moderately populated but not overcrowded research direction. This leaf sits within the broader 'Foundation Models and Pretrained Architectures' branch, reflecting the field's recent shift from system-specific training toward large-scale pretraining for generalization across diverse dynamical regimes.

The taxonomy reveals neighboring leaves focused on recurrent and mixture-of-experts architectures, large language models for dynamics, and foundation model evaluation. Panda's transformer-based approach diverges from recurrent reservoir computing methods (a separate branch under 'Data-Driven Learning Approaches') and from physics-informed neural networks that embed governing equations directly. The scope note for this leaf emphasizes pretrained architectures enabling zero-shot forecasting without system-specific retraining, distinguishing it from hybrid methods that integrate analytical models or from classical nearest-neighbor techniques.

Among the 30 candidates examined, none clearly refute any of the three contributions: the evolutionary dataset generation (10 candidates, 0 refutable), the Panda model itself (10 candidates, 0 refutable), and the dynamics-informed architecture with channel attention (10 candidates, 0 refutable). This suggests that within the limited search scope, the combination of evolutionary dataset construction, transformer-based pretraining for chaotic systems, and the specific architectural choices appears relatively novel. However, the analysis is constrained to top-K semantic matches and does not constitute an exhaustive literature review.

Given the limited search scale and the moderately populated taxonomy leaf, the work appears to occupy a distinct position within transformer-based foundation models for chaotic dynamics. The absence of refutable candidates among 30 examined papers indicates potential novelty, though a broader search might reveal closer prior work in evolutionary algorithm applications or attention mechanisms for time-series forecasting. The taxonomy context suggests the paper contributes to an active but not saturated research direction.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: zero-shot forecasting of chaotic dynamical systems. The field has evolved from classical reservoir computing and fuzzy neural methods toward modern foundation models and hybrid physics-informed architectures.

The taxonomy reveals seven main branches: Foundation Models and Pretrained Architectures emphasize large-scale pretraining on diverse dynamical systems to enable generalization without task-specific tuning; Hybrid and Physics-Informed Methods integrate known governing equations or conservation laws with data-driven learning; Data-Driven Learning Approaches encompass purely neural techniques ranging from recurrent networks to attention mechanisms; Classical and Non-Neural Methods preserve older reservoir computing and genetic algorithm baselines; Specialized Forecasting Challenges address extreme events, long-horizon prediction, and partial observability; Benchmarking, Evaluation, and Theoretical Analysis provide common frameworks and scaling laws; and Domain-Specific Applications target real-world systems such as climate, semiconductor lasers, and origami dynamics.

Representative works like Zero-shot Chaotic Forecasting[1] and True Zero-shot Dynamics[3] illustrate the push toward genuine out-of-distribution generalization, while Hybrid Chaotic Forecasting[2] and Hybrid System Forecasting[7] show how physics constraints can stabilize long-term predictions. A particularly active line of work centers on transformer-based and state-space foundation models that learn universal representations of chaotic attractors. Panda Pretrained Forecast[0] sits squarely in this branch alongside Panda Universal Representation[4] and ChaosNexus Foundation Model[28], all aiming to pretrain on large corpora of simulated trajectories and then forecast unseen systems without retraining.
This contrasts with Evolution Operator Learning[5], which focuses on learning operator mappings rather than sequence-to-sequence transformations, and with physics-informed approaches that embed known structure directly into the loss or architecture. A key trade-off across these directions is whether to rely on massive pretraining data versus incorporating domain knowledge, and whether zero-shot generalization can extend beyond the distribution of training attractors. Panda Pretrained Forecast[0] emphasizes scalable pretraining and transfer, positioning itself close to Panda Universal Representation[4] in methodology but distinct from True Zero-shot Dynamics[3], which explores stricter out-of-distribution scenarios and theoretical guarantees for chaotic regimes.

Claimed Contributions

Evolutionary algorithm for generating novel chaotic dynamical systems dataset

The authors develop an evolutionary search method that discovers approximately 20,000 novel chaotic ordinary differential equations through mutation and recombination of 129 known chaotic systems, creating a large-scale synthetic dataset for training dynamics models.

10 retrieved papers
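The mutate-and-recombine search over known chaotic systems can be illustrated with a minimal sketch. This is not the authors' pipeline: the quadratic polynomial coefficient basis, the fixed-step RK4 integrator, and the blow-up check are illustrative assumptions, and the crucial chaos-acceptance step (e.g., estimating the largest Lyapunov exponent) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def rhs(coeffs, x):
    # Quadratic polynomial vector field in 3D with feature dictionary
    # [1, x, y, z, xy, xz, yz, x^2, y^2, z^2]; coeffs has shape (3, 10).
    x1, x2, x3 = x
    feats = np.array([1.0, x1, x2, x3, x1*x2, x1*x3, x2*x3,
                      x1*x1, x2*x2, x3*x3])
    return coeffs @ feats

def integrate(coeffs, x0, dt=0.01, steps=2000):
    # Fixed-step RK4; returns the trajectory, or None if it blows up.
    traj = np.empty((steps, 3))
    x = np.asarray(x0, float)
    for i in range(steps):
        k1 = rhs(coeffs, x)
        k2 = rhs(coeffs, x + 0.5*dt*k1)
        k3 = rhs(coeffs, x + 0.5*dt*k2)
        k4 = rhs(coeffs, x + dt*k3)
        x = x + dt*(k1 + 2*k2 + 2*k3 + k4)/6
        if not np.all(np.isfinite(x)) or np.max(np.abs(x)) > 1e6:
            return None
        traj[i] = x
    return traj

def mutate(coeffs, scale=0.1):
    # Perturb the vector-field coefficients with Gaussian noise.
    return coeffs + scale*rng.standard_normal(coeffs.shape)

def recombine(a, b):
    # Mix terms from two parent systems coefficient-by-coefficient.
    mask = rng.random(a.shape) < 0.5
    return np.where(mask, a, b)

# Seed the search with the Lorenz system written in the same basis.
lorenz = np.zeros((3, 10))
lorenz[0, 1], lorenz[0, 2] = -10.0, 10.0                      # dx = -10x + 10y
lorenz[1, 1], lorenz[1, 2], lorenz[1, 5] = 28.0, -1.0, -1.0   # dy = 28x - y - xz
lorenz[2, 3], lorenz[2, 4] = -8/3, 1.0                        # dz = -(8/3)z + xy

child = recombine(lorenz, mutate(lorenz))
traj = integrate(child, x0=[1.0, 1.0, 1.0])  # may be None if the child diverges
```

A real pipeline would keep only children with bounded trajectories and a positive largest-Lyapunov estimate, iterating over many generations to accumulate on the order of 20,000 systems.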
Panda: pretrained global forecast model for nonlinear dynamics

The authors introduce Panda, a pretrained transformer-based model trained exclusively on synthetic chaotic trajectories that demonstrates zero-shot forecasting capability on unseen dynamical systems including experimental data from mechanical systems, electronic circuits, and turbulent flows.

10 retrieved papers
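The zero-shot evaluation protocol implied by this contribution (a frozen model receives a context window from an unseen system and is scored on both short-term error and long-term statistics) can be sketched as follows. The function names, window lengths, statistics, and the persistence stand-in for the model are all illustrative assumptions, not the paper's actual evaluation code.

```python
import numpy as np

def evaluate_zero_shot(model, trajectory, context_len=512, horizon=128):
    # Zero-shot protocol: no fine-tuning; the model only sees the context.
    context = trajectory[:context_len]
    target = trajectory[context_len:context_len + horizon]
    pred = model(context, horizon)  # (horizon, dim)

    # Short-term pointwise accuracy.
    mse = float(np.mean((pred - target) ** 2))

    # Crude long-term statistic: per-dimension mean gap between the
    # predicted rollout and the ground-truth continuation.
    stat_gap = float(np.mean(np.abs(pred.mean(0) - target.mean(0))))
    return mse, stat_gap

def persistence_model(context, horizon):
    # Stand-in "model": repeat the last observed state (persistence baseline).
    return np.tile(context[-1], (horizon, 1))

# Toy 3-dimensional trajectory standing in for an unseen system.
traj = np.cumsum(np.random.default_rng(1).standard_normal((1024, 3)), axis=0)
mse, gap = evaluate_zero_shot(persistence_model, traj, 512, 128)
```

The point of the sketch is the split of the score into a short-horizon error and an attractor-statistics term, mirroring the paper's claim that Panda preserves both short-term accuracy and long-term statistics.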
Dynamics-informed architecture with channel attention and kernelized embeddings

The authors design architectural components specifically motivated by dynamical systems theory, including channel attention layers to capture variable coupling, masked language modeling for temporal continuity, and patch embeddings using polynomial and Fourier features inspired by dynamic mode decomposition.

10 retrieved papers
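Two of the architectural ideas named above can be illustrated with a toy numpy sketch: polynomial and Fourier features lift each patch into a richer dictionary (loosely in the spirit of dynamic mode decomposition), and softmax attention computed across channels rather than time captures coupling between variables. The feature choices, dimensions, and absence of learned weights are simplifying assumptions.

```python
import numpy as np

def patch_features(patch, n_freqs=2):
    # Embed one patch (patch_len,) with polynomial and Fourier lifts.
    t = np.linspace(0, 1, len(patch))
    poly = np.stack([patch, patch**2, patch**3])
    four = np.concatenate([[np.sin(2*np.pi*k*t) * patch,
                            np.cos(2*np.pi*k*t) * patch]
                           for k in range(1, n_freqs + 1)])
    return np.concatenate([poly, four]).ravel()

def channel_attention(X):
    # Softmax attention across channels (variables), not time: each
    # channel's embedding attends to every other channel's embedding.
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                  # (channels, channels)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ X

# Toy 3-channel series split into patches of length 16.
series = np.sin(np.linspace(0, 8*np.pi, 64))[None, :] * np.array([[1.0], [0.5], [2.0]])
patch = series[:, :16]                             # (3 channels, 16 samples)
emb = np.stack([patch_features(p) for p in patch]) # (3, feature_dim)
out = channel_attention(emb)                       # mixed across channels
```

In the actual model these embeddings would feed a transformer with learned projections; the sketch only shows why kernelized patch features plus channel-wise attention expose variable coupling that per-channel models miss.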

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution 1: Evolutionary algorithm for generating novel chaotic dynamical systems dataset
Contribution 2: Panda: pretrained global forecast model for nonlinear dynamics
Contribution 3: Dynamics-informed architecture with channel attention and kernelized embeddings

Each contribution is described under Claimed Contributions above; among the 10 retrieved candidate papers per contribution, none was judged refutable.