Context parroting: A simple but tough-to-beat baseline for foundation models in scientific machine learning

ICLR 2026 Conference Submission. Anonymous Authors.
Keywords: time series, foundation models, dynamical systems, forecasting, chaos, physics, scientific machine learning
Abstract:

Recent time-series foundation models exhibit strong abilities to predict physical systems. These abilities include zero-shot forecasting, in which a model forecasts future states of a system given only a short trajectory as context, without knowledge of the underlying physics. Here, we show that foundation models often forecast through a simple parroting strategy, and when they are not parroting they exhibit some shared failure modes such as converging to the mean. As a result, a naive context parroting model that copies directly from the context scores higher than leading time-series foundation models on predicting a diverse range of dynamical systems, including low-dimensional chaos, turbulence, coupled oscillators, and electrocardiograms, at a tiny fraction of the computational cost. We draw a parallel between context parroting and induction heads, which explains recent works showing that large language models can often be repurposed for time series forecasting. Our dynamical systems perspective also ties the scaling between forecast accuracy and context length to the fractal dimension of the underlying chaotic attractor, providing insight into previously observed in-context neural scaling laws. By revealing the performance gaps and failure modes of current time-series foundation models, context parroting can guide the design of future foundation models and help identify in-context learning strategies beyond parroting.
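The parroting strategy described above can be illustrated with a minimal sketch: find the window in the context most similar to the most recent observations, then copy what followed it. The function `parrot_forecast`, the window length `k`, and the sine-wave example below are our illustrative choices for exposition, not the paper's implementation.

```python
import numpy as np

def parrot_forecast(context, horizon, k=8):
    """Naive context-parroting forecast (illustrative sketch).

    Find the length-k window in `context` most similar to the final
    k points, then copy the `horizon` values that followed it.
    """
    context = np.asarray(context, dtype=float)
    query = context[-k:]
    best_start, best_dist = 0, np.inf
    # Scan all earlier windows whose continuation lies fully inside the context.
    for start in range(len(context) - k - horizon + 1):
        dist = np.linalg.norm(context[start:start + k] - query)
        if dist < best_dist:
            best_start, best_dist = start, dist
    # Forecast = the values that followed the best-matching motif.
    return context[best_start + k : best_start + k + horizon]

# Example: a (near-)periodic signal, where parroting is close to perfect.
t = np.linspace(0, 20 * np.pi, 2000)
series = np.sin(t)
context, future = series[:1950], series[1950:]
pred = parrot_forecast(context, horizon=50)
```

On periodic or recurrent signals the copied continuation tracks the true future closely, which is precisely why such a trivial baseline can be hard to beat on attractor-bound dynamics.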

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper investigates whether time-series foundation models genuinely learn temporal dynamics or rely on simple context parroting strategies for zero-shot forecasting. It resides in the 'Capability and Limitation Analysis' leaf under 'Evaluation, Benchmarking, and Analysis', alongside four sibling papers examining reasoning abilities, memorization, and generalization gaps. This leaf is moderately populated within a taxonomy of 50 papers, indicating that capability analysis is an active but not overcrowded research direction. The work's focus on failure modes and parroting mechanisms positions it within ongoing debates about what foundation models truly learn versus what they memorize.

The taxonomy reveals several neighboring research directions. The sibling leaf 'Comprehensive Benchmarking Frameworks' proposes unified evaluation protocols, while 'Domain-Specific Evaluation Studies' examines performance on chaotic systems, industrial data, and healthcare applications. Nearby branches include 'Enhancement Mechanisms and Augmentation Strategies', which explore retrieval-augmented and reasoning-based improvements, and 'Foundation Model Architectures', which propose transformer, diffusion, and state-space designs. The paper's analytical stance contrasts with these architecture-focused and application-driven directions, instead probing the fundamental mechanisms underlying zero-shot forecasting success and failure across diverse dynamical systems.

Among the 20 candidates examined across the three contributions, the analysis found limited overlap with prior work. For the 'context parroting baseline' contribution, zero candidates were retrieved, suggesting this specific framing may be novel. For the 'failure modes' contribution, 10 candidates were examined and one potentially refutable paper was identified, indicating some existing work on model limitations but not comprehensive coverage. For the 'fractal dimension scaling laws' contribution, 10 candidates were examined with zero refutations, suggesting this theoretical connection may be relatively unexplored. The modest search scope (20 candidates in total) means these findings reflect top semantic matches rather than exhaustive coverage of the field.

Based on the limited search of 20 semantically similar papers, the work appears to offer fresh perspectives on foundation model mechanisms, particularly the parroting baseline and fractal dimension analysis. However, the single refutable candidate for failure mode analysis suggests some overlap with existing capability studies. The taxonomy context shows this sits in an active evaluation cluster, so broader literature may contain additional relevant work not captured in the top-20 semantic matches.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 20
Refutable Papers: 1

Research Landscape Overview

Core task: zero-shot forecasting of dynamical systems using time-series foundation models. The field has rapidly organized around five main branches. Foundation Model Architectures and Pre-training Strategies explore diverse backbone designs, ranging from decoder-only transformers such as A decoder-only foundation model[1] and Lag-llama[4] to state-space models such as A Mamba Foundation Model[17], together with the large-scale pre-training regimes that enable generalization across unseen domains. Enhancement Mechanisms and Augmentation Strategies investigate techniques such as retrieval-augmented generation (Ts-rag[6]) and diffusion-based paradigms (Generative pre-trained diffusion paradigm[19]) to improve robustness and adaptability.

Evaluation, Benchmarking, and Analysis provides systematic assessments through benchmarks such as Tsfm-bench[7] and studies examining capability boundaries (Uncovering Zero-Shot Generalization Gaps[29], Are Time Series Foundation[21]). Domain-Specific Applications and Adaptations tailor foundation models to specialized contexts such as healthcare (Foundation Models for Clinical[47]), traffic (Zero-Shot Traffic Flow Prediction[46]), and climate (CarbonX[22]). Finally, Efficiency and Optimization Techniques address computational constraints and deployment challenges to ensure scalability in real-world use.

A particularly active line of work focuses on understanding what foundation models truly learn versus what they memorize, with studies such as Measuring Memorization and Generalization[42] and Implicit Reasoning in Deep[45] probing the mechanisms behind zero-shot success. A contrasting direction examines whether models genuinely capture temporal dynamics or rely on simpler heuristics, as explored in Only the curve shape[16] and Are Time-Series Foundation Models[48].
Context parroting[0] sits squarely within the Capability and Limitation Analysis cluster, investigating whether models parrot contextual patterns rather than learning robust forecasting principles. Its emphasis on dissecting failure modes aligns closely with Uncovering Zero-Shot Generalization Gaps[29], which systematically identifies distribution shifts that break generalization, and complements Measuring Memorization and Generalization[42], which quantifies the memorization-generalization trade-off. Together, these works highlight ongoing debates about the true reasoning capabilities of time-series foundation models and the conditions under which zero-shot forecasting remains reliable.

Claimed Contributions

Contribution 1: Context parroting as a simple baseline for zero-shot forecasting

The authors introduce context parroting, a naive nearest-neighbor algorithm that copies matching motifs from the context to make forecasts. This baseline outperforms leading time-series foundation models on dynamical systems while requiring only minimal computation, revealing performance gaps in current models.

Retrieved papers: 0

Contribution 2: Revealing failure modes of time-series foundation models

The authors demonstrate that context parroting surpasses state-of-the-art foundation models (Chronos, TimesFM, Time-MoE, Moirai, DynaMix) in forecasting diverse dynamical systems, exposing shared failure modes such as converging to the mean and an inability to fully utilize context data.

Retrieved papers: 10 (can refute)

Contribution 3: Linking in-context neural scaling laws to fractal dimension

The authors provide a theoretical explanation for the power-law relationship between forecast accuracy and context length observed in foundation models. They connect the scaling exponent to the fractal dimension of chaotic attractors, offering geometric insight into in-context learning.

Retrieved papers: 10
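The claimed link between context length and forecast error can be sketched with a standard nearest-neighbor scaling argument. This is our reconstruction for orientation, not necessarily the paper's derivation; the symbols $\epsilon$ (error), $T$ (context length), and $d$ (fractal dimension) are our notation.

```latex
% For T points sampled from an attractor of fractal dimension d,
% the typical distance from a query point to its nearest neighbor
% scales as
\epsilon(T) \sim T^{-1/d}.
% If parroting error is dominated by the distance to the nearest
% matching motif in the context, the forecast error inherits this
% power law in the context length T:
\log \epsilon(T) \approx -\frac{1}{d}\,\log T + \mathrm{const},
% i.e., the in-context scaling exponent is -1/d, so systems with
% higher-dimensional attractors improve more slowly as context grows.
```

Under this reading, the attractor's fractal dimension sets how quickly a longer context pays off, which is consistent with the geometric framing of the contribution.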

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: Context parroting as a simple baseline for zero-shot forecasting

Contribution: Revealing failure modes of time-series foundation models

Contribution: Linking in-context neural scaling laws to fractal dimension