CauKer: Classification Time Series Foundation Models Can Be Pretrained on Synthetic Data

ICLR 2026 Conference Submission · Anonymous Authors
Time Series Foundation Model · Time Series Classification
Abstract:

Time series foundation models (TSFMs) have recently gained significant attention due to their strong zero-shot capabilities and widespread real-world applications. Such models typically require computationally costly pretraining on large-scale, carefully curated collections of real-world sequences. To enable sample-efficient pretraining of TSFMs, we propose CauKer, a novel algorithm designed to generate diverse, causally coherent synthetic time series with realistic trends, seasonality, and nonlinear interactions. CauKer combines Gaussian Process (GP) kernel composition with Structural Causal Models (SCMs) to produce data for sample-efficient pretraining of state-of-the-art classification TSFMs that span different architectures and pretraining approaches. Additionally, our experiments reveal that CauKer-generated datasets exhibit clear scaling laws for both dataset size (10K to 10M samples) and model capacity (1M to 783M parameters), unlike real-world datasets, which display irregular scaling behavior.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes CauKer, a synthetic data generation pipeline combining Gaussian Process kernel composition with Structural Causal Models to produce causally coherent time series for pretraining classification foundation models. It resides in the 'Causal and Structural Approaches' leaf under 'Synthetic Data Generation Methods', which contains only two papers total. This sparse population suggests the specific combination of causal modeling and kernel composition for TSFM pretraining is relatively underexplored compared to the eight papers in the neighboring 'Deep Generative Models' leaf, indicating a less crowded research direction.

The taxonomy reveals that most synthetic generation work clusters around deep generative methods (GANs, VAEs, diffusion models) or symbolic pairing for multimodal learning, while causal and structural approaches remain a minority. The paper's focus on explicit causal coherence and GP kernels distinguishes it from purely latent-based generators and from symbolic methods that pair data with textual annotations. Neighboring branches like 'Pretraining Strategies' (e.g., Chronos, Lag-Llama) demonstrate large-scale pretraining on diverse synthetic corpora but typically do not emphasize causal structure in data generation, highlighting a methodological divergence.

Among thirty candidates examined, none clearly refuted any of the three contributions: the CauKer pipeline (ten candidates, zero refutable), scaling law demonstration (ten candidates, zero refutable), and sample-efficient pretraining (ten candidates, zero refutable). The single sibling paper in the same taxonomy leaf may address related causal kernel techniques but did not appear as a refuting candidate. This limited search scope suggests that within the examined literature, the specific integration of causal models and GP kernels for TSFM pretraining appears novel, though the analysis does not cover the full breadth of causal time series or synthetic data research.

Based on top-thirty semantic matches and the sparse taxonomy leaf, the work appears to occupy a relatively unexplored niche combining causal structure with kernel-based synthesis for foundation model pretraining. The absence of refuting candidates among examined papers and the small sibling set suggest novelty within the analyzed scope, though a broader search across causal inference or time series generation communities might reveal additional related efforts not captured here.

Taxonomy

- Core-task Taxonomy Papers: 50
- Claimed Contributions: 3
- Contribution Candidate Papers Compared: 30
- Refutable Papers: 0

Research Landscape Overview

Core task: pretraining time series foundation models on synthetic data. The field organizes around four main branches that together address how synthetic data can enable scalable pretraining for time series. Synthetic Data Generation Methods explores diverse techniques, ranging from causal and structural approaches like CauKer[0] to GAN-based and diffusion-based generators, for creating realistic training signals. Pretraining Strategies and Architectures examines how models such as Chronos[7], Lag-Llama[11], and Timer[13] leverage these synthetic datasets alongside architectural choices (transformers, state-space models) to learn transferable representations. Application Domains and Task-Specific Models demonstrates the breadth of downstream uses, from finance (Finance Foundation Models[14]) and healthcare (EEG Classification[27]) to industrial IoT (IoT Synthetic[18]) and epidemic forecasting (Epidemic Forecasting[5]). Finally, Evaluation, Benchmarking, and Analysis investigates critical questions about data quality, zero-shot generalization (Zero-shot Anomaly Detection[2], Zero-shot Imputation[16]), and whether synthetic augmentation truly benefits model performance (Synthesize or Not[10], Synthetic vs Real[34]).

A particularly active line of work contrasts purely data-driven generation (GANs, diffusion models) with methods that encode domain structure or causal mechanisms, trading off flexibility for interpretability and sample efficiency. CauKer[0] sits squarely within the Causal and Structural Approaches cluster, emphasizing how embedding causal knowledge into synthetic data generation can yield more robust pretraining signals than black-box methods. The neighboring entry CauKer[1] also explores causal kernels but may simply be an alternate version of the same underlying work retrieved separately (see the note on duplicate versions in the disclaimer above); human reviewers should disambiguate the two.

Meanwhile, works such as Chronos[7] and Lag-Llama[11] demonstrate that large-scale pretraining on diverse (often purely synthetic) corpora can achieve strong zero-shot performance, raising open questions about when explicit causal modeling is necessary versus when sheer data scale suffices. The interplay between generation fidelity, structural inductive biases, and downstream task alignment remains a central theme across these branches.

Claimed Contributions

CauKer synthetic data generation pipeline for time series classification

The authors introduce CauKer, a synthetic data generation method that combines Gaussian Process kernel composition with Structural Causal Models to produce time series data suitable for pretraining classification foundation models. The approach generates sequences with both realistic temporal patterns and meaningful clustering structure for classification tasks.
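The generation recipe described above can be sketched in a few lines: compose positive semi-definite base kernels into richer covariances, draw root signals from the resulting GPs, and propagate them through a small causal graph. This is an illustrative reconstruction, not the authors' code; the specific kernels, the three-node graph, and the noise scale are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)

def rbf(t, ls):
    """Squared-exponential kernel: smooth trends."""
    d = t[:, None] - t[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def periodic(t, period, ls):
    """Exp-sine-squared kernel: seasonality."""
    d = np.abs(t[:, None] - t[None, :])
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / ls ** 2)

def sample_gp(K):
    """Draw one series from N(0, K); jitter keeps the Cholesky factor stable."""
    L = np.linalg.cholesky(K + 1e-6 * np.eye(len(K)))
    return L @ rng.standard_normal(len(K))

# Kernel composition: sums and products of PSD kernels stay PSD,
# so trend and seasonality components can be mixed freely.
K_trend = rbf(t, ls=0.5)
K_season = periodic(t, period=0.25, ls=1.0)

# Toy structural causal model: x1 and x2 are root causes sampled from GPs;
# x3 depends on both through a nonlinear mechanism plus observation noise.
x1 = sample_gp(K_trend + 0.3 * K_season)
x2 = sample_gp(rbf(t, ls=0.1))
x3 = np.tanh(x1) * x2 + 0.1 * rng.standard_normal(len(t))
```

Varying which kernels are composed and which causal graph is sampled is one plausible way such a pipeline could induce the distinct, clusterable generative mechanisms that classification pretraining needs.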

10 retrieved papers
Demonstration of clear scaling laws for synthetic pretraining data

The authors show that pretraining on CauKer-generated synthetic data reveals consistent scaling laws in both dataset size (10K to 10M samples) and model capacity (1M to 783M parameters), whereas real-world classification datasets exhibit irregular or absent scaling behavior. They present this as the first systematic investigation of scaling laws in zero-shot time series classification.
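A scaling "law" here means error falling as a power of dataset size. A minimal way to test for one is a linear fit in log-log space; the numbers below are made up purely to illustrate the procedure and are not the paper's results.

```python
import numpy as np

# Hypothetical error rates at increasing pretraining-set sizes (10K to 10M).
n = np.array([1e4, 1e5, 1e6, 1e7])
err = np.array([0.40, 0.31, 0.24, 0.19])

# A clean scaling law means err ~ a * n**(-b), i.e. a straight line in log-log space.
slope, log_a = np.polyfit(np.log(n), np.log(err), 1)
b = -slope  # positive exponent: error shrinks as a power of n
print(f"fitted exponent: {b:.3f}")

# R^2 of the log-log fit indicates how "clean" the scaling law is;
# irregular scaling (as reported for real-world data) would show up as low R^2.
pred = log_a + slope * np.log(n)
ss_res = np.sum((np.log(err) - pred) ** 2)
ss_tot = np.sum((np.log(err) - np.log(err).mean()) ** 2)
print(f"log-log R^2: {1 - ss_res / ss_tot:.3f}")
```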

10 retrieved papers
Sample-efficient pretraining achieving state-of-the-art classification performance

The authors demonstrate that time series foundation models pretrained exclusively on CauKer-generated synthetic data can match or nearly match the performance of models trained on much larger real-world datasets, achieving competitive state-of-the-art results while being significantly more sample-efficient.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution
