Are EEG Foundation Models Worth It? Comparative Evaluation with Traditional Decoders in Diverse BCI Tasks

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Foundation Model, Brain–Computer Interface, EEG, Benchmark
Abstract:

Foundation models have recently emerged as a promising approach for learning generalizable EEG representations for brain–computer interfaces (BCIs). Yet, their true advantages over traditional methods—particularly classical non-neural approaches—remain unclear. In this work, we present a comprehensive benchmark of state-of-the-art EEG foundation models, evaluated across diverse datasets, decoding tasks, and six evaluation protocols, with rigorous statistical testing. We introduce spatiotemporal EEGFormer (ST-EEGFormer), a simple yet effective Vision Transformer (ViT)-based baseline, pre-trained solely with masked autoencoding (MAE) on over 8M EEG segments. Our results show that while fine-tuned foundation models perform well in data-rich, population-level settings, they often fail to significantly outperform compact neural networks or even classical non-neural decoders in data-scarce scenarios. Furthermore, linear probing remains consistently weak, and performance varies greatly across downstream tasks, with no clear scaling law observed among neural network decoders. These findings expose a substantial gap between pre-training and downstream fine-tuning, often diminishing the benefits of complex pre-training tasks. We further identify hidden architectural factors that affect performance and emphasize the need for transparent, statistically rigorous evaluation. Overall, this study calls for community-wide efforts to construct large-scale EEG datasets and for fair, reproducible benchmarks to advance EEG foundation models.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholar search engine). It analyzes a paper's tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. The results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs), so the system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases; human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper contributes a comprehensive benchmark of EEG foundation models evaluated across diverse datasets and six evaluation protocols, alongside ST-EEGFormer, a Vision Transformer baseline pre-trained with masked autoencoding on over 8M EEG segments. It resides in the 'Comprehensive Benchmarking Frameworks' leaf, which contains only three papers total, indicating a relatively sparse research direction within the broader taxonomy. This positioning reflects the emerging nature of systematic foundation model evaluation in EEG-based BCIs, where rigorous multi-protocol benchmarking remains uncommon despite growing interest in large-scale pre-training approaches.

The taxonomy reveals substantial activity in adjacent areas: the 'EEG Foundation Model Architectures and Pre-training Strategies' branch contains 16 papers across transformer-based, alternative, and hybrid approaches, while 'Application-Specific Adaptations' includes 13 papers targeting motor imagery, language decoding, and clinical tasks. The 'Comparative Analysis and Performance Assessment' leaf, a sibling category, houses four papers examining foundation model capabilities versus traditional methods. The paper bridges these domains by systematically evaluating architectural innovations from the foundation model branch against classical baselines, addressing the gap between pre-training research and practical deployment concerns highlighted in the comparative analysis cluster.

Among 30 candidates examined, Contribution A (comprehensive benchmark framework) shows one refutable candidate from 10 examined, suggesting some prior benchmarking efforts exist but remain limited in scope. Contribution B (ST-EEGFormer architecture) encountered no refutations across 10 candidates, indicating architectural novelty within the examined sample. Contribution C (empirical findings on foundation model limitations) identified four refutable candidates from 10 examined, reflecting existing discourse on classical baseline competitiveness. The search scale is modest, focusing on top-K semantic matches rather than exhaustive coverage, meaning these statistics characterize the immediate research neighborhood rather than the entire field.

Based on the limited search scope, the work appears to occupy a sparsely populated benchmarking niche while engaging with well-established debates about foundation model utility. The taxonomy structure confirms that systematic multi-protocol evaluation remains underexplored compared to architecture development, though the empirical findings align with emerging skepticism documented in comparative analysis literature. The analysis covers top-30 semantic matches and does not claim exhaustive prior work coverage.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 5

Research Landscape Overview

Core task: Benchmarking EEG foundation models for brain-computer interface decoding tasks. The field has evolved from traditional neural network architectures for EEG decoding toward large-scale foundation models that leverage pre-training on diverse datasets. The taxonomy reflects this shift through several main branches: one focused on EEG Foundation Model Architectures and Pre-training Strategies, where works like Neuro GPT[5] and Large Brain Model[14] explore transformer-based and generative approaches; another on Application-Specific Adaptations and Paradigm-Focused Models, addressing specialized BCI paradigms such as motor imagery or event-related potentials; a branch on Transfer Learning and Cross-Domain Adaptation, examining how models generalize across subjects and tasks; and a branch on Evaluation, Benchmarking, and Comparative Analysis, which systematically assesses model performance. The Traditional Neural Network Architectures for EEG Decoding and Survey and Review Literature branches provide historical context and synthesize emerging trends, with reviews like LLM EEG Survey[17] and Brain Decoding Survey[45] offering broad perspectives.

Recent efforts have concentrated on establishing rigorous evaluation protocols and understanding the practical value of foundation models in real-world BCI scenarios. Works such as Adabrain Bench[1] and Benchmarking ERP Analysis[49] provide structured frameworks for comparing models across multiple decoding tasks, while EEG Foundation Worth[0] sits squarely within this comprehensive benchmarking cluster. Unlike narrower evaluations that focus on single paradigms, EEG Foundation Worth[0] emphasizes systematic assessment of whether foundation models deliver meaningful improvements over task-specific baselines, echoing concerns raised in Adabrain Bench[1] about generalization and calibration efficiency. This contrasts with application-driven studies like Decoding Pain[3], which prioritize domain-specific performance.

The central question across these benchmarking efforts remains whether the computational overhead and data requirements of foundation models justify their adoption, particularly when traditional architectures like EEGNet[29] continue to perform competitively in constrained settings.

Claimed Contributions

Comprehensive benchmark of EEG foundation models with a six-protocol evaluation framework

The authors introduce a systematic evaluation framework spanning six protocols (Population, Per-Subject Self, Per-Subject Transfer, LOO Zero-Shot, LOO Fine-Tune, and LOO Drop) to assess foundation models against classical neural and non-neural decoders across seven classification and two regression tasks, training over 20,000 models in total under rigorous statistical testing. A sketch of the leave-one-out protocols is given below.

10 retrieved papers · Can Refute
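To make the protocol distinctions concrete, the sketch below shows how leave-one-out (LOO) subject splits could be generated. The function name, the synthetic subject IDs, and the comments contrasting Zero-Shot with Fine-Tune semantics are illustrative assumptions (the report does not define LOO Drop), not the authors' implementation.

```python
def loo_splits(subjects):
    """Yield (train_subjects, held_out_subject) pairs for leave-one-out
    (LOO) evaluation: every subject serves as the unseen test subject once."""
    for held_out in subjects:
        yield [s for s in subjects if s != held_out], held_out

subjects = [f"S{i:02d}" for i in range(1, 11)]  # synthetic subject IDs

for train_subjects, held_out in loo_splits(subjects):
    # LOO Zero-Shot: fit the decoder on train_subjects, then evaluate on
    # held_out with no adaptation at all.
    # LOO Fine-Tune: after fitting on train_subjects, adapt on a small
    # labeled calibration set from held_out, then evaluate on the rest.
    # (Population and Per-Subject protocols instead pool all subjects or
    # train and test entirely within one subject, respectively.)
    pass
```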
ST-EEGFormer: Vision Transformer-based foundation model with masked autoencoding

The authors propose ST-EEGFormer, a transparent baseline foundation model built on the Vision Transformer architecture and pre-trained using only masked autoencoding on raw EEG signals from more than 8 million segments, demonstrating that simple pre-training can be effective, contrary to prevailing views. A minimal masking sketch follows below.

10 retrieved papers · No refutations found
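The pre-training recipe named here is plain masked autoencoding: split each EEG segment into per-channel time patches, hide most of them, and train the network to reconstruct the hidden ones. The PyTorch sketch below illustrates that masking-and-loss step; the patch length, masking ratio, and the encoder/decoder interfaces are assumptions for illustration, not the ST-EEGFormer implementation.

```python
import torch

def mae_step(encoder, decoder, eeg, patch_len=50, mask_ratio=0.75):
    """One masked-autoencoding (MAE) step on a batch of raw EEG.
    eeg: (batch, channels, time); patches are per-channel time windows.
    `encoder` and `decoder` are hypothetical callables, not ST-EEGFormer."""
    b, c, t = eeg.shape
    patches = eeg.unfold(2, patch_len, patch_len)           # (b, c, n_t, patch_len)
    patches = patches.reshape(b, -1, patch_len)             # (b, n, patch_len)
    n = patches.shape[1]
    n_keep = int(n * (1 - mask_ratio))                      # visible patch count
    order = torch.rand(b, n, device=eeg.device).argsort(1)  # random patch order
    vis_idx = order[:, :n_keep]
    visible = torch.gather(
        patches, 1, vis_idx.unsqueeze(-1).expand(-1, -1, patch_len))
    latent = encoder(visible, vis_idx)     # encode visible patches + positions
    recon = decoder(latent, order)         # reconstruct all n patches
    mask_idx = order[:, n_keep:].unsqueeze(-1).expand(-1, -1, patch_len)
    target = torch.gather(patches, 1, mask_idx)
    pred = torch.gather(recon, 1, mask_idx)
    return ((pred - target) ** 2).mean()   # MSE only on masked patches
```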
Empirical findings on foundation model limitations and classical baseline competitiveness

The study reveals that foundation models do not universally outperform simpler approaches, particularly in low-data regimes; linear probing remains consistently weak, performance varies greatly across tasks, and no clear scaling law emerges among neural decoders, exposing gaps between pre-training and downstream fine-tuning (a linear-probing sketch follows below).

10 retrieved papers · Can Refute
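For context on the linear-probing finding: linear probing freezes the pre-trained encoder and trains only a linear classifier on its output features, so a weak probe suggests the pre-trained representations are not linearly separable for the downstream task. A minimal sketch of the setup follows; the encoder interface and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

def make_linear_probe(encoder, feat_dim, n_classes, lr=1e-3):
    """Freeze a pre-trained encoder and attach a trainable linear head.
    Only the head receives gradients, so downstream accuracy measures how
    linearly separable the frozen features already are."""
    for p in encoder.parameters():
        p.requires_grad = False       # freeze every encoder weight
    encoder.eval()                    # disable dropout/batch-norm updates
    head = nn.Linear(feat_dim, n_classes)
    optimizer = torch.optim.AdamW(head.parameters(), lr=lr)
    return head, optimizer

# Hypothetical training step, assuming encoder(x) -> (batch, feat_dim):
#   with torch.no_grad():
#       feats = encoder(x)
#   loss = nn.functional.cross_entropy(head(feats), y)
```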

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution A: Comprehensive benchmark of EEG foundation models with a six-protocol evaluation framework (described above).

Contribution B: ST-EEGFormer, a Vision Transformer-based foundation model with masked autoencoding (described above).

Contribution C: Empirical findings on foundation model limitations and classical baseline competitiveness (described above).