Foundation Models for Causal Inference via Prior-Data Fitted Networks

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Causal Inference · Treatment Effect Estimation · Foundation Models
Abstract:

Prior-data fitted networks (PFNs) have recently been proposed as a promising way to train tabular foundation models. PFNs are transformers that are pre-trained on synthetic data generated from a prespecified prior distribution and that enable Bayesian inference through in-context learning. In this paper, we introduce CausalFM, a comprehensive framework for training PFN-based foundation models in various causal inference settings. First, we formalize the construction of Bayesian priors for causal inference based on structural causal models (SCMs) in a principled way and derive necessary criteria for the validity of such priors. Building on this, we propose a novel family of prior distributions using causality-inspired Bayesian neural networks that enable CausalFM to perform Bayesian causal inference in various settings, including back-door, front-door, and instrumental variable adjustment. Finally, we instantiate CausalFM and train our foundation models for estimating conditional average treatment effects (CATEs) for different settings. We show that CausalFM performs competitively for CATE estimation using various synthetic and semi-synthetic benchmarks. In sum, our framework can be used as a general recipe to train foundation models for various causal inference settings. In contrast to the current state-of-the-art in causal inference, CausalFM offers a novel paradigm with the potential to fundamentally change how practitioners perform causal inference in medicine, economics, and other disciplines.
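The abstract's description of PFN pre-training can be made concrete with a toy sketch. The snippet below uses hypothetical names and a deliberately simple random-linear-function prior (not any prior from the paper); it shows only the data side of the recipe: each pre-training example is an entire dataset drawn from the prior, split into a context set the transformer conditions on and query points whose labels it learns to predict.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task(n_context=32, n_query=8, dim=4):
    """Draw one synthetic dataset from a toy prior (random linear
    function plus Gaussian noise); real PFN priors are far richer."""
    w = rng.normal(size=dim)                          # latent "hypothesis"
    X = rng.normal(size=(n_context + n_query, dim))
    y = X @ w + 0.1 * rng.normal(size=n_context + n_query)
    return (X[:n_context], y[:n_context]), (X[n_context:], y[n_context:])

# A PFN is a transformer trained so that, given the context set as input
# tokens, its prediction for each query x approximates the Bayesian
# posterior predictive p(y | x, context) induced by the sampling prior.
(ctx_X, ctx_y), (qry_X, qry_y) = sample_task()
```

Repeating this sampling across millions of tasks is what lets a single trained network perform approximate Bayesian inference on a new dataset purely in-context, with no retraining.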

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes a paper's claimed tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces CausalFM, a framework for training PFN-based foundation models to perform Bayesian causal inference across multiple identification strategies (back-door, front-door, instrumental variables). It resides in the 'Amortized Causal Effect Estimation' leaf, which contains only three papers total, including this one. This is a relatively sparse research direction within the broader taxonomy of 41 papers across 20 leaf nodes, suggesting that PFN-based amortized causal inference remains an emerging area with limited prior work directly addressing the same scope.

The taxonomy reveals that CausalFM sits within a larger branch of 'Causal Inference via PFN-Based Foundation Models' (seven papers across four leaves), which itself is one of four major branches. Neighboring leaves address causal discovery using PFN embeddings and causal fairness applications, while sibling branches explore non-PFN foundation models (LLMs, diffusion models) for causal reasoning and domain-specific integrations. The scope note for the parent branch explicitly focuses on 'methods using PFN architectures to estimate causal effects,' distinguishing this work from general tabular prediction methods and non-PFN causal approaches found elsewhere in the taxonomy.

Among 30 candidates examined, the first contribution (CausalFM framework) shows one refutable candidate out of 10 examined, indicating some overlap with existing PFN-based causal inference work within this limited search scope. The second contribution (formalization of Bayesian priors for causal inference based on SCMs) and third contribution (causality-inspired Bayesian neural network priors) each examined 10 candidates with zero refutations, suggesting these specific technical elements may be more novel. However, the search scale is modest—30 candidates total—so these statistics reflect top-K semantic matches rather than exhaustive coverage of the causal inference literature.

Given the sparse taxonomy leaf (three papers) and limited search scope, CausalFM appears to occupy a relatively underexplored niche at the intersection of PFN architectures and multi-strategy causal inference. The framework-level contribution shows some prior work overlap, while the technical innovations around SCM-based priors and causality-inspired BNN distributions appear less directly anticipated by the examined candidates. A more exhaustive search beyond top-30 semantic matches would be needed to assess whether these elements have precedents in the broader causal inference or Bayesian deep learning communities.

Taxonomy

Core-task Taxonomy Papers: 41
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 1

Research Landscape Overview

Core task: causal inference via foundation models using prior-data fitted networks. The field structure reflects a convergence of modern deep learning architectures with classical causal inference objectives. At the highest level, the taxonomy distinguishes between works that develop PFN-specific architectures and training procedures (such as TabPFN[8] and its extensions), those that apply PFN-based models directly to causal tasks like effect estimation and discovery, and a parallel stream exploring non-PFN foundation models (including large language models like Causal Reasoning LLMs[3] and Causality LLMs[9]) for causal reasoning. Additional branches capture domain-specific integrations—ranging from molecular causality to climate risk—and broader conceptual perspectives on how causality and foundation models interact, as seen in works like Causal Foundation Duality[19] and Causal Attention Duality[21].

Within the PFN-based causal inference branch, a particularly active line focuses on amortized causal effect estimation, where models learn to predict treatment effects in-context without retraining. Foundation Models Causal Inference[0] sits squarely in this cluster, emphasizing rapid, amortized inference for causal queries. Nearby works like CausalPFN[12] and Do-PFN[14] share this amortization theme but may differ in how they handle confounding or interventional distributions. In contrast, other PFN applications target causal discovery (Amortized Causal Discovery[22]) or fairness-aware prediction (FairPFN[23]), illustrating the breadth of causal tasks that PFNs can address.

Meanwhile, non-PFN approaches such as Tabular Foundation Model[1] and Bayesian Tabular Foundation[5] offer alternative pathways for tabular causal inference, trading off the in-context learning speed of PFNs against potentially richer uncertainty quantification or broader applicability.
The central tension across these directions revolves around balancing expressiveness, computational efficiency, and the ability to generalize across diverse causal structures with minimal task-specific tuning.

Claimed Contributions

CausalFM framework for training PFN-based foundation models for causal inference

The authors propose CausalFM, a general framework that enables training prior-data fitted network (PFN) foundation models to perform causal inference across multiple settings including back-door, front-door, and instrumental variable adjustment. This framework allows practitioners to perform causal inference through in-context learning without retraining for each new dataset.

10 retrieved papers (1 can refute)
Formalization of Bayesian priors for causal inference based on structural causal models

The authors provide a principled formalization for constructing Bayesian priors based on structural causal models (SCMs) for causal inference. They derive necessary validity criteria for such priors, including the concept of well-specified priors that ensure consistent estimation of causal queries.

10 retrieved papers
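As a concrete illustration of an SCM-based prior, the sketch below (hypothetical names, a minimal linear back-door SCM with structure X -> T, X -> Y, T -> Y; not the paper's actual construction) first samples random structural coefficients, i.e. a causal model, and then a dataset from that model, so the ground-truth CATE is known for every draw:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_backdoor_scm(n=500):
    """One draw from a toy SCM prior for the back-door setting.
    The coefficients are themselves random, so each call yields a
    different causal model plus a dataset sampled from it."""
    a, b, c = rng.normal(size=3)            # random structural coefficients
    X = rng.normal(size=n)                  # observed confounder
    T = (rng.random(n) < 1 / (1 + np.exp(-a * X))).astype(float)
    Y = b * X + c * T + 0.1 * rng.normal(size=n)
    cate = lambda x: c * np.ones_like(x)    # ground truth: tau(x) = c here
    return X, T, Y, cate
```

Because every sampled dataset comes with its true tau(x), a PFN can be pre-trained to map observational data directly to the causal query; intuitively, this is the sense in which such a prior must be well-specified for consistent estimation.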
Novel family of prior distributions using causality-inspired Bayesian neural networks

The authors introduce a new family of prior distributions that leverage Bayesian neural networks designed to respect the causal structure of the inference problem. These priors enable CausalFM to perform Bayesian causal inference across different settings while providing identifiability guarantees.

10 retrieved papers
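A causality-inspired Bayesian-neural-network prior can be sketched by drawing each structural function as an independent random MLP, with the causal structure enforced by which variables each network receives as input. The snippet below is a toy illustration with hypothetical names for the back-door structure X -> T, X -> Y, T -> Y; the paper's actual prior family is richer:

```python
import numpy as np

rng = np.random.default_rng(2)

def draw_mlp(d_in, width=16):
    """Sample one random MLP f: R^d_in -> R, i.e. a single draw of
    weights from a Gaussian BNN prior with 1/fan_in scaling."""
    W1 = rng.normal(scale=d_in ** -0.5, size=(d_in, width))
    b1 = rng.normal(scale=0.1, size=width)
    W2 = rng.normal(scale=width ** -0.5, size=width)
    return lambda Z: np.tanh(Z @ W1 + b1) @ W2

def sample_bnn_scm(n=200):
    """Each structural function is an independent BNN draw; the causal
    structure is hard-wired by which inputs each network sees."""
    f_t, f_y = draw_mlp(1), draw_mlp(2)
    X = rng.normal(size=(n, 1))
    T = (rng.random(n) < 1 / (1 + np.exp(-f_t(X)))).astype(float)
    Y = f_y(np.column_stack([X[:, 0], T])) + 0.1 * rng.normal(size=n)
    return X, T, Y
```

Swapping which inputs feed each network (e.g., adding an instrument Z that affects only T) would adapt the same recipe to front-door or instrumental-variable settings, in the spirit of the multi-setting framework described above.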

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

CausalFM framework for training PFN-based foundation models for causal inference


Contribution

Formalization of Bayesian priors for causal inference based on structural causal models


Contribution

Novel family of prior distributions using causality-inspired Bayesian neural networks

