ActiveCQ: Active Estimation of Causal Quantities

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Causal Quantities, Active Learning, Uncertainty Quantification
Abstract:

Estimating causal quantities (CQs) typically requires large datasets, which can be expensive to obtain, especially when measuring individual outcomes is costly. This challenge highlights the importance of sample-efficient active learning strategies. To address the narrow focus of prior work on the conditional average treatment effect, we formalize the broader task of Actively estimating Causal Quantities (ActiveCQ) and propose a unified framework for this general problem. Built upon the insight that many CQs are integrals of regression functions, our framework models the regression function with a Gaussian Process. For the distribution component, we explore both a baseline using explicit density estimators and a more integrated method using conditional mean embeddings in a reproducing kernel Hilbert space. This latter approach offers key advantages: it bypasses explicit density estimation, operates within the same function space as the GP, and adaptively refines the distributional model after each update. Our framework enables the principled derivation of acquisition strategies from the CQ's posterior uncertainty; we instantiate this principle with two utility functions based on information gain and total variance reduction. A range of simulated and semi-synthetic experiments demonstrate that our principled framework significantly outperforms relevant baselines, achieving substantial gains in sample efficiency across a variety of CQs.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs), and the system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper formalizes the ActiveCQ task and proposes a unified framework for actively estimating general causal quantities beyond the conditional average treatment effect. It resides in the 'Conditional Average Treatment Effect Estimation' leaf, which contains four papers total (including this one). This leaf sits within the broader 'Active Learning for Treatment Effect Estimation' branch, indicating a moderately populated research direction. The taxonomy reveals that while CATE estimation has received focused attention, the generalization to arbitrary causal quantities represents a less crowded extension of this established area.

The taxonomy tree shows neighboring leaves addressing average treatment effects with adaptive design, network interference settings, and data-efficient observational methods. The paper's position suggests it bridges CATE-focused work with the broader 'Foundational Methods' branch, particularly Bayesian active learning frameworks that handle sequential experimental design. The scope notes clarify that this work diverges from causal structure discovery (a separate major branch with eleven papers across four leaves) and optimal intervention design, instead concentrating on efficient estimation of predefined causal targets through adaptive sampling.

Among twenty-five candidates examined across three contributions, none were found to clearly refute any component. The first contribution (ActiveCQ formalization) examined nine candidates with zero refutations, suggesting limited prior work on this generalized task framing. The second contribution (Gaussian Process with Conditional Mean Embeddings) also examined nine candidates without refutation, indicating potential novelty in this modeling choice for causal quantities. The third contribution (acquisition strategies from posterior uncertainty) examined seven candidates, again with no clear overlaps. These statistics reflect a focused semantic search rather than exhaustive coverage, so undetected prior work remains possible.

Based on the limited search scope of top-twenty-five semantic matches, the work appears to occupy a relatively sparse position at the intersection of CATE estimation and general causal quantity inference. The absence of refutations across all contributions, combined with the taxonomy's structure showing only three sibling papers in the same leaf, suggests the generalization beyond CATE may be underexplored. However, the search scale leaves open the possibility of relevant work in adjacent communities not captured by this analysis.

Taxonomy

Core-task Taxonomy Papers: 37
Claimed Contributions: 3
Contribution Candidate Papers Compared: 25
Refutable Papers: 0

Research Landscape Overview

Core task: active estimation of causal quantities using sample-efficient learning strategies. The field organizes around several complementary branches that address different facets of causal inference under data scarcity. Active Learning for Causal Structure Discovery focuses on efficiently uncovering graphical relationships among variables, often through strategic intervention selection (e.g., Causal Network Structure[2], Bivariate Causal Discovery[37]). Active Learning for Treatment Effect Estimation targets the precise quantification of intervention impacts, including conditional average treatment effects, where methods such as Causal Trees[3] and Causal BALD[13] guide sample allocation to regions of high uncertainty. Active Learning for Optimal Intervention Design emphasizes choosing which variables to manipulate, and at what levels, to maximize information gain or optimize downstream objectives (Optimal Intervention Design[14]). Causal Inference in Reinforcement Learning and Sequential Decision-Making bridges causal reasoning with dynamic environments, while Foundational Methods and Cross-Domain Applications provide theoretical underpinnings and demonstrate utility in domains ranging from biological networks to telecommunications.

Within treatment effect estimation, a central tension emerges between model-based approaches that leverage parametric assumptions for efficiency and nonparametric methods that prioritize flexibility at the cost of sample complexity. Works such as Causal EPIG[1] and Counterfactual Covering[6] exemplify information-theoretic acquisition strategies that balance exploration of covariate space with exploitation of known effect heterogeneity. ActiveCQ[0] situates itself in this landscape by proposing sample-efficient acquisition strategies for general causal quantities rather than CATE alone, sharing methodological kinship with Causal BALD[13] in its use of uncertainty quantification but differing in how it prioritizes queries across subpopulations.

Compared to tree-based partitioning methods such as Causal Trees[3], ActiveCQ[0] appears to emphasize adaptive querying mechanisms that refine estimates iteratively. Open questions persist around scalability to high-dimensional covariates, robustness to model misspecification, and the trade-off between local precision and global coverage in heterogeneous effect landscapes.

Claimed Contributions

Formalization of Active Estimation of Causal Quantities (ActiveCQ) task and unified framework

The authors formalize a new task called ActiveCQ that extends beyond the narrow focus on conditional average treatment effect (CATE) to encompass a broader class of causal quantities. They propose a unified framework that represents diverse causal quantities as integrals of regression functions, enabling systematic treatment of multiple causal estimation problems.

9 retrieved papers
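The "integrals of regression functions" framing can be made concrete. The symbols below are illustrative notation chosen for this report, not the paper's own: a causal quantity Q is an integral of a regression function f against a target distribution π, and specific choices of f and π recover specific quantities.

```latex
% Generic template: a causal quantity as an integral of a regression function
Q \;=\; \int f(x)\, \mathrm{d}\pi(x)

% Example instance: the average treatment effect (ATE), assuming
% f(x, a) = \mathbb{E}[Y \mid X = x, A = a] and covariate distribution p(x)
\mathrm{ATE} \;=\; \int \bigl( f(x, 1) - f(x, 0) \bigr)\, \mathrm{d}p(x)
```

Under this template, a single probabilistic model of f can serve many causal quantities, which is what makes a unified active-learning treatment possible.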
Gaussian Process modeling with Conditional Mean Embeddings for causal quantity estimation

The framework models the regression function using a Gaussian Process and represents the target distribution component via conditional mean embeddings (CMEs) in a reproducing kernel Hilbert space. This approach bypasses explicit density estimation, operates within the same function space as the GP, and adaptively refines the distributional model after each update.

9 retrieved papers
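As a minimal sketch of the modeling recipe described above (not the paper's actual implementation), the following combines a GP posterior over the regression function with mean-embedding weights over integration points, yielding a closed-form Gaussian posterior for the quantity Q. The toy data, the RBF kernel, and the uniform weights `w` are all illustrative assumptions; a conditional mean embedding would supply data-dependent weights in place of `w`.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    # squared-exponential kernel matrix between row-vector inputs
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / ls**2)

rng = np.random.default_rng(0)

# toy observations of the regression function f
X = rng.uniform(-2, 2, size=(20, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(20)

# integration points Z representing the target distribution of the CQ
Z = rng.uniform(-2, 2, size=(50, 1))

# standard GP posterior over f at Z (noise variance 1e-2)
Kxx = rbf(X, X) + 1e-2 * np.eye(len(X))
Kzx = rbf(Z, X)
mean_z = Kzx @ np.linalg.solve(Kxx, y)
cov_z = rbf(Z, Z) - Kzx @ np.linalg.solve(Kxx, Kzx.T)

# embedding weights over Z; uniform here, non-uniform under a CME
w = np.full(len(Z), 1.0 / len(Z))

# Q = sum_j w_j f(z_j) is a linear functional of f, so its posterior
# is Gaussian with the moments below
q_mean = w @ mean_z
q_var = w @ cov_z @ w
print(q_mean, q_var)
```

The key point the sketch illustrates is that because Q is linear in f, the GP posterior over f induces a one-dimensional Gaussian posterior over Q with no extra approximation.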
Principled derivation of acquisition strategies from posterior uncertainty

The authors derive acquisition strategies systematically from the posterior uncertainty of the causal quantity of interest. They instantiate this principle with two utility functions: information gain and total variance reduction, which are expressed in closed-form and automatically tailored to the specific causal quantity being estimated.

7 retrieved papers
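The total-variance-reduction idea admits a short closed-form sketch. Under a GP model, observing a noisy label at candidate point c shrinks the posterior variance of Q by Cov[Q, f(c)]^2 / (Var[f(c)] + noise), so the acquisition rule can score every candidate in one batch of linear algebra. Everything below is an illustrative assumption (toy data, RBF kernel, uniform embedding weights), not the paper's implementation.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    # squared-exponential kernel matrix between row-vector inputs
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d2 / ls**2)

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(10, 1))   # labelled points so far
y = np.sin(X[:, 0])
Z = rng.uniform(-2, 2, size=(40, 1))   # integration points defining Q
C = rng.uniform(-2, 2, size=(30, 1))   # candidate query locations
noise = 1e-2

Kxx_inv = np.linalg.inv(rbf(X, X) + noise * np.eye(len(X)))
w = np.full(len(Z), 1.0 / len(Z))      # embedding weights (uniform here)

# posterior cross-covariance Cov[f(Z), f(C)] and marginal Var[f(c)]
cov_ZC = rbf(Z, C) - rbf(Z, X) @ Kxx_inv @ rbf(X, C)
var_C = np.diag(rbf(C, C) - rbf(C, X) @ Kxx_inv @ rbf(X, C))

# variance reduction in Q from querying each candidate c:
#   (w^T Cov[f(Z), f(c)])^2 / (Var[f(c)] + noise)
utility = (w @ cov_ZC) ** 2 / (var_C + noise)
best = int(np.argmax(utility))
print(C[best], utility[best])
```

Because the utility depends on Cov[Q, f(c)] rather than Var[f(c)] alone, the rule automatically favors points that are informative about the specific causal quantity, which is the tailoring property the contribution describes.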

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Formalization of Active Estimation of Causal Quantities (ActiveCQ) task and unified framework

The authors formalize a new task called ActiveCQ that extends beyond the narrow focus on conditional average treatment effect (CATE) to encompass a broader class of causal quantities. They propose a unified framework that represents diverse causal quantities as integrals of regression functions, enabling systematic treatment of multiple causal estimation problems.

Contribution

Gaussian Process modeling with Conditional Mean Embeddings for causal quantity estimation

The framework models the regression function using a Gaussian Process and represents the target distribution component via conditional mean embeddings (CMEs) in a reproducing kernel Hilbert space. This approach bypasses explicit density estimation, operates within the same function space as the GP, and adaptively refines the distributional model after each update.

Contribution

Principled derivation of acquisition strategies from posterior uncertainty

The authors derive acquisition strategies systematically from the posterior uncertainty of the causal quantity of interest. They instantiate this principle with two utility functions: information gain and total variance reduction, which are expressed in closed-form and automatically tailored to the specific causal quantity being estimated.