Overlap-weighted orthogonal meta-learner for treatment effect estimation over time

ICLR 2026 Conference Submission

Anonymous Authors

OpenReview Score: 7.0

Keywords: causal inference, heterogeneous treatment effects, time-varying treatments, meta-learners, machine learning for healthcare

Abstract:

Estimating heterogeneous treatment effects (HTEs) in time-varying settings is particularly challenging, as the probability of observing certain treatment sequences decreases exponentially with longer prediction horizons. Thus, the observed data contain little support for many plausible treatment sequences, which creates severe overlap problems. Existing meta-learners for the time-varying setting typically assume adequate treatment overlap, and thus suffer from exploding estimation variance when the overlap is low. To address this problem, we introduce a novel overlap-weighted orthogonal (WO) meta-learner for estimating HTEs that targets regions in the observed data with high probability of receiving the interventional treatment sequences. This offers a fully data-driven approach through which our WO-learner can counteract instabilities as in existing meta-learners and thus obtain more reliable HTE estimates. Methodologically, we develop a novel Neyman-orthogonal population risk function that minimizes the overlap-weighted oracle risk. We show that our WO-learner has the favorable property of Neyman-orthogonality, meaning that it is robust against misspecification in the nuisance functions. Further, our WO-learner is fully model-agnostic and can be applied to any machine learning model. Through extensive experiments with both transformer and LSTM backbones, we demonstrate the benefits of our novel WO-learner.
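
The exponential decay described in the abstract can be made concrete with a minimal sketch. It is not taken from the paper. It assumes, for illustration only, that the probability of a treatment sequence factorizes into per-step conditional propensities, and the function names and the value of `p` are hypothetical.

```python
import math


def sequence_probability(step_propensities):
    """Probability of observing a whole treatment sequence.

    Assumes the sequence probability is the product of per-step
    conditional propensities (an illustrative assumption).
    """
    return math.prod(step_propensities)


def inverse_propensity_weight(step_propensities):
    """Inverse-probability weight attached to the same sequence."""
    return 1.0 / sequence_probability(step_propensities)


if __name__ == "__main__":
    p = 0.5  # hypothetical per-step propensity of the interventional treatment
    for horizon in (1, 2, 5, 10):
        props = [p] * horizon
        print(horizon, sequence_probability(props), inverse_propensity_weight(props))
```

With `p = 0.5` and a horizon of 10 steps, the sequence probability is 0.5^10 (about 0.001) and the inverse weight is 1024. This is the kind of growth that makes inverse-weighted estimators unstable under low overlap.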

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (a scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Research Landscape Overview

Core task: heterogeneous treatment effect estimation in time-varying settings. This field addresses the challenge of identifying how treatment effects differ across individuals when both treatments and confounders evolve over time. The taxonomy reveals a rich methodological landscape organized around data structures and modeling paradigms.

Several branches focus on specific temporal data types: Longitudinal Observational Data with Time-Dependent Confounding tackles sequential confounding using methods like g-computation and marginal structural models (Time Varying Confounders[9], Variational Temporal Deconfounder[16]); Time-to-Event and Survival Outcomes extends HTE estimation to censored data (Survite[14], Competing Events HTE[1]); Panel Data and Difference-in-Differences Designs exploits repeated cross-sections or panel structure for policy evaluation (DiD Time Varying[7], TWFE Comparison[31]). Other branches emphasize decision-making frameworks: Dynamic Treatment Regimes and Sequential Decision Rules optimizes multi-stage treatment policies (Dynamic Treatment Regimes[48], Optimal Dynamic Regimes[13]); Mobile Health and N-of-1 Trials addresses personalized interventions in digital health contexts (N of 1 Trials[42], Mobile Device Causal[27]). Meanwhile, Meta-Learners and Outcome Modeling Approaches adapts flexible machine learning architectures to time-varying settings, and the Real-World Evidence branches connect methods to electronic health records and pragmatic trials (Patient Data Treatment Effects[2], Real World Evidence[28]).

A central tension across these branches involves balancing flexible modeling of temporal dynamics against the need for robust causal identification under time-dependent confounding. Many works in the longitudinal observational branch develop neural or variational approaches to capture complex confounder trajectories (Copula CNN LSTM[3], Neural SDEs Trajectories[21]), while panel-based methods leverage fixed effects and parallel trends assumptions for identification (Interactive Fixed Effects[10], High Dimensional Panels[22]).

The original paper, Overlap Weighted Orthogonal[0], sits within the Meta-Learners and Outcome Modeling branch, specifically focusing on overlap-weighted and orthogonal estimators. This positions it alongside flexible outcome-modeling strategies that adapt cross-sectional meta-learner frameworks to temporal data, emphasizing efficient estimation and robustness to model misspecification. Compared to works like ODE Longitudinal HTE[35] that embed mechanistic differential equation models or Temporal Uplift Modeling[50] that target marketing applications, Overlap Weighted Orthogonal[0] appears to prioritize statistical efficiency and doubly-robust properties in settings where overlap and covariate balance vary over time, bridging classical semiparametric theory with modern machine learning tools for time-varying HTE estimation.
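
As background for the "cross-sectional meta-learner frameworks" mentioned above, the sketch below shows the R-learner, a well-known single-time-step orthogonal learner. It is not the WO-learner. Its loss is a weighted regression whose weights `(a - e_hat)^2` average to the overlap weights `e(1 - e)`. The nuisance estimates `m_hat` (outcome regression) and `e_hat` (propensity) are assumed to be given, and the second stage is deliberately the simplest possible model, a constant effect. All names are hypothetical.

```python
def r_learner_pseudo_data(y, a, m_hat, e_hat):
    """Build pseudo-outcomes and weights for the R-learner loss.

    The loss sum_i (a_i - e_i)^2 * ((y_i - m_i) / (a_i - e_i) - tau(x_i))^2
    is a weighted regression of the pseudo-outcome on the covariates.
    Requires 0 < e_i < 1 so that a_i - e_i is never zero for binary a_i.
    """
    pseudo, weights = [], []
    for yi, ai, mi, ei in zip(y, a, m_hat, e_hat):
        treatment_residual = ai - ei
        pseudo.append((yi - mi) / treatment_residual)
        weights.append(treatment_residual ** 2)
    return pseudo, weights


def fit_constant_effect(pseudo, weights):
    """Simplest second stage: a weighted mean, i.e. a constant effect.

    Any regressor that accepts sample weights could be used here instead.
    """
    return sum(w * t for w, t in zip(weights, pseudo)) / sum(weights)


if __name__ == "__main__":
    # Toy data generated as y = m + (a - e) * tau with tau = 2 (hypothetical).
    a = [1, 0, 1, 0]
    e_hat = [0.5, 0.5, 0.9, 0.9]
    m_hat = [1.0, 1.0, 3.0, 3.0]
    y = [m + (ai - e) * 2.0 for ai, e, m in zip(a, e_hat, m_hat)]
    pseudo, weights = r_learner_pseudo_data(y, a, m_hat, e_hat)
    print(fit_constant_effect(pseudo, weights))
```

Units with propensities near 0 or 1 receive small weights on average, which is how this family of learners limits the variance caused by low overlap. The cost is that it targets an overlap-weighted rather than a uniformly weighted risk.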

Claimed Contributions

Overlap-weighted orthogonal meta-learner for time-varying treatment effect estimation

1 retrieved paper

The authors introduce a new meta-learner that addresses low-overlap regimes in time-varying settings by minimizing a novel overlap-weighted oracle risk. This approach provides stable HTE estimates even when the probability of observing treatment sequences is low, which is a common problem with longer prediction horizons.

Neyman-orthogonal population risk function for weighted oracle risk minimization

Can Refute

10 retrieved papers

The authors derive a population risk function that is Neyman-orthogonal with respect to all nuisance functions, meaning it is robust against misspecification in nuisance parameters. This ensures that estimation errors in nuisance functions do not propagate as first-order biases into the final HTE estimate.

Model-agnostic framework applicable to any machine learning backbone

10 retrieved papers

The proposed WO-learner is designed as a general estimation strategy that can be instantiated with different machine learning backbones such as transformers or LSTMs, making it flexible and broadly applicable across different modeling approaches.
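
For reference, the Neyman-orthogonality property invoked in the second contribution above is commonly formalized, for example in the orthogonal statistical learning literature, as a vanishing cross-derivative of the population risk. The notation below is generic and is an assumption of this sketch, not the paper's own.

```latex
% Assumed generic notation: g is the target (HTE) function, \eta collects the
% nuisance functions, \mathcal{L}(g, \eta) is the population risk, and
% (g^*, \eta^*) denote the true target and true nuisances.
D_\eta D_g \, \mathcal{L}(g^*, \eta^*)\,[\, g - g^*,\; \eta - \eta^* \,] = 0
\quad \text{for all admissible } g \text{ and } \eta .
```

Under this condition, small errors in the estimated nuisances perturb the first-order optimality condition for the target only at second order. This is the sense in which nuisance errors do not enter the final estimate as first-order bias.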

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current TopK core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape, it appears structurally isolated, which is one partial signal of novelty, but still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: Overlap-weighted orthogonal meta-learner for time-varying treatment effect estimation

[61] Orthogonal Survival Learners for Estimating Heterogeneous Treatment Effects from Time-to-Event Data (Cannot Refute)

Contribution: Neyman-orthogonal population risk function for weighted oracle risk minimization

[51] Orthogonal statistical learning (Can Refute)
[55] An introduction to double/debiased machine learning (Can Refute)
[60] Double/debiased machine learning for treatment and structural parameters (Can Refute)
[52] Automatic debiased machine learning for smooth functionals of nonparametric m-estimands (Cannot Refute)
[53] Estimation and inference for causal functions with multiway clustered data (Cannot Refute)
[54] Orthogonal random forest for causal inference (Cannot Refute)
[56] GMM with Many Weak Moment Conditions and Nuisance Parameters: General Theory and Applications to Causal Inference (Cannot Refute)
[57] Debiased Maximum Likelihood Estimators of Hazard Ratios Under Kernel-Based Machine Learning Adjustment (Cannot Refute)
[58] Debiased Machine Learning for Unobserved Heterogeneity: High-Dimensional Panels and Measurement Error Models (Cannot Refute)
[59] Semiparametric causal inference for right-censored outcomes with many weak invalid instruments (Cannot Refute)

Contribution: Model-agnostic framework applicable to any machine learning backbone

[62] Meta-learning for estimating multiple treatment effects with imbalance (Cannot Refute)
[63] Differentially private learners for heterogeneous treatment effects (Cannot Refute)
[64] A meta-learner framework to estimate individualized treatment effects for survival outcomes (Cannot Refute)
[65] A meta-learner for heterogeneous effects in difference-in-differences (Cannot Refute)
[66] Black box causal inference: Effect estimation via meta prediction (Cannot Refute)
[67] When causality meets missing data: Fusing key information to bridge causal discovery and imputation in time series via bidirectional meta-learning (Cannot Refute)
[68] Learning to infer counterfactuals: meta-learning for estimating multiple imbalanced treatment effects (Cannot Refute)
[69] Meta-learning for heterogeneous treatment effect estimation with closed-form solvers (Cannot Refute)
[70] Hybrid Meta-learners for Estimating Heterogeneous Treatment Effects (Cannot Refute)
[71] A tutorial introduction to heterogeneous treatment effect estimation with meta-learners (Cannot Refute)
