Diverse and Sparse Mixture-of-Experts for Causal Subgraph–Based Out-of-Distribution Graph Learning
Research Landscape Overview
Claimed Contributions
The authors derive a formal OOD risk bound that decomposes error into coverage and selection terms, proving that semantic diversity among experts (ensuring coverage of causal mechanisms) and instance-level sparsity in gating (enabling correct expert selection) together reduce out-of-distribution generalization error.
The authors propose a practical MoE architecture where experts extract diverse causal subgraphs using a decorrelation regularizer and a learned gating network adaptively selects relevant experts, avoiding the need for environment labels or restrictive causal independence assumptions common in prior work.
The authors introduce an MoE framework specifically designed to handle instance-level heterogeneity in causal subgraphs by allowing multiple experts to generate distinct causal hypotheses, with sparse gating adaptively focusing on the most relevant experts for each input graph.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
Contribution Analysis
Detailed comparisons for each claimed contribution
Theoretical justification for MoE in graph OOD learning via risk bound decomposition
The authors derive a formal OOD risk bound that decomposes error into coverage and selection terms, proving that semantic diversity among experts (ensuring coverage of causal mechanisms) and instance-level sparsity in gating (enabling correct expert selection) together reduce out-of-distribution generalization error.
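The exact bound is in the paper; the following is only a schematic rendering of the coverage/selection split it describes, with the per-expert risk notation and the gate's routing map introduced here purely for illustration.

```latex
% Schematic only, not the authors' exact statement. \mathcal{R}_k(G) is the
% risk of expert k on instance G; \pi(G) is the expert the gate routes G to.
\mathcal{R}_{\mathrm{OOD}}
  \;\le\;
  \underbrace{\mathbb{E}_{G}\!\left[\min_{k}\mathcal{R}_{k}(G)\right]}
            _{\text{coverage: some expert fits each causal mechanism}}
  \;+\;
  \underbrace{\mathbb{E}_{G}\!\left[\mathcal{R}_{\pi(G)}(G)-\min_{k}\mathcal{R}_{k}(G)\right]}
            _{\text{selection: the gate routes to that expert}}
```

Read this way, expert diversity shrinks the coverage term (more causal mechanisms have a matching expert), while gating sparsity shrinks the selection term (a confident gate concentrates its mass on the matching expert).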
[54] Topology-Informed Robust Optimization for Out-of-Distribution Generalization
[55] Sparse Mixture-of-Experts Are Domain Generalizable Learners
[56] Sharp Analysis of Out-of-Distribution Error for “Importance-Weighted” Estimators in the Overparameterized Regime
[57] Not Eliminate but Aggregate: Post-hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding
[58] CBDMoE: Consistent-but-Diverse Mixture of Experts for Domain Generalization
[59] CrossGAP: Unified Face Anti-Spoofing via Cross-Modal Global-Aware Prompting
[60] Mixture Data for Training Cannot Ensure Out-of-Distribution Generalization
[61] Accuracy on the Wrong Line: On the Pitfalls of Noisy Data for OOD Generalisation
[62] Bridging the Theoretical Bound and Deep Algorithms for Open Set Domain Adaptation
[63] HMOE: Hypernetwork-Based Mixture of Experts for Domain Generalization
Causal subgraph-based MoE framework without environment labels or strong causal assumptions
The authors propose a practical MoE architecture where experts extract diverse causal subgraphs using a decorrelation regularizer and a learned gating network adaptively selects relevant experts, avoiding the need for environment labels or restrictive causal independence assumptions common in prior work.
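A minimal PyTorch sketch of the two described components, with every class and function name hypothetical rather than taken from the paper: each expert scores edges into a soft mask (its candidate causal subgraph) and emits a graph embedding, and a decorrelation penalty pushes the expert embeddings apart.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SubgraphExpert(nn.Module):
    """One expert: scores edges into a soft causal-subgraph mask, then
    reads out a graph-level embedding over the masked edges."""
    def __init__(self, node_dim, hidden_dim):
        super().__init__()
        self.edge_scorer = nn.Sequential(
            nn.Linear(2 * node_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1))
        self.encoder = nn.Linear(node_dim, hidden_dim)

    def forward(self, x, edge_index):
        # x: (N, node_dim) node features; edge_index: (2, E) COO edge list.
        src, dst = edge_index
        mask = torch.sigmoid(self.edge_scorer(
            torch.cat([x[src], x[dst]], dim=-1))).squeeze(-1)    # (E,)
        h = self.encoder(x)                                       # (N, hidden)
        # One round of message passing restricted to the masked subgraph.
        agg = torch.zeros_like(h).index_add_(0, dst, mask.unsqueeze(-1) * h[src])
        return (h + agg).mean(dim=0)                              # (hidden,)

def decorrelation_loss(Z):
    # Z: (K, D) embeddings from the K experts for one graph (K >= 2).
    # Penalizes pairwise cosine similarity so experts encode distinct
    # candidate causal subgraphs.
    Zc = F.normalize(Z - Z.mean(dim=1, keepdim=True), dim=1)
    C = Zc @ Zc.t()                                               # (K, K)
    off_diag = C - torch.diag(torch.diag(C))
    return off_diag.pow(2).sum() / (Z.size(0) * (Z.size(0) - 1))
```

The learned gating network described above would consume the expert embeddings (or a shared graph summary) to weight expert outputs per instance; a sparse variant is sketched under the third contribution below.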
[53] Distribution Shift Resilient GNN via Mixture of Aligned Experts
[51] Learning Latent Causal Graphs via Mixture Oracles
[52] Graphing the Truth: Harnessing Causal Insights for Advanced Multimodal Fake News Detection
Novel Mixture-of-Experts framework for modeling heterogeneous causal subgraphs at instance level
The authors introduce an MoE framework specifically designed to handle instance-level heterogeneity in causal subgraphs by allowing multiple experts to generate distinct causal hypotheses, with sparse gating adaptively focusing on the most relevant experts for each input graph.
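A companion sketch of instance-level sparse gating, again with hypothetical names: the gate scores all experts for each input graph but routes through only the top k, so different graphs can rely on different causal hypotheses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseGate(nn.Module):
    """Scores all experts per input graph, keeps only the top-k."""
    def __init__(self, embed_dim, num_experts, k=2):
        super().__init__()
        self.score = nn.Linear(embed_dim, num_experts)
        self.k = k

    def forward(self, g_embed):
        # g_embed: (D,) summary embedding of one input graph.
        logits = self.score(g_embed)                   # (K,)
        top_vals, top_idx = logits.topk(self.k)
        weights = torch.zeros_like(logits)
        weights[top_idx] = F.softmax(top_vals, dim=-1)
        return weights                                 # (K,), zero off top-k

# Usage (shapes): expert_logits is (K, C) class logits from the K experts.
# gate = SparseGate(embed_dim=64, num_experts=4, k=2)
# weights = gate(g_embed)                                          # (K,)
# prediction = (weights.unsqueeze(-1) * expert_logits).sum(dim=0)  # (C,)
```

Hard top-k keeps the combination sparse per instance, and renormalizing with a softmax over only the retained logits keeps the surviving weights summing to one.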