Mitigating Spurious Correlation via Distributionally Robust Learning with Hierarchical Ambiguity Sets

ICLR 2026 Conference Submission

Anonymous Authors

OpenReview Score: 6.7

Spurious Correlation, Subpopulation Shift, Group Distributionally Robust Optimization, Wasserstein Distributionally Robust Optimization

Abstract:

Conventional supervised learning methods are often vulnerable to spurious correlations, particularly under distribution shifts in test data. To address this issue, several approaches, most notably Group DRO, have been developed. While these methods are highly robust to subpopulation or group shifts, they remain vulnerable to intra-group distributional shifts, which frequently occur in minority groups with limited samples. We propose a hierarchical extension of Group DRO that addresses both inter-group and intra-group uncertainties, providing robustness to distribution shifts at multiple levels. We also introduce new benchmark settings that simulate realistic minority group distribution shifts, an important yet previously underexplored challenge in spurious correlation research. Our method demonstrates strong robustness under these conditions, where existing robust learning methods consistently fail, while also achieving superior performance on standard benchmarks. These results highlight the importance of broadening the ambiguity set to better capture both inter-group and intra-group distributional uncertainties.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a hierarchical extension of Group DRO that addresses both inter-group and intra-group distributional uncertainties, aiming to improve robustness under distribution shifts at multiple levels. It resides in the 'Hierarchical Ambiguity Set and Multi-Granular Decomposition' leaf, which contains only three papers total, including this one. This leaf sits within the broader 'Hierarchical and Multi-Level Robustness Frameworks' branch, indicating a relatively sparse research direction focused on explicit hierarchical uncertainty modeling. The small number of sibling papers suggests this is an emerging rather than crowded area.

The taxonomy reveals several neighboring directions: 'Hierarchical Feature Disentanglement and Representation Learning' (three papers) focuses on separating domain-related from invariant features, while 'Causal Inference and Invariance Learning' pursues stable predictors through causal mechanisms. The paper's hierarchical ambiguity set approach contrasts with causal disentanglement methods that learn invariances end-to-end, and differs from temporal dependency modeling that addresses evolving distributions over time. Its formal optimization framework distinguishes it from more detection-oriented approaches like zero-shot spurious correlation methods in adjacent branches.

Among 24 candidates examined across three contributions, the hierarchical ambiguity set contribution (4 candidates examined) shows no clear refutation, suggesting relative novelty in this specific formulation. However, the tractable minimax optimization algorithm (10 candidates, 2 refutable) and new benchmark settings for minority-group shifts (10 candidates, 1 refutable) encounter more substantial prior work. The limited search scope means these statistics reflect top-K semantic matches rather than exhaustive coverage. The core hierarchical framework appears more distinctive than its algorithmic implementation or evaluation protocols.

Based on the 24-candidate search, the work occupies a sparsely populated research direction with limited direct competition in hierarchical ambiguity sets. The analysis captures semantic neighbors and citation-expanded papers but cannot claim comprehensive field coverage. The hierarchical uncertainty modeling appears relatively novel, while the optimization techniques and benchmarking contributions face more overlap with existing literature within the examined scope.

Taxonomy

Research Landscape Overview

Core task: Mitigating spurious correlations under distribution shifts with hierarchical robustness.

The field addresses the challenge of building models that remain reliable when test distributions differ from training data, particularly when spurious features mislead standard learning. The taxonomy organizes approaches into six main branches: Hierarchical and Multi-Level Robustness Frameworks decompose the problem across granularities or nested uncertainty sets; Causal Inference and Invariance Learning seeks stable predictors by identifying invariant causal mechanisms; Temporal Dependency and Dynamic Distribution Modeling handles shifts that evolve over time; Meta-Learning and Generalization Under Distribution Shifts trains models to adapt quickly across diverse scenarios; Domain-Adaptive Detection and Classification tailors representations to new domains; and Sparse and Flexible Model Design for Robustness emphasizes architectures that avoid overfitting to spurious cues. Together, these branches reflect a spectrum from explicit causal reasoning to adaptive learning strategies, each targeting robustness from a different angle.

Recent work highlights contrasts between methods that impose hierarchical structure and those that learn invariances end-to-end. For instance, Hierarchical Ambiguity Sets[0] and HQD-EM[12] both leverage multi-granular decompositions to manage uncertainty at different levels, offering principled ways to balance worst-case robustness with empirical performance. Meanwhile, Zero-Shot Spurious Correlations[1] explores detecting and mitigating spurious features without retraining, and Graph Causal Ensembles[4] and Causal Disentanglement Generalization[8] pursue causal structures to isolate invariant predictors.

The original paper, Hierarchical Ambiguity Sets[0], sits squarely within the hierarchical robustness branch, emphasizing nested ambiguity sets to systematically address shifts at multiple scales. Compared to neighbors like HQD-EM[12], which also decomposes distributions hierarchically, Hierarchical Ambiguity Sets[0] appears to focus more on formal optimization frameworks, while Zero-Shot Spurious Correlations[1] takes a more detection-oriented stance. This positioning underscores an ongoing tension between structured, theory-driven approaches and flexible, data-driven adaptation strategies.

Claimed Contributions

Hierarchical ambiguity set for distributionally robust optimization

4 retrieved papers

The authors introduce a hierarchical extension of Group DRO that models distributional uncertainty at two levels: inter-group shifts (changes in group proportions) and intra-group shifts (within-group distributional variations). This framework uses a Wasserstein-distance-based formulation to provide robustness to distribution shifts at multiple levels, particularly for minority groups with limited samples.
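
The two levels described above can be written down schematically. The following is a sketch under stated assumptions, not the paper's exact formulation: the symbols $\theta$ (model parameters), $\ell$ (loss), $P_g$ (training distribution of group $g$), $\Delta_G$ (probability simplex over $G$ groups), $W$ (a Wasserstein distance), and $\varepsilon_g$ (per-group radius) are introduced here for illustration only. The first line is the standard Group DRO objective; the second shows one plausible way to nest a Wasserstein ball around each group distribution.

```latex
% Group DRO: worst case over mixtures of the G training group distributions P_g
\min_{\theta} \; \max_{q \in \Delta_G} \; \sum_{g=1}^{G} q_g \,
  \mathbb{E}_{(x,y) \sim P_g}\big[\ell(\theta; x, y)\big]

% Hierarchical sketch (assumed form): each P_g is additionally replaced by
% the worst-case distribution Q_g inside a Wasserstein ball of radius eps_g
\min_{\theta} \; \max_{q \in \Delta_G} \; \sum_{g=1}^{G} q_g \,
  \sup_{Q_g :\, W(Q_g, P_g) \le \varepsilon_g}
  \mathbb{E}_{(x,y) \sim Q_g}\big[\ell(\theta; x, y)\big]
```

In this reading, the outer maximization over $q$ captures inter-group shifts (changes in group proportions), and the inner supremum captures intra-group shifts. Setting every $\varepsilon_g = 0$ recovers the Group DRO objective.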

Tractable minimax optimization algorithm

Can Refute

10 retrieved papers

The authors reformulate the hierarchical DRO problem into a tractable surrogate objective and provide an iterative coordinate-wise training procedure that alternates between updating semantic variables, group weights, and model parameters. This makes the framework computationally feasible for practical applications.
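
To make the idea of an alternating, coordinate-wise procedure concrete, here is a minimal NumPy sketch. It is not the authors' algorithm: the report does not give their update rules, so every choice below is an assumption made for illustration. The function name `train_hierarchical_dro`, the logistic model, the step sizes `lr_w` and `lr_q`, and the radius `eps` are all hypothetical. A norm-bounded input perturbation stands in for the intra-group (Wasserstein) worst case and loosely plays the role of the "semantic variables"; the group weights use an exponentiated-gradient update of the kind commonly paired with Group DRO.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def per_example_loss(w, X, y):
    # Logistic loss for labels y in {0, 1}: log(1 + exp(z)) - y * z.
    z = X @ w
    return np.logaddexp(0.0, z) - y * z

def train_hierarchical_dro(X, y, groups, n_groups,
                           steps=200, lr_w=0.1, lr_q=0.05, eps=0.1):
    """Illustrative alternating updates (assumes every group is non-empty)."""
    w = np.zeros(X.shape[1])
    q = np.full(n_groups, 1.0 / n_groups)  # group weights on the simplex
    for _ in range(steps):
        # (1) Intra-group step: move each input a distance eps in the
        #     direction that increases its loss (surrogate for the
        #     worst case within a small ball around the group data).
        grad_x = (sigmoid(X @ w) - y)[:, None] * w[None, :]
        norms = np.linalg.norm(grad_x, axis=1, keepdims=True)
        X_adv = X + eps * grad_x / np.maximum(norms, 1e-12)

        # (2) Inter-group step: exponentiated-gradient ascent on q,
        #     shifting weight toward groups with higher perturbed loss.
        losses = per_example_loss(w, X_adv, y)
        group_loss = np.array([losses[groups == g].mean()
                               for g in range(n_groups)])
        q = q * np.exp(lr_q * group_loss)
        q /= q.sum()

        # (3) Model step: gradient descent on the q-weighted group losses.
        resid = sigmoid(X_adv @ w) - y
        grad_w = np.zeros_like(w)
        for g in range(n_groups):
            m = groups == g
            grad_w += q[g] * (resid[m][:, None] * X_adv[m]).mean(axis=0)
        w -= lr_w * grad_w
    return w, q
```

Calling `train_hierarchical_dro(X, y, groups, n_groups)` with a feature matrix, binary labels, and integer group ids returns the fitted weights and the final group weights. With `eps=0` the loop reduces to a plain Group DRO style reweighting, which is one way to see what the extra intra-group step adds.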

New benchmark settings for minority-group distribution shifts

Can Refute

10 retrieved papers

The authors construct modified versions of standard benchmarks (CMNIST, Waterbirds, CelebA) that simulate realistic intra-group distributional shifts in minority groups by altering train-test splits. These settings expose a critical failure mode where existing robust learning methods perform poorly, while their proposed method maintains strong performance.

Core Task Comparisons

Comparisons with papers in the same taxonomy category

[1] Mitigating Spurious Correlations in Zero-Shot Multimodal Models

S Lu, J Chai, X Wang (2025)

[12] HQD-EM: Robust VQA Through Hierarchical Question Decomposition Bias Module and Ensemble Adaptive Angular Margin Loss

Seungha Noh, Jae Won Cho, Seong Hyeon Noh, Jae-Won Cho (2025) • Mathematics

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: Hierarchical ambiguity set for distributionally robust optimization

[18] Input uncertainty in stochastic simulation (Cannot Refute)

[19] Group distributionally robust reinforcement learning with hierarchical latent variables (Cannot Refute)

[20] Efficient Algorithms for Empirical Group Distributional Robust Optimization and Beyond (Cannot Refute)

[21] Per-Group Distributionally Robust Optimization (Per-GDRO) with Learnable Ambiguity Set Sizes via Bilevel Optimization (Cannot Refute)

Contribution: Tractable minimax optimization algorithm

[32] Stochastic approximation approaches to group distributionally robust optimization (Can Refute)

[34] Cooperative data-driven distributionally robust optimization (Can Refute)

[33] Discrete approximation scheme in distributionally robust optimization (Cannot Refute)

[35] Efficient operator-splitting minimax algorithm for robust optimization (Cannot Refute)

[36] Nonlinear distributionally robust optimization (Cannot Refute)

[37] Flow-Based Distributionally Robust Optimization (Cannot Refute)

[38] Robust Bond Portfolio Construction via Convex–Concave Saddle Point Optimization (Cannot Refute)

[39] Learning distributionally robust tractable probabilistic models in continuous domains (Cannot Refute)

[40] Distributionally Robust Optimization with Bias and Variance Reduction (Cannot Refute)

[41] Efficient Algorithms for Distributionally Robust Stochastic Optimization with Discrete Scenario Support (Cannot Refute)

Contribution: New benchmark settings for minority-group distribution shifts

[28] Change is hard: A closer look at subpopulation shift (Can Refute)

[22] Distributionally robust losses for latent covariate mixtures (Cannot Refute)

[23] Generative models improve fairness of medical classifiers under distribution shifts (Cannot Refute)

[24] Class-conditional distribution balancing for group robust classification (Cannot Refute)

[25] Group distributionally robust machine learning under group level distributional uncertainty (Cannot Refute)

[26] Improving subgroup robustness via data selection (Cannot Refute)

[27] Robust image representations with counterfactual contrastive learning (Cannot Refute)

[29] Subgroups Matter for Robust Bias Mitigation (Cannot Refute)

[30] Bias and generalizability of foundation models across datasets in breast mammography (Cannot Refute)

[31] R-index: a standardized representativeness metric for benchmarking diversity, equity, and inclusion in biopharmaceutical clinical trial development (Cannot Refute)
