Distributionally Robust Linear Regression with Block Lewis Weights

ICLR 2026 Conference Submission, Anonymous Authors
Keywords: distributionally robust optimization, linear regression, acceleration, convex geometry
Abstract:

We present an algorithm for the empirical group distributionally robust (GDR) least squares problem. Given $m$ groups, a parameter vector in $\mathbb{R}^d$, and stacked design matrices and responses $\mathbf{A}$ and $\bm{b}$, our algorithm obtains a $(1+\varepsilon)$-multiplicative optimal solution using $\widetilde{O}(\min\{\mathsf{rank}(\mathbf{A}), m\}^{1/3}\varepsilon^{-2/3})$ linear-system solves of matrices of the form $\mathbf{A}^{\top}\mathbf{B}\mathbf{A}$ for block-diagonal $\mathbf{B}$. Our technical methods build on a recent geometric construction, block Lewis weights, which relates the empirical GDR problem to a carefully chosen least squares problem, combined with an application of accelerated proximal methods. Our algorithm improves over known interior point methods in moderate-accuracy regimes and matches the state-of-the-art guarantees for the special case of $\ell_{\infty}$ regression. We also give algorithms that smoothly interpolate between minimizing the average least squares loss and the distributionally robust loss.
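To make the objective concrete, the following is a minimal NumPy sketch of the empirical GDR least squares loss (the worst squared error over groups) and of the solver's basic primitive, a linear-system solve in $\mathbf{A}^{\top}\mathbf{B}\mathbf{A}$ for block-diagonal $\mathbf{B}$. The group sizes, random data, and the identity choice of $\mathbf{B}$ are illustrative assumptions; this is not the paper's algorithm, only the objective and primitive it works with.

```python
import numpy as np

rng = np.random.default_rng(0)

# m groups, each with its own design matrix A_g and response vector b_g.
m, d = 4, 3
groups = [(rng.standard_normal((5, d)), rng.standard_normal(5)) for _ in range(m)]

def gdr_objective(x):
    """Empirical group-DRO least squares loss: worst squared error over groups."""
    return max(np.sum((A_g @ x - b_g) ** 2) for A_g, b_g in groups)

# The solver's counted primitive: a linear-system solve in A^T B A, where B is
# block-diagonal with one block per group. With B = I this reduces to ordinary
# least squares on the stacked system (the "average loss" solution).
A = np.vstack([A_g for A_g, _ in groups])
b = np.concatenate([b_g for _, b_g in groups])
B = np.eye(A.shape[0])              # block-diagonal reweighting (identity here)
x_ls = np.linalg.solve(A.T @ B @ A, A.T @ B @ b)

print(gdr_objective(x_ls))          # worst-group loss of the average-loss solution
```

Reweighting the diagonal blocks of `B` and re-solving is exactly the operation whose count the stated $\widetilde{O}(\min\{\mathsf{rank}(\mathbf{A}), m\}^{1/3}\varepsilon^{-2/3})$ bound controls.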

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper contributes an algorithm for empirical group distributionally robust least squares that achieves approximate optimality using a number of linear-system solves dependent on rank or group count and accuracy. It resides in the 'Specialized Optimization Techniques' leaf under 'Algorithmic Approaches and Computational Methods', which contains only two papers total. This leaf represents a relatively sparse research direction focused on novel optimization methods beyond standard stochastic gradient approaches, suggesting the paper enters a less crowded algorithmic niche within the broader group-DRO landscape.

The taxonomy reveals that most algorithmic work clusters in 'Stochastic and Variance-Reduced Algorithms' (three papers), while the paper's leaf emphasizes structural exploitation through block Lewis weights and proximal methods. Neighboring branches include 'Core Group Distributionally Robust Optimization Frameworks' with multiple formulation-focused papers and 'Regularization and Penalty-Based Robust Methods' addressing robustness through penalties. The scope note for the paper's leaf explicitly excludes standard stochastic gradient methods, positioning this work as an alternative computational pathway that leverages geometric constructions rather than variance reduction or sampling strategies.

Among the seventeen candidates examined, the contribution-level analysis shows mixed novelty signals. For the core group-DRO least squares algorithm, six candidates were examined and none provided a clear refutation, suggesting this specific solver design may be relatively fresh. For the block Lewis weights construction, only one candidate was examined, again without refutation, indicating limited prior work in this geometric direction. However, for the interpolation algorithm between the average and robust objectives, ten candidates were examined and one refutable match was found, implying this aspect overlaps more substantially with existing methods for balancing robustness and average performance.

Based on the limited search scope of seventeen semantically similar papers, the work appears to introduce a distinct algorithmic approach in a sparsely populated taxonomy leaf, though the interpolation component shows clearer precedent. The analysis does not cover the full literature on leverage-score sampling or proximal methods outside the group-DRO context, so the geometric construction's novelty may extend beyond what these top-K matches reveal.

Taxonomy

38 Core-task Taxonomy Papers
3 Claimed Contributions
17 Contribution Candidate Papers Compared
1 Refutable Paper

Research Landscape Overview

Core task: Distributionally robust linear regression with multiple groups seeks to learn models that perform well across heterogeneous subpopulations, guarding against worst-case group performance rather than optimizing average loss. The field's structure reflects several complementary perspectives. Core Group Distributionally Robust Optimization Frameworks establish the foundational min-max formulations and uncertainty sets, while Algorithmic Approaches and Computational Methods develop practical solvers, ranging from stochastic gradient techniques to specialized optimization routines that exploit problem structure. Regularization and Penalty-Based Robust Methods incorporate shrinkage or group-specific penalties to stabilize estimates, and Theoretical Foundations provide statistical guarantees on convergence and generalization.

Multi-Source and Transfer Learning Extensions address settings where data arrive from distinct domains or tasks, and Distributed and Federated Robust Learning tackles scenarios with decentralized data. Application-Specific Robust Learning branches into domains such as genomics and fairness-aware prediction, while Alternative Robust Regression Approaches explore non-group-based robustness paradigms.

Recent work has intensified around efficient algorithms that scale to many groups and high dimensions, balancing computational cost with strong worst-group guarantees. For instance, Efficient Group DRO[4] and Near-Optimal Group DRO[33] push the frontier of sample complexity and runtime, while methods like Probabilistic Group DRO[10] and Group Level Uncertainty[3] incorporate uncertainty quantification over group memberships. Block Lewis Weights[0] sits within the Specialized Optimization Techniques cluster, proposing a novel leverage-score-based sampling scheme to accelerate robust regression solvers. This approach contrasts with gradient-based methods such as Multi-Fractional Gradient[31], which also targets computational efficiency but through fractional-order updates. By exploiting block structure in the design matrix, Block Lewis Weights[0] offers a distinct algorithmic angle that complements existing stochastic and bilevel optimization strategies, addressing the persistent challenge of scaling robust group-aware regression to large, structured datasets.

Claimed Contributions

Algorithm for empirical group distributionally robust least squares

The authors introduce an algorithm that solves the group distributionally robust regression problem with iteration complexity depending on the minimum of the matrix rank and the number of groups, improving over standard interior point methods in moderate-accuracy regimes and matching the state of the art for ℓ∞ regression.

6 retrieved papers
Block Lewis weights for distributionally robust optimization

The authors employ block Lewis weights, a geometric construction, to relate the distributionally robust problem to a structured least squares problem. This enables the design of their algorithm with improved iteration complexity by choosing an appropriate geometry for proximal subproblems.

1 retrieved paper
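For background on the geometric construction this contribution generalizes: classical ℓ_p Lewis weights assign one weight per row of the design matrix via a fixed-point condition, and the block variant extends this to groups of rows. Below is a minimal sketch of the standard single-row fixed-point iteration, which converges for 0 < p < 4; the matrix, the choice p = 1, and the iteration count are illustrative assumptions, and this is not the paper's block construction.

```python
import numpy as np

def lewis_weights(A, p, iters=200):
    """Fixed-point iteration for the l_p Lewis weights of the rows of A.

    At the fixed point, w_i = (a_i^T (A^T W^{1-2/p} A)^{-1} a_i)^{p/2},
    where W = diag(w). The iteration converges for 0 < p < 4.
    """
    n, d = A.shape
    w = np.ones(n)
    for _ in range(iters):
        M = A.T @ (w[:, None] ** (1 - 2 / p) * A)             # A^T W^{1-2/p} A
        q = np.einsum("ij,jk,ik->i", A, np.linalg.inv(M), A)  # a_i^T M^{-1} a_i
        w = q ** (p / 2)
    return w

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 3))
w = lewis_weights(A, p=1)
print(w.sum())   # at the fixed point the weights sum to d (= 3 here)
```

At the fixed point each weight equals a leverage score of the reweighted matrix, which is why the weights sum to the rank; this is the property that makes such weights useful for choosing the geometry of proximal subproblems.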
Algorithm for interpolating between average and robust objectives

The authors provide an algorithm (Algorithm 5, Theorem 2) that optimizes a family of objectives parameterized by p that interpolate between the utilitarian (average loss) and egalitarian (robust) approaches, allowing smooth trade-offs between utility and robustness.

10 retrieved papers
Can Refute
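One standard way to realize such a p-parameterized family, and a plausible reading of this contribution (the exact normalization used by the paper's Algorithm 5 is not specified here, so this is an illustrative assumption), is the power mean of the per-group losses: p = 1 recovers the average (utilitarian) loss, and p → ∞ approaches the worst-group (egalitarian) loss.

```python
import numpy as np

def power_mean(losses, p):
    """Power mean ((1/m) * sum_g losses_g^p)^(1/p) over per-group losses.

    p = 1 gives the plain average; as p grows the value approaches the
    maximum loss. Rescaling by the max keeps large p numerically stable.
    """
    losses = np.asarray(losses, dtype=float)
    top = losses.max()
    return top * np.mean((losses / top) ** p) ** (1 / p)

losses = np.array([1.0, 2.0, 3.0, 10.0])
print(power_mean(losses, 1))     # the plain average, 4.0
print(power_mean(losses, 200))   # close to the max loss, 10.0
```

The power mean is monotone in p and always bounded above by the maximum, so sweeping p traces a smooth trade-off curve between average utility and worst-group robustness.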

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution 1: Algorithm for empirical group distributionally robust least squares

Contribution 2: Block Lewis weights for distributionally robust optimization

Contribution 3: Algorithm for interpolating between average and robust objectives