A Block Coordinate Descent Method for Nonsmooth Composite Optimization under Orthogonality Constraints

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Orthogonality Constraints; Nonconvex Optimization; Nonsmooth Composite Optimization; Block Coordinate Descent; Convergence Analysis
Abstract:

Nonsmooth composite optimization with orthogonality constraints has a wide range of applications in statistical learning and data science. However, this problem is challenging due to its nonsmooth objective and computationally expensive, non-convex constraints. In this paper, we propose a new approach called OBCD, which leverages Block Coordinate Descent to address these challenges. OBCD is a feasible method with a small computational footprint. In each iteration, it updates $k$ rows of the solution matrix, where $k \geq 2$, by globally solving a small nonsmooth optimization problem under orthogonality constraints. We prove that the limiting points of OBCD, referred to as (global) block-$k$ stationary points, offer stronger optimality than standard critical points. Furthermore, we show that OBCD converges to $\epsilon$-block-$k$ stationary points with an iteration complexity of $\mathcal{O}(1/\epsilon)$. Additionally, under the Kurdyka-Łojasiewicz (KL) inequality, we establish the non-ergodic convergence rate of OBCD. We also demonstrate how novel breakpoint search methods can be used to solve the subproblem in OBCD. Empirical results show that our approach consistently outperforms existing methods.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes OBCD, a block coordinate descent algorithm for nonsmooth composite optimization under orthogonality constraints, updating k rows per iteration by solving small subproblems. According to the taxonomy, it resides in the 'Block Coordinate and Specialized Descent Methods' leaf, which contains only three papers total. This places the work in a relatively sparse research direction compared to the more crowded 'ADMM-Based Approaches' (four papers) and 'Proximal Gradient Methods' (two papers) within the broader 'Proximal and Splitting Methods' branch, suggesting the coordinate descent angle is less explored in this field.

The taxonomy tree shows that neighboring leaves include ADMM-based and proximal gradient methods under the 'Proximal and Splitting Methods' branch, as well as smoothing techniques in a sibling branch. OBCD diverges from ADMM frameworks by avoiding alternating-direction splitting and instead partitioning variables row-wise, maintaining feasibility throughout. It also differs from smoothing approaches, which replace nonsmooth terms with differentiable surrogates, by handling nonsmoothness directly in each subproblem. The scope notes clarify that coordinate descent methods are excluded from the proximal and ADMM categories, reinforcing OBCD's distinct methodological niche within the taxonomy structure.

Among the three contributions analyzed, the first two—OBCD algorithm and block-k stationarity—show no refutable candidates across ten examined papers each, suggesting these aspects are relatively novel within the limited search scope. The third contribution, convergence analysis with iteration complexity and non-ergodic rates, examined ten candidates and found five refutable pairs, indicating more substantial prior work on convergence guarantees in this setting. This pattern suggests the algorithmic framework and stationarity concept are less anticipated by the thirty candidates examined, while convergence analysis builds on established techniques.

Based on the top-30 semantic matches and taxonomy structure, the work appears to introduce a distinct algorithmic approach in a sparse research direction, with the core method and stationarity concept showing limited overlap. The convergence analysis, however, aligns with existing theoretical frameworks. The limited search scope means these observations reflect the examined candidates rather than an exhaustive field survey, and the sparse taxonomy leaf suggests fewer direct competitors in the coordinate descent paradigm for this problem class.

Taxonomy

Core-task Taxonomy Papers: 30
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 5

Research Landscape Overview

Core task: nonsmooth composite optimization under orthogonality constraints. This field addresses problems where one seeks to minimize a composite objective—often combining a smooth loss with a nonsmooth regularizer—while requiring the decision variables to lie on a Stiefel manifold or satisfy other orthogonality constraints.

The taxonomy reveals a rich landscape organized around several complementary strategies. Proximal and splitting methods, including ADMM-based approaches, decompose the problem to handle nonsmoothness and constraints separately, often alternating between subproblems that respect the manifold geometry. Smoothing and approximation methods replace nonsmooth terms with differentiable surrogates, enabling the use of Riemannian gradient techniques on the Stiefel manifold. Subgradient and penalty methods directly tackle nonsmoothness via generalized derivatives or penalize constraint violations, while block coordinate and specialized descent schemes exploit problem structure by updating variables in a partitioned fashion. Relaxation and approximation schemes offer tractable surrogates when exact enforcement is costly, and theoretical foundations provide stationarity concepts and convergence guarantees that underpin these algorithmic choices. Applications span tensor decomposition, clustering, and signal processing, and metaheuristic approaches offer global search alternatives when local methods struggle.

Recent work has intensified around ADMM variants and proximal gradient schemes that respect manifold structure. For instance, ADMM Orthogonality Constraints[1] and Riemannian ADMM[7] extend classical splitting to curved constraint sets, while Proximal Gradient Stiefel[4] and Proximal Gradient Stiefel[6] develop manifold-aware proximal operators. Smoothing techniques such as Smoothing Stiefel Manifold[5] and Smoothing Stiefel Minimization[8] trade exactness for differentiability, facilitating faster Riemannian optimization.
Within this landscape, Block Coordinate Descent Orthogonality[0] occupies a niche that blends coordinate-wise updates with orthogonality enforcement, offering a middle ground between full splitting (as in ADMM Orthogonality Constraints[1]) and global manifold descent (as in Proximal Gradient Stiefel[4]). By partitioning variables and cycling through blocks, it can exploit sparsity or separability while maintaining feasibility, a strategy that complements the more monolithic ADMM frameworks and the smoothing-based methods that prioritize gradient flow over decomposition.

Claimed Contributions

OBCD algorithm for nonsmooth composite optimization under orthogonality constraints

The authors introduce OBCD, a block coordinate descent algorithm that updates k rows of the solution matrix per iteration by solving a small nonsmooth optimization problem under orthogonality constraints. The method maintains feasibility and has low per-iteration computational cost.
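The feasibility claim can be illustrated with a small sketch. One natural way a row-block update can preserve orthogonality (our assumption about the update form, not taken verbatim from the paper) is to left-multiply the k selected rows by a small k × k orthogonal matrix, since this mixes only those rows among themselves:

```python
import numpy as np

# Hypothetical sketch of one OBCD-style update: pick k rows of a feasible
# matrix X (X^T X = I) and left-multiply them by a small k x k orthogonal
# matrix V. Here V is random; in OBCD it would instead be chosen by
# globally solving a small nonsmooth subproblem over orthogonal matrices.
rng = np.random.default_rng(0)

n, r, k = 6, 3, 2
# A feasible starting point: the Q factor of a random matrix lies on the
# Stiefel manifold (Q^T Q = I_r).
X, _ = np.linalg.qr(rng.standard_normal((n, r)))

B = [1, 4]                                        # indices of the k rows to update
V, _ = np.linalg.qr(rng.standard_normal((k, k)))  # a k x k orthogonal matrix

X_new = X.copy()
X_new[B, :] = V @ X[B, :]                         # row-block rotation

# Feasibility is preserved exactly (up to floating point):
# (V X_B)^T (V X_B) = X_B^T V^T V X_B = X_B^T X_B
print(np.allclose(X_new.T @ X_new, np.eye(r)))    # True
```

Because only a k × k subproblem is solved per iteration, the per-iteration cost stays small regardless of the ambient dimension n, which matches the "small computational footprint" described above.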

10 retrieved papers
Block-k stationary points with stronger optimality guarantees

The authors define a new optimality condition called block-k stationary points and prove that these points provide stronger optimality guarantees than standard critical points used in existing methods for optimization on the Stiefel manifold.
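From the description above, one plausible formalization of this condition (an assumption on our part, consistent with the row-block update mechanism but not quoted from the paper) is:

```latex
% Sketch of a block-$k$ stationarity condition (assumed formalization).
% $F$ is the composite objective; $\bar X \in \mathbb{R}^{n \times r}$ is
% feasible, i.e. $\bar X^\top \bar X = I_r$.
\[
\bar X \text{ is block-}k\text{ stationary} \iff
\forall\, B \subseteq \{1,\dots,n\},\ |B| = k:\quad
I_k \in \operatorname*{arg\,min}_{V^\top V = I_k} F\big(X_B(V)\big),
\]
\[
\text{where } X_B(V) \text{ agrees with } \bar X \text{ off } B
\text{ and replaces the rows } \bar X_{B,:} \text{ by } V\,\bar X_{B,:}.
\]
```

Under this reading the condition is stronger than standard criticality: it requires global optimality of every k-row subproblem, whereas a critical point need only admit no first-order descent direction on the manifold.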

10 retrieved papers
Convergence analysis with iteration complexity and non-ergodic rates

The authors establish that OBCD achieves ε-block-k stationary points with O(1/ε) iteration complexity and derive non-ergodic (last-iterate) convergence rates under the Kurdyka-Łojasiewicz inequality, including finite-step, linear, and sublinear convergence depending on the KL exponent.
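For orientation, the finite/linear/sublinear pattern described matches the standard KL-based rate trichotomy (stated here as a general Attouch-Bolte-style template under an assumed KL exponent θ, not quoted from the paper):

```latex
% Standard KL-exponent trichotomy, assumed to be the template behind the
% rates described above. $\{X^t\}$ is the iterate sequence, $\bar X$ its
% limit, and $\theta \in [0,1)$ the KL exponent at $\bar X$.
\[
\begin{cases}
\theta = 0: & \text{convergence in a finite number of steps},\\[2pt]
\theta \in \big(0, \tfrac12\big]: & \|X^t - \bar X\| \le c\,\rho^{\,t}
  \ \text{ for some } \rho \in (0,1) \ \text{(linear rate)},\\[2pt]
\theta \in \big(\tfrac12, 1\big): & \|X^t - \bar X\| \le c\, t^{-\frac{1-\theta}{2\theta-1}}
  \ \text{(sublinear rate)}.
\end{cases}
\]
```

This is consistent with the report's note that the convergence analysis builds on established KL-based techniques.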

10 retrieved papers
Can Refute

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

OBCD algorithm for nonsmooth composite optimization under orthogonality constraints

The authors introduce OBCD, a block coordinate descent algorithm that updates k rows of the solution matrix per iteration by solving a small nonsmooth optimization problem under orthogonality constraints. The method maintains feasibility and has low per-iteration computational cost.

Contribution

Block-k stationary points with stronger optimality guarantees

The authors define a new optimality condition called block-k stationary points and prove that these points provide stronger optimality guarantees than standard critical points used in existing methods for optimization on the Stiefel manifold.

Contribution

Convergence analysis with iteration complexity and non-ergodic rates

The authors establish that OBCD achieves ε-block-k stationary points with O(1/ε) iteration complexity and derive non-ergodic (last-iterate) convergence rates under the Kurdyka-Łojasiewicz inequality, including finite-step, linear, and sublinear convergence depending on the KL exponent.