Adaptive gradient descent on Riemannian manifolds and its applications to Gaussian variational inference
Overview
Overall Novelty Assessment
The paper proposes RAdaGD, a family of adaptive gradient descent methods for Riemannian manifolds that achieve a non-ergodic O(1/k) convergence rate under local geodesic smoothness and generalized geodesic convexity. It resides in the Deterministic Adaptive Gradient Descent leaf, which contains only three papers in total, including the work under review. This leaf sits within the broader Core Adaptive Gradient Methods branch, indicating a relatively sparse research direction focused on deterministic settings with rigorous convergence guarantees, in contrast to the more crowded stochastic variants.
The taxonomy reveals that the paper's immediate neighbors include Gradient Lower Bounded and Adaptive Gradient Nonnegative Curvature, both emphasizing geometric regularity conditions for convergence. The sibling category Stochastic Adaptive Gradient Methods contains six papers addressing mini-batch and Adam-like algorithms, reflecting a more active research direction. Nearby branches such as Second-Order Methods and Energy-Adaptive Methods explore alternative geometric frameworks, while the Specialized Problem Formulations branch addresses bilevel and minimax settings. The deterministic leaf's scope explicitly excludes stochastic variants and variance reduction, positioning this work within a narrower but theoretically focused niche.
Among the twenty candidates examined, the Gaussian Variational Inference contribution has one potentially refuting candidate, suggesting that prior work may already address convergence without L-smoothness under certain conditions. The RAdaGD algorithm itself was compared against four candidates with zero refutations, indicating potential novelty in its adaptive step-size mechanism. The local geodesic smoothness framework was checked against ten candidates without a clear refutation, though the limited search scope means broader prior work may exist. These statistics reflect a targeted semantic search rather than exhaustive coverage, so the apparent novelty should be interpreted cautiously.
Based on the limited search of twenty candidates, the work appears to occupy a sparsely populated deterministic niche within a broader field that increasingly emphasizes stochastic methods. The taxonomy structure suggests the deterministic adaptive gradient direction receives less attention than stochastic counterparts, though the search scope does not capture the full landscape of Riemannian optimization literature.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce RAdaGD, a family of line-search-free adaptive gradient descent algorithms for Riemannian optimization. These methods automatically tune step sizes and achieve a non-ergodic convergence rate of O(1/k) under local geodesic smoothness and generalized geodesic convexity, which is claimed to be the first such rate for Riemannian adaptive methods.
The authors apply RAdaGD to Gaussian Variational Inference and claim to provide the first algorithm with provable convergence guarantees when the target log-density is not globally L-smooth, requiring only a weaker growth condition and additional technical assumptions.
The authors establish that their convergence analysis relies on local geodesic smoothness rather than global L-smoothness, which broadens the class of applicable functions. They prove that every twice continuously differentiable function on a complete Riemannian manifold satisfies local geodesic smoothness.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[1] Adaptive Gradient Descent on Riemannian Manifolds with Nonnegative Curvature
[49] Gradient method for optimization on Riemannian manifolds with lower bounded curvature
Contribution Analysis
Detailed comparisons for each claimed contribution
RAdaGD: Adaptive gradient descent on Riemannian manifolds
The authors introduce RAdaGD, a family of line-search-free adaptive gradient descent algorithms for Riemannian optimization. These methods automatically tune step sizes and achieve a non-ergodic convergence rate of O(1/k) under local geodesic smoothness and generalized geodesic convexity, which is claimed to be the first such rate for Riemannian adaptive methods. A schematic sketch of this kind of adaptive step-size rule appears after the comparison list below.
[30] Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds
[40] Adaptive Preconditioned Gradient Descent with Energy
[57] A Riemannian Optimization Perspective of the Gauss-Newton Method for Feedforward Neural Networks
[65] Generalized Steepest Descent Methods on Riemannian Manifolds and Hilbert Spaces: Convergence Analysis and Stochastic Extensions
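To make the claimed mechanism concrete, the following is a minimal sketch, not the authors' algorithm: it runs line-search-free gradient descent on the unit sphere and estimates a local smoothness constant from successive iterates and gradients, in the spirit of adaptive step-size rules such as Malitsky and Mishchenko's. The manifold choice, the projection-based stand-in for parallel transport, and the specific step-size cap are all illustrative assumptions.

```python
import numpy as np

def proj_tangent(x, v):
    """Project an ambient vector v onto the tangent space of the unit sphere at x."""
    return v - np.dot(x, v) * x

def exp_map(x, v):
    """Exponential map on the unit sphere: move from x along the geodesic with velocity v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return x
    return np.cos(norm_v) * x + np.sin(norm_v) * (v / norm_v)

def adaptive_riemannian_gd(euclid_grad, x0, n_iters=300):
    """Line-search-free descent; the step size tracks an empirical local smoothness estimate."""
    x = x0
    step = 1e-6                                # tiny initial step, refined on the fly
    g = proj_tangent(x, euclid_grad(x))
    for _ in range(n_iters):
        x_new = exp_map(x, -step * g)
        g_new = proj_tangent(x_new, euclid_grad(x_new))
        g_old = proj_tangent(x_new, g)         # projection as a stand-in for parallel transport
        dist = np.arccos(np.clip(np.dot(x, x_new), -1.0, 1.0))  # geodesic distance on the sphere
        diff = np.linalg.norm(g_new - g_old)
        if diff > 1e-12:
            # Empirical local smoothness L_k ~ diff / dist; cap the step near 1/(2 L_k)
            # while allowing at most a doubling per iteration.
            step = min(2.0 * step, dist / (2.0 * diff))
        else:
            step = 2.0 * step
        x, g = x_new, g_new
    return x

# Usage: minimize f(x) = x' A x on the sphere; the minimizer is the bottom eigenvector of A.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
A = A + A.T
x0 = rng.standard_normal(5)
x0 /= np.linalg.norm(x0)
x_star = adaptive_riemannian_gd(lambda x: 2.0 * A @ x, x0)
print(x_star @ A @ x_star, np.linalg.eigvalsh(A)[0])  # the two values should nearly match
```

The design point the sketch illustrates is that the step size is driven by the observed ratio of geodesic distance to gradient change, an empirical local smoothness estimate, so no global L-smoothness constant is ever required.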
First convergence guarantee for GVI without L-smoothness
The authors apply RAdaGD to Gaussian Variational Inference and claim to provide the first algorithm with provable convergence guarantees when the target log-density is not globally L-smooth, requiring only a weaker growth condition and additional technical assumptions. A minimal worked example of this non-L-smooth regime appears after the comparison list below.
[59] Provable convergence guarantees for black-box variational inference
[60] Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein Space
[61] Towards Understanding the Dynamics of Gaussian-Stein Variational Gradient Descent
[62] The computational asymptotics of Gaussian variational inference and the Laplace approximation
[63] Decoupled variational Gaussian inference
[64] Validated Variational Inference via Practical Posterior Error Bounds
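As a concrete instance of the regime this claim targets, here is a minimal one-dimensional sketch, not the authors' method: the target pi(x) proportional to exp(-x^4/4) has a log-density whose second derivative V''(x) = 3x^2 is unbounded, so no global L-smoothness constant exists, yet Gaussian VI over the mean m and variance s is tractable because the objective has closed-form Gaussian moments. The fixed step size and the multiplicative geodesic update in s are illustrative simplifications, not the RAdaGD rule.

```python
import numpy as np

# Gaussian VI for pi(x) ~ exp(-x^4/4): minimize, up to an additive constant,
#   F(m, s) = E_{N(m,s)}[x^4 / 4] - (1/2) log s     (= KL(N(m,s) || pi) + const)
# using the closed-form moment E[Z^4] = m^4 + 6 m^2 s + 3 s^2 for Z ~ N(m, s).

def grad_F(m, s):
    dF_dm = m**3 + 3.0 * m * s            # d/dm of (m^4 + 6 m^2 s + 3 s^2) / 4
    dF_ds = 1.5 * (m**2 + s) - 0.5 / s    # d/ds of the moment term plus the entropy term
    return dF_dm, dF_ds

m, s, step = 2.0, 4.0, 0.05
for _ in range(500):
    gm, gs = grad_F(m, s)
    m -= step * gm                        # Euclidean step in the mean
    s *= np.exp(-step * s * gs)           # geodesic step in s > 0 keeps the variance positive
print(m, s)  # expect m ~ 0 and s solving 1.5 s = 0.5 / s, i.e. s ~ 0.577
```

At the stationary point m = 0 the variance satisfies 1.5 s = 0.5/s, so the iterates should approach s = 3^{-1/2} ≈ 0.577; the multiplicative update is the exponential-map step for the positive reals under the log metric, which avoids any projection onto the constraint s > 0.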
Local geodesic smoothness framework for broader function classes
The authors establish that their convergence analysis relies on local geodesic smoothness rather than global L-smoothness, which broadens the class of applicable functions. They prove that every twice continuously differentiable function on a complete Riemannian manifold satisfies local geodesic smoothness.
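For readers unfamiliar with the condition, one common way such a local smoothness property is formalized is sketched below; this is a hedged reconstruction from the surrounding description, and the paper's exact definition may differ in its choice of neighborhoods or transports. The claim about twice continuously differentiable functions then follows the usual compactness argument: on a compact geodesic ball around a point x, the operator norm of the Riemannian Hessian of a C^2 function is bounded, and that bound can serve as L(x).

```latex
% Hedged reconstruction of a local geodesic smoothness condition (not verbatim
% from the paper): f is locally geodesically smooth if every x in M admits a
% neighborhood U_x and a constant L(x) >= 0 such that, along any geodesic
% gamma: [0,1] -> U_x with gamma(0) = x,
\[
  f(\gamma(t)) \;\le\; f(x)
  + t \,\big\langle \operatorname{grad} f(x),\, \dot\gamma(0) \big\rangle_x
  + \frac{L(x)\, t^2}{2}\, \big\lVert \dot\gamma(0) \big\rVert_x^2,
  \qquad t \in [0,1].
\]
% Global geodesic L-smoothness is the special case U_x = M with one constant L;
% f(x) = x^4 on the flat manifold R is locally but not globally geodesically
% smooth, which is why the GVI application above can dispense with a global constant.
```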