Converge Faster, Talk Less: Hessian-Informed Federated Zeroth-Order Optimization

ICLR 2026 Conference Submission · Anonymous Authors
Keywords: Zeroth-Order Optimization, Federated Optimization, Hessian
Abstract:

Zeroth-order (ZO) optimization enables dimension-free communication in federated learning (FL), making it attractive for fine-tuning large language models (LLMs) thanks to significant communication savings. However, existing ZO-FL methods largely overlook curvature information, despite its well-established benefits for convergence acceleration. To address this, we propose HiSo, a Hessian-informed ZO federated optimization method that accelerates convergence by leveraging global diagonal Hessian approximations while strictly preserving scalar-only communication, without transmitting any second-order information. Theoretically, for non-convex functions, we show that HiSo can achieve an accelerated convergence rate that is independent of the Lipschitz constant L and the model dimension d under suitable Hessian approximation assumptions, offering a plausible explanation for the observed phenomenon of ZO convergence being much faster than its worst-case O(d) bound. Empirically, across diverse LLM fine-tuning benchmarks, HiSo delivers a 1–5× speedup in communication rounds over existing state-of-the-art ZO-FL baselines. This superior convergence not only cuts communication costs but also provides strong empirical evidence that Hessian information acts as an effective accelerator in federated ZO optimization settings.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes a paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes HiSo, a federated zeroth-order optimization method that leverages diagonal Hessian approximations to accelerate convergence while maintaining scalar-only communication. It occupies the 'Scalar-Only Communication Frameworks with Hessian Acceleration' leaf, which contains only three papers including this work. This represents a sparse research direction within the broader taxonomy of 23 papers, suggesting the intersection of dimension-free communication and Hessian-informed federated zeroth-order methods remains relatively unexplored compared to adjacent areas like centralized Hessian-aware methods or distributed consensus algorithms.

The taxonomy reveals neighboring directions including incremental Hessian estimation for federated zeroth-order optimization and Hessian approximation methods using compression or sketching. The original paper diverges from these by avoiding any second-order information transmission, contrasting with approaches like Hessian-weighted aggregation or eigenvector sharing that require richer communication primitives. Its closest structural neighbors are centralized Hessian-aware zeroth-order methods, which achieve similar convergence benefits but lack the federated communication constraints. The taxonomy boundaries clarify that HiSo sits at the intersection of federated learning efficiency demands and curvature exploitation, distinct from pure gradient-free methods without second-order awareness.

Among the three analyzed contributions, the core HiSo algorithm examined ten candidate papers, with two appearing to provide overlapping prior work. The dimension-independent convergence rate contribution examined eight candidates, with one potentially refuting its novelty claims. The generalized scalar-only communication framework examined only two candidates with no clear refutations. Given the limited search scope of twenty total candidates examined, these statistics suggest the HiSo algorithm and convergence analysis face more substantial prior work overlap than the communication framework abstraction. The small candidate pool indicates these findings reflect top-semantic-match proximity rather than exhaustive field coverage.

Based on examination of twenty semantically related papers, the work appears to occupy a relatively sparse research niche at the intersection of federated learning, zeroth-order optimization, and Hessian acceleration. The scalar-only communication framework shows less prior work overlap, while the algorithmic and theoretical contributions encounter more existing research. The analysis provides initial context but cannot definitively assess novelty given the constrained search scope and the field's evolving nature around communication-efficient federated optimization for large models.

Taxonomy

Core-task Taxonomy Papers: 23
Claimed Contributions: 3
Contribution Candidate Papers Compared: 20
Refutable Papers: 3

Research Landscape Overview

Core task: Hessian-informed zeroth-order optimization in federated learning. The field combines gradient-free optimization with distributed learning, structured around four main branches. Federated Learning with Hessian-Informed Zeroth-Order Methods addresses privacy-preserving distributed training where clients cannot share gradients but can exploit curvature information, with works exploring scalar-only communication frameworks and Hessian-accelerated aggregation strategies like Hessian Weighted Aggregation[5]. Centralized Hessian-Aware Zeroth-Order Methods focus on single-machine settings where Hessian approximations improve convergence, exemplified by Hessian Aware Zeroth[1] and techniques for flat minima discovery such as Zeroth Order Flat Minima[2]. Distributed and Multi-Agent Zeroth-Order Optimization tackles coordination challenges in multi-agent systems, including consensus-based approaches like Zeroth Proximal Consensus[17]. Specialized Applications and Extensions cover domain-specific adaptations, from large language model fine-tuning in Hessian Zeroth LLM[7] to adversarial robustness in Hessian Adversarial Attack[23].

Recent activity centers on communication-efficient federated schemes and scalable Hessian approximations. A key tension emerges between methods that transmit full curvature information versus those achieving extreme communication reduction through scalar exchanges, as seen in Flecs[3] and HiSo[16]. Another active direction involves low-rank and subspace techniques like Low Rank Hessian[11] and Subspace Hessian Zeroth[13] that balance computational cost with second-order benefits.

Hessian Federated Zeroth[0] sits within the scalar-only communication cluster, closely aligned with Hessian Scalar Communication[15] and HiSo[16], emphasizing minimal bandwidth overhead while preserving curvature-guided convergence. Compared to neighbors, it appears to prioritize practical federated deployment constraints over the richer but more communication-intensive strategies explored in works like Hessian Eigenvector Sharing[14], reflecting ongoing debates about the optimal trade-off between communication efficiency and convergence acceleration in privacy-sensitive distributed settings.

Claimed Contributions

Generalized scalar-only communication FL framework

The authors introduce a generalized federated learning framework that decouples scalar-only communication from vanilla ZO-SGD, enabling integration of more sophisticated optimization algorithms while maintaining dimension-free communication. This framework extends beyond the limitations of prior work (DeComFL) by supporting various optimization techniques.
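The mechanism that makes scalar-only communication possible can be sketched as follows. This is a minimal single-client simulation under our own assumptions, not the paper's (or DeComFL's) implementation: client and server share a PRNG seed, so the d-dimensional perturbation never crosses the network; only one scalar finite-difference value per round does.

```python
import numpy as np

def zo_scalar_round(f, x, seed, mu=1e-3, lr=0.02):
    """One round of seed-based, scalar-only ZO-SGD (illustrative sketch).

    Client and server share only `seed` and one scalar per round; the
    d-dimensional perturbation is regenerated locally on both sides.
    """
    # Client: regenerate the perturbation from the shared seed,
    # evaluate two losses, and send back a single scalar.
    z = np.random.default_rng(seed).standard_normal(x.shape)
    g = (f(x + mu * z) - f(x - mu * z)) / (2 * mu)  # directional-derivative estimate

    # Server: rebuild the identical z from the same seed and apply the update.
    z_server = np.random.default_rng(seed).standard_normal(x.shape)
    return x - lr * g * z_server

# Usage: minimize a toy quadratic with scalar-only "communication".
# Note the small learning rate: plain ZO-SGD needs lr on the order of 1/d,
# which is exactly the dimension dependence HiSo targets.
f = lambda v: float(np.sum(v ** 2))
x = np.ones(20)
for t in range(300):
    x = zo_scalar_round(f, x, seed=t)
```

The key design point is that the update rule is a scalar times a reproducible random vector, which is why the framework can swap in richer optimizers (momentum, preconditioning) without changing what is transmitted.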

2 retrieved papers

HiSo algorithm for Hessian-informed federated ZO optimization

The authors propose HiSo, a novel federated optimization method that leverages global diagonal Hessian approximations to accelerate convergence while strictly preserving scalar-only communication. The method captures curvature information without transmitting any Hessian-related data, achieving significant speedups over existing ZO-FL baselines.
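The report does not reproduce HiSo's estimator, so the following is an illustrative stand-in rather than the paper's method: we estimate a diagonal Hessian by coordinate-wise second differences (an O(d)-evaluation shortcut HiSo itself would avoid), then perturb and update along H^{-1/2}-scaled directions. That is one standard way to inject diagonal curvature into a ZO step while the exchanged quantity remains a single scalar.

```python
import numpy as np

def diag_hessian_estimate(f, x, delta=1e-2):
    """Finite-difference diagonal Hessian estimate (illustrative stand-in;
    HiSo avoids this O(d) evaluation cost with its own approximation)."""
    d = x.size
    fx = f(x)
    h = np.empty(d)
    for i in range(d):
        e = np.zeros(d)
        e[i] = delta
        h[i] = (f(x + e) - 2 * fx + f(x - e)) / delta ** 2
    return np.maximum(h, 1e-6)  # clip away non-positive curvature

def hiso_like_step(f, x, h, seed, mu=1e-3, lr=0.045):
    """ZO step that perturbs AND updates along H^{-1/2}-scaled directions.
    In an FL deployment only the scalar g and the seed would be exchanged."""
    z = np.random.default_rng(seed).standard_normal(x.shape)
    p = z / np.sqrt(h)                               # preconditioned direction
    g = (f(x + mu * p) - f(x - mu * p)) / (2 * mu)   # the scalar "communication"
    return x - lr * g * p

# Ill-conditioned toy quadratic: curvature differs by 100x across coordinates.
scales = np.array([100.0] * 5 + [1.0] * 15)
f = lambda v: float(np.sum(scales * v ** 2))
x = np.ones(20)
h = diag_hessian_estimate(f, x)
for t in range(400):
    x = hiso_like_step(f, x, h, seed=t)
```

Preconditioning both the perturbation and the update makes the dynamics equivalent to isotropic ZO-SGD on a rescaled, well-conditioned problem, which is the intuition behind curvature information removing the Lipschitz-constant dependence from the rate.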

10 retrieved papers
Can Refute

Dimension-independent convergence rate for non-convex federated ZO optimization

The authors establish theoretical convergence guarantees showing that HiSo achieves a rate independent of both model dimension d and Lipschitz constant L under Hessian approximation assumptions. This represents the first dimension-independent convergence result for zeroth-order methods in federated learning and extends theoretical guarantees to multiple local updates.
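For context, a schematic contrast under the standard non-convex L-smooth setting (not the paper's exact theorem statement): classical analyses of zeroth-order gradient descent with two-point estimators give bounds whose constants scale with both the dimension and the smoothness constant, e.g.

\[
\min_{0 \le t < T} \|\nabla f(x_t)\|^2 \;=\; \mathcal{O}\!\left(\frac{d \, L \, \bigl(f(x_0) - f^\ast\bigr)}{T}\right),
\]

so iteration complexity grows linearly in \(d\). The claimed HiSo result replaces the \(d \cdot L\) factor with quantities controlled by the quality of the diagonal Hessian approximation; that replacement is what "dimension-independent" refers to here.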

8 retrieved papers
Can Refute

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Generalized scalar-only communication FL framework

The authors introduce a generalized federated learning framework that decouples scalar-only communication from vanilla ZO-SGD, enabling integration of more sophisticated optimization algorithms while maintaining dimension-free communication. This framework extends beyond the limitations of prior work (DeComFL) by supporting various optimization techniques.

Contribution

HiSo algorithm for Hessian-informed federated ZO optimization

The authors propose HiSo, a novel federated optimization method that leverages global diagonal Hessian approximations to accelerate convergence while strictly preserving scalar-only communication. The method captures curvature information without transmitting any Hessian-related data, achieving significant speedups over existing ZO-FL baselines.

Contribution

Dimension-independent convergence rate for non-convex federated ZO optimization

The authors establish theoretical convergence guarantees showing that HiSo achieves a rate independent of both model dimension d and Lipschitz constant L under Hessian approximation assumptions. This represents the first dimension-independent convergence result for zeroth-order methods in federated learning and extends theoretical guarantees to multiple local updates.