Batch and Sequential Unlearning for Neural Networks

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: machine unlearning, second-order unlearning
Abstract:

With the increasing deployment of machine learning models trained on personal data, machine unlearning has become crucial for data owners to exercise their "right to be forgotten" and protect their privacy. While model owners can retrain their models without the erased data to achieve this goal, this process is often prohibitively expensive. Previous works have shown that Newton's method can be applied to linear models to unlearn multiple data points in batch (batch unlearning) with minimal iterations. However, adapting this method to non-linear models, such as neural networks, poses significant challenges due to the presence of degenerate Hessians. This problem becomes more pronounced when unlearning is performed sequentially (sequential unlearning). Existing techniques that attempt to tackle this degeneracy often 1) produce unlearning updates with excessively large norms, yielding unsatisfactory unlearning performance, and 2) require manual tuning of regularization hyperparameters. In this work, we propose new unlearning algorithms that leverage cubic regularization of Newton's method to address both challenges. We discuss the theoretical benefits of our method and empirically show that our algorithms can efficiently achieve competitive performance in both batch and sequential unlearning on real-world datasets.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

This paper contributes two unlearning algorithms (CuReNU and StoCuReNU) that apply cubic regularization to Newton's method for handling degenerate Hessians in neural network unlearning. It sits in the 'Cubic Regularization for Degenerate Hessians' leaf of the taxonomy, which currently contains only this work. This placement indicates a sparse research direction within the broader Hessian-Based Newton Methods branch, suggesting that cubic regularization for unlearning is a relatively unexplored technique compared to neighboring areas.

The taxonomy reveals that adjacent leaves contain related but distinct techniques: 'Standard Newton Updates' includes direct Hessian inversion approaches, while 'Hessian Inverse Approximation Techniques' encompasses low-rank updates and conjugate gradient methods. The parent branch also contains Hessian-Free Approaches using randomized approximations or Gauss-Newton formulations. The scope notes clarify that methods avoiding Hessian computation belong elsewhere, while this work explicitly addresses Hessian degeneracy through regularization. This positioning suggests the paper bridges standard Newton methods with practical challenges arising from ill-conditioned curvature geometry.

Among the three identified contributions, the literature search examined 21 candidates in total. For the first contribution (identifying Hessian degeneracy), 10 candidates were examined, of which 1 appears to provide overlapping prior work. For the second contribution (the CuReNU/StoCuReNU algorithms), 1 candidate was examined and can refute novelty. For the third contribution (scalable Hessian-free implementation), 10 candidates were examined, of which 2 can potentially refute it. These statistics reflect a limited search scope focused on semantic neighbors rather than exhaustive coverage. The algorithmic contribution appears most vulnerable to prior-work overlap, while the degeneracy analysis and scalability aspects show more novelty among the examined candidates.

Based on examination of 21 semantically similar papers, the work appears to occupy a relatively sparse position within second-order unlearning methods, though the limited search scope prevents definitive assessment. The taxonomy structure suggests cubic regularization represents an underexplored direction compared to standard Newton updates or Hessian-free alternatives. However, the refutable pairs identified indicate that key elements—particularly the algorithmic framework and implementation strategies—may have meaningful overlap with existing techniques in the broader second-order optimization landscape.

Taxonomy

Core-task taxonomy papers: 50
Claimed contributions: 3
Contribution candidate papers compared: 21
Refutable papers: 4

Research Landscape Overview

Core task: machine unlearning for neural networks using second-order methods. The field addresses the challenge of efficiently removing the influence of specific training data from trained neural networks without full retraining.

The taxonomy is organized around several key dimensions. Second-Order Optimization Foundations for Unlearning encompasses Hessian-based Newton methods and related curvature-aware techniques, including approaches that handle degenerate Hessians through cubic regularization or low-rank approximations, as seen in works like Newton Unlearning[28] and Hessian Low-Rank Perturbation[22]. Theoretical Guarantees and Certification focuses on formal verification and certified removal, exemplified by Certified Unlearning[2] and Hessian-Free Certified Unlearning[20]. Application Domains and Model Architectures spans diverse settings from recommender systems (Fast Recommender Forgetting[25]) to large language models (Second-Order LLM Unlearning[19], Soul LLM Unlearning[8]). Related Learning Paradigms connects unlearning to continual learning and meta-learning, while Robustness and Security examines adversarial threats such as malicious unlearning requests (Malicious Unlearning Attacks[12]).

A central tension emerges between computational efficiency and theoretical rigor: second-order methods promise faster convergence and better approximations to exact retraining, yet computing or approximating Hessians remains expensive for large models. Many studies explore trade-offs between accuracy and scalability, with some leveraging randomized or Hessian-free techniques (Randomized Hessians[14], Hessian-Free Curvature[41]) to reduce overhead.

Batch Sequential Unlearning[0] sits within the Hessian-Based Newton Methods branch, specifically addressing scenarios with degenerate Hessians through cubic regularization. This positions it closely alongside Newton Unlearning[28] and Gauss-Newton Unlearning[47], which similarly exploit second-order curvature information. Compared to first-order or influence-based alternatives like Selective Influence Unlearning[26], it emphasizes principled handling of ill-conditioned geometry, a recurring challenge when forget sets are small or atypical relative to the full training distribution.

Claimed Contributions

Identification of Hessian degeneracy as a fundamental issue in Newton unlearning for neural networks

The authors demonstrate that Hessian degeneracy (many zero and near-zero eigenvalues) is a fundamental but often-overlooked problem in Newton unlearning for neural networks. They show that common baselines like pseudo-inverse and damping fail to address this issue effectively.

10 retrieved papers (status: can refute)
CuReNU and StoCuReNU unlearning algorithms based on cubic regularization

The authors introduce two novel unlearning algorithms that automatically determine the optimal damping factor for Newton unlearning using cubic regularization. CuReNU and StoCuReNU provide convergence guarantees to epsilon-second-order stationary points, addressing the Hessian degeneracy problem.

1 retrieved paper (status: can refute)
Scalable Hessian-free implementation with constant memory usage

The authors develop StoCuReNU as a scalable variant that uses Hessian-vector products instead of explicit Hessian storage, achieving constant memory usage of O(2d) compared to O(dn) in existing Hessian-free methods, while avoiding approximation errors.

10 retrieved papers (status: can refute)

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current TopK core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape it appears structurally isolated, which is one partial signal of novelty, though that signal is still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Identification of Hessian degeneracy as a fundamental issue in Newton unlearning for neural networks

The authors demonstrate that Hessian degeneracy (many zero and near-zero eigenvalues) is a fundamental but often-overlooked problem in Newton unlearning for neural networks. They show that common baselines like pseudo-inverse and damping fail to address this issue effectively.
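The failure mode described here can be illustrated with toy numbers (chosen for illustration, not taken from the paper): when the Hessian has near-zero eigenvalues, the pseudo-inverse Newton step inverts them directly and blows up, while fixed damping bounds the step only at the cost of a hand-tuned hyperparameter.

```python
import numpy as np

# Toy degenerate Hessian: one well-conditioned mode, one near-zero
# eigenvalue, and one exactly-zero eigenvalue (illustrative values).
H = np.diag([2.0, 1e-8, 0.0])
g = np.array([1.0, 1.0, 1.0])

# Pseudo-inverse Newton step s = -H^+ g: the 1e-8 eigenvalue is
# inverted directly, so the update norm explodes to ~1e8.
s_pinv = -np.linalg.pinv(H) @ g
norm_pinv = np.linalg.norm(s_pinv)

# Fixed damping (H + lam*I)^{-1} bounds the step, but lam is a
# manually tuned hyperparameter with no principled choice.
lam = 1e-2
s_damped = -np.linalg.solve(H + lam * np.eye(3), g)
norm_damped = np.linalg.norm(s_damped)
```

The contrast between the two norms is the degeneracy problem in miniature: pseudo-inversion yields an excessively large update, and damping trades that for a tuning burden.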

Contribution

CuReNU and StoCuReNU unlearning algorithms based on cubic regularization

The authors introduce two novel unlearning algorithms that automatically determine the optimal damping factor for Newton unlearning using cubic regularization. CuReNU and StoCuReNU provide convergence guarantees to epsilon-second-order stationary points, addressing the Hessian degeneracy problem.
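As a rough sketch of the cubic-regularization idea (a generic small-scale implementation, not the authors' CuReNU code): the cubic subproblem min_s g^T s + (1/2) s^T H s + (sigma/3) ||s||^3 has a minimizer satisfying (H + lam*I) s = -g with lam = sigma * ||s||, so the damping factor lam falls out of a one-dimensional root-finding problem rather than manual tuning, and stays positive even when H is degenerate.

```python
import numpy as np

def cubic_newton_step(grad, hess, sigma=1.0, tol=1e-10):
    """One cubic-regularized Newton step: minimize
    m(s) = g^T s + 0.5 s^T H s + (sigma/3) ||s||^3.
    The minimizer satisfies (H + lam*I) s = -g with lam = sigma*||s||,
    giving a data-driven damping factor even for degenerate H."""
    eigvals, Q = np.linalg.eigh(hess)
    g_hat = Q.T @ grad

    def step_norm(lam):
        # ||s(lam)|| for s(lam) = -(H + lam*I)^{-1} g, in the eigenbasis
        return np.linalg.norm(g_hat / (eigvals + lam))

    # lam must exceed -lambda_min(H) and solve lam = sigma * ||s(lam)||;
    # sigma*||s(lam)|| - lam is decreasing, so bisection applies.
    lo = max(0.0, -eigvals.min()) + 1e-12
    hi = lo + 1.0
    while sigma * step_norm(hi) > hi:  # expand until a sign change brackets lam
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sigma * step_norm(mid) > mid:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    s = -(Q @ (g_hat / (eigvals + lam)))
    return s, lam

# Demo on a degenerate Hessian (toy values): lam > 0 is found
# automatically, with no hand-tuned damping schedule.
H = np.diag([2.0, 0.5, 0.0])
g = np.ones(3)
s, lam = cubic_newton_step(g, H, sigma=1.0)
```

The eigendecomposition keeps the sketch simple; it is only practical for small models, which is exactly the limitation that motivates the stochastic, Hessian-free variant.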

Contribution

Scalable Hessian-free implementation with constant memory usage

The authors develop StoCuReNU as a scalable variant that uses Hessian-vector products instead of explicit Hessian storage, achieving constant memory usage of O(2d) compared to O(dn) in existing Hessian-free methods, while avoiding approximation errors.
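The Hessian-vector-product idea behind the constant-memory claim can be sketched as follows (a generic finite-difference HVP on an assumed toy ridge-regression loss, not the authors' implementation): only gradient evaluations and O(d) vectors are ever held in memory, never the d-by-d Hessian.

```python
import numpy as np

def hvp_finite_diff(grad_fn, w, v, eps=1e-5):
    """Hessian-vector product H(w) @ v without forming H:
    central finite difference of the gradient. Only O(d) vectors
    are kept in memory, mirroring the constant-memory idea behind
    Hessian-free second-order unlearning variants."""
    return (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2 * eps)

# Assumed toy loss for illustration: ridge regression
# L(w) = 0.5*||Xw - y||^2 + 0.5*reg*||w||^2, whose exact Hessian
# is X^T X + reg*I, so the approximation can be checked directly.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
y = rng.normal(size=50)
reg = 0.1
grad_fn = lambda w: X.T @ (X @ w - y) + reg * w

w = rng.normal(size=8)
v = rng.normal(size=8)
hv_approx = hvp_finite_diff(grad_fn, w, v)
hv_exact = (X.T @ X + reg * np.eye(8)) @ v
```

In deep-learning frameworks the same product is usually computed exactly via automatic differentiation (Pearlmutter's trick) rather than finite differences; the memory profile is the same in either case.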