Batch and Sequential Unlearning for Neural Networks

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: machine unlearning, second-order unlearning
Abstract:

With the increasing deployment of machine learning models trained on personal data, machine unlearning has become crucial for data owners to exercise their "right to be forgotten" and protect their privacy. While model owners can retrain their models without the erased data to achieve this goal, this process is often prohibitively expensive. Previous works have shown that Newton's method can be applied to linear models to unlearn multiple data points in batch (batch unlearning) with minimal iterations. However, adapting this method to non-linear models, such as neural networks, poses significant challenges due to the presence of degenerate Hessians. This problem becomes more pronounced when unlearning is performed sequentially (sequential unlearning). Existing techniques that attempt to tackle this degeneracy often 1) produce unlearning updates with excessively large norms, yielding unsatisfactory unlearning performance, and 2) require manual tuning of regularization hyperparameters. In this work, we propose new unlearning algorithms that leverage cubic regularization of Newton's method to address both challenges. We discuss the theoretical benefits of our method and empirically show that our algorithms can efficiently achieve competitive performance in both batch and sequential unlearning on real-world datasets.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

This paper contributes two unlearning algorithms (CuReNU and StoCuReNU) that apply cubic regularization to Newton's method for handling degenerate Hessians in neural network unlearning. It sits in the 'Cubic Regularization for Degenerate Hessians' leaf of the taxonomy, which currently contains only this work. This placement indicates a sparse research direction within the broader Hessian-Based Newton Methods branch, suggesting that cubic regularization for unlearning is a relatively unexplored technique compared to neighboring areas.

The taxonomy reveals that adjacent leaves contain related but distinct techniques: 'Standard Newton Updates' includes direct Hessian inversion approaches, while 'Hessian Inverse Approximation Techniques' encompasses low-rank updates and conjugate gradient methods. The parent branch also contains Hessian-Free Approaches using randomized approximations or Gauss-Newton formulations. The scope notes clarify that methods avoiding Hessian computation belong elsewhere, while this work explicitly addresses Hessian degeneracy through regularization. This positioning suggests the paper bridges standard Newton methods with practical challenges arising from ill-conditioned curvature geometry.

Among the three identified contributions, the literature search examined 21 candidates in total. For the first contribution (identifying Hessian degeneracy), 10 candidates were examined, of which 1 appears to provide overlapping prior work. For the second contribution (the CuReNU/StoCuReNU algorithms), 1 candidate was examined and can refute novelty. For the third contribution (scalable Hessian-free implementation), 10 candidates were examined, of which 2 can potentially refute it. These statistics reflect a limited search scope focused on semantic neighbors rather than exhaustive coverage. The algorithmic contribution appears most vulnerable to prior-work overlap, while the degeneracy analysis and scalability aspects show more novelty among the examined candidates.

Based on examination of 21 semantically similar papers, the work appears to occupy a relatively sparse position within second-order unlearning methods, though the limited search scope prevents definitive assessment. The taxonomy structure suggests cubic regularization represents an underexplored direction compared to standard Newton updates or Hessian-free alternatives. However, the refutable pairs identified indicate that key elements—particularly the algorithmic framework and implementation strategies—may have meaningful overlap with existing techniques in the broader second-order optimization landscape.

Taxonomy

Core-task taxonomy papers: 50
Claimed contributions: 3
Contribution candidate papers compared: 21
Refutable papers: 4

Research Landscape Overview

Core task: machine unlearning for neural networks using second-order methods. The field addresses the challenge of efficiently removing the influence of specific training data from trained neural networks without full retraining.

The taxonomy is organized around several key dimensions. Second-Order Optimization Foundations for Unlearning encompasses Hessian-based Newton methods and related curvature-aware techniques, including approaches that handle degenerate Hessians through cubic regularization or low-rank approximations, as seen in works like Newton Unlearning[28] and Hessian Low-Rank Perturbation[22]. Theoretical Guarantees and Certification focuses on formal verification and certified removal, exemplified by Certified Unlearning[2] and Hessian-Free Certified Unlearning[20]. Application Domains and Model Architectures spans diverse settings from recommender systems (Fast Recommender Forgetting[25]) to large language models (Second-Order LLM Unlearning[19], Soul LLM Unlearning[8]). Related Learning Paradigms connects unlearning to continual learning and meta-learning, while Robustness and Security examines adversarial threats such as malicious unlearning requests (Malicious Unlearning Attacks[12]).

A central tension emerges between computational efficiency and theoretical rigor: second-order methods promise faster convergence and better approximations to exact retraining, yet computing or approximating Hessians remains expensive for large models. Many studies explore trade-offs between accuracy and scalability, with some leveraging randomized or Hessian-free techniques (Randomized Hessians[14], Hessian-Free Curvature[41]) to reduce overhead.

Batch Sequential Unlearning[0] sits within the Hessian-Based Newton Methods branch, specifically addressing scenarios with degenerate Hessians through cubic regularization. This positions it closely alongside Newton Unlearning[28] and Gauss-Newton Unlearning[47], which similarly exploit second-order curvature information. Compared to first-order or influence-based alternatives like Selective Influence Unlearning[26], it emphasizes principled handling of ill-conditioned geometry, a recurring challenge when forget sets are small or atypical relative to the full training distribution.

Claimed Contributions

Identification of Hessian degeneracy as a fundamental issue in Newton unlearning for neural networks

The authors demonstrate that Hessian degeneracy (many zero and near-zero eigenvalues) is a fundamental but often-overlooked problem in Newton unlearning for neural networks. They show that common baselines like pseudo-inverse and damping fail to address this issue effectively.

10 retrieved papers (status: can refute)
CuReNU and StoCuReNU unlearning algorithms based on cubic regularization

The authors introduce two novel unlearning algorithms that automatically determine the optimal damping factor for Newton unlearning using cubic regularization. CuReNU and StoCuReNU provide convergence guarantees to epsilon-second-order stationary points, addressing the Hessian degeneracy problem.

1 retrieved paper (status: can refute)
Scalable Hessian-free implementation with constant memory usage

The authors develop StoCuReNU as a scalable variant that uses Hessian-vector products instead of explicit Hessian storage, achieving constant memory usage of O(2d) compared to O(dn) in existing Hessian-free methods, while avoiding approximation errors.

10 retrieved papers (status: can refute)

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current TopK core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape it appears structurally isolated, which is one partial signal of novelty, though that signal is still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Identification of Hessian degeneracy as a fundamental issue in Newton unlearning for neural networks

The authors demonstrate that Hessian degeneracy (many zero and near-zero eigenvalues) is a fundamental but often-overlooked problem in Newton unlearning for neural networks. They show that common baselines like pseudo-inverse and damping fail to address this issue effectively.
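The failure mode described here can be illustrated with toy numbers (chosen for illustration, not taken from the paper): when the Hessian has near-zero eigenvalues, the pseudo-inverse Newton step inverts them directly and blows up, while fixed damping bounds the step only at the cost of a hand-tuned hyperparameter.

```python
import numpy as np

# Toy degenerate Hessian: one well-conditioned mode, one near-zero
# eigenvalue, and one exactly-zero eigenvalue (illustrative values).
H = np.diag([2.0, 1e-8, 0.0])
g = np.array([1.0, 1.0, 1.0])

# Pseudo-inverse Newton step s = -H^+ g: the 1e-8 eigenvalue is
# inverted directly, so the update norm explodes to ~1e8.
s_pinv = -np.linalg.pinv(H) @ g
norm_pinv = np.linalg.norm(s_pinv)

# Fixed damping (H + lam*I)^{-1} bounds the step, but lam is a
# manually tuned hyperparameter with no principled choice.
lam = 1e-2
s_damped = -np.linalg.solve(H + lam * np.eye(3), g)
norm_damped = np.linalg.norm(s_damped)
```

The contrast between the two norms is the degeneracy problem in miniature: pseudo-inversion yields an excessively large update, and damping trades that for a tuning burden.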

Contribution

CuReNU and StoCuReNU unlearning algorithms based on cubic regularization

The authors introduce two novel unlearning algorithms that automatically determine the optimal damping factor for Newton unlearning using cubic regularization. CuReNU and StoCuReNU provide convergence guarantees to epsilon-second-order stationary points, addressing the Hessian degeneracy problem.
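As a rough sketch of the cubic-regularization idea (a generic small-scale implementation, not the authors' CuReNU code): the cubic subproblem min_s g^T s + (1/2) s^T H s + (sigma/3) ||s||^3 has a minimizer satisfying (H + lam*I) s = -g with lam = sigma * ||s||, so the damping factor lam falls out of a one-dimensional root-finding problem rather than manual tuning, and stays positive even when H is degenerate.

```python
import numpy as np

def cubic_newton_step(grad, hess, sigma=1.0, tol=1e-10):
    """One cubic-regularized Newton step: minimize
    m(s) = g^T s + 0.5 s^T H s + (sigma/3) ||s||^3.
    The minimizer satisfies (H + lam*I) s = -g with lam = sigma*||s||,
    giving a data-driven damping factor even for degenerate H."""
    eigvals, Q = np.linalg.eigh(hess)
    g_hat = Q.T @ grad

    def step_norm(lam):
        # ||s(lam)|| for s(lam) = -(H + lam*I)^{-1} g, in the eigenbasis
        return np.linalg.norm(g_hat / (eigvals + lam))

    # lam must exceed -lambda_min(H) and solve lam = sigma * ||s(lam)||;
    # sigma*||s(lam)|| - lam is decreasing, so bisection applies.
    lo = max(0.0, -eigvals.min()) + 1e-12
    hi = lo + 1.0
    while sigma * step_norm(hi) > hi:  # expand until a sign change brackets lam
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if sigma * step_norm(mid) > mid:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    s = -(Q @ (g_hat / (eigvals + lam)))
    return s, lam

# Demo on a degenerate Hessian (toy values): lam > 0 is found
# automatically, with no hand-tuned damping schedule.
H = np.diag([2.0, 0.5, 0.0])
g = np.ones(3)
s, lam = cubic_newton_step(g, H, sigma=1.0)
```

The eigendecomposition keeps the sketch simple; it is only practical for small models, which is exactly the limitation that motivates the stochastic, Hessian-free variant.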

Contribution

Scalable Hessian-free implementation with constant memory usage

The authors develop StoCuReNU as a scalable variant that uses Hessian-vector products instead of explicit Hessian storage, achieving constant memory usage of O(2d) compared to O(dn) in existing Hessian-free methods, while avoiding approximation errors.
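The Hessian-vector-product idea behind the constant-memory claim can be sketched as follows (a generic finite-difference HVP on an assumed toy ridge-regression loss, not the authors' implementation): only gradient evaluations and O(d) vectors are ever held in memory, never the d-by-d Hessian.

```python
import numpy as np

def hvp_finite_diff(grad_fn, w, v, eps=1e-5):
    """Hessian-vector product H(w) @ v without forming H:
    central finite difference of the gradient. Only O(d) vectors
    are kept in memory, mirroring the constant-memory idea behind
    Hessian-free second-order unlearning variants."""
    return (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2 * eps)

# Assumed toy loss for illustration: ridge regression
# L(w) = 0.5*||Xw - y||^2 + 0.5*reg*||w||^2, whose exact Hessian
# is X^T X + reg*I, so the approximation can be checked directly.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
y = rng.normal(size=50)
reg = 0.1
grad_fn = lambda w: X.T @ (X @ w - y) + reg * w

w = rng.normal(size=8)
v = rng.normal(size=8)
hv_approx = hvp_finite_diff(grad_fn, w, v)
hv_exact = (X.T @ X + reg * np.eye(8)) @ v
```

In deep-learning frameworks the same product is usually computed exactly via automatic differentiation (Pearlmutter's trick) rather than finite differences; the memory profile is the same in either case.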