Directional Influence Function: Estimating Training Data Influence in Constrained Learning

ICLR 2026 Conference Submission. Anonymous Authors.
Keywords: Directional Influence Function, Constrained Learning, Deep Learning, Sensitivity Analysis, Variational Inequality
Abstract:

Constrained learning is increasingly applied across domains to enforce explicit feasibility requirements arising from fairness, safety, robustness, regularization, and physics or logic constraints. Understanding how training samples influence the solution (e.g., the learned parameters) of constrained learning is crucial for interpretability and robustness. The classical influence function (IF) may become unreliable in constrained settings: data perturbations can reshape both the objective and the feasible region, leading to estimates that violate feasibility. In response, we propose the Directional Influence Function (DIF), a new estimator that explicitly incorporates the constraints into influence estimation. DIF formulates the optimality conditions of constrained learning as a variational inequality (VI) and analyzes how perturbing training data affects this VI. We validate DIF on constrained linear regression and demonstrate that it recovers leave-one-out retraining results, whereas IF and penalty-based IF exhibit significant bias. We further apply DIF to fairness-constrained CNNs, where DIF accurately predicts test-loss changes under data removal and aligns closely with actual retraining. Our results establish DIF as an efficient and reliable tool for data attribution in constrained learning.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces the Directional Influence Function (DIF) to estimate how training data perturbations affect solutions in constrained learning settings. It resides in the 'Influence Functions in Constrained Settings' leaf, which contains only two papers total (including this one). This places the work in a relatively sparse research direction within the broader taxonomy of 30 papers across influence estimation, data selection, fairness, and constrained optimization. The limited sibling count suggests that adapting influence functions to handle explicit constraints remains an underexplored niche.

The taxonomy reveals that neighboring branches address related but distinct challenges. 'Dynamics of Learning with Restricted Training Sets' (four papers) examines theoretical properties when training set size is proportional to dimensionality, while 'Instance-Level Fairness Impact Analysis' and 'Fairness-Constrained Classifier Training' focus on bias mitigation rather than general constraint handling. The 'Constrained Optimization and Learning' branch encompasses constraint learning and neural network methods but does not emphasize influence estimation. This structural separation indicates that DIF bridges a gap between classical influence analysis and the broader constrained optimization literature.

Among 30 candidates examined, the variational inequality formulation (Contribution 2) encountered two refutable candidates, suggesting some overlap with existing sensitivity analysis frameworks. In contrast, the core DIF estimator (Contribution 1) and the quadratic programming computation (Contribution 3) each examined 10 candidates with zero refutations, indicating less direct prior work within this limited search scope. The statistics imply that while the VI-based sensitivity framework connects to known techniques, the specific DIF construction and its computational approach appear more distinct among the top-30 semantic matches.

Based on the limited search scope of 30 candidates, the work appears to occupy a relatively novel position at the intersection of influence estimation and constrained learning. The sparse taxonomy leaf and low refutation counts for two of three contributions suggest incremental but meaningful extension of classical influence functions. However, the analysis does not cover exhaustive literature beyond top-K semantic retrieval, leaving open the possibility of additional relevant prior work in optimization theory or fairness-aware machine learning.

Taxonomy

Core-task Taxonomy Papers: 30
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 2

Research Landscape Overview

Core task: estimating training data influence in constrained learning. The field encompasses methods for understanding how individual training examples shape model behavior when learning is subject to constraints—whether those constraints arise from fairness requirements, optimization structure, or domain-specific restrictions. The taxonomy organizes this landscape into several main branches: Influence Estimation Methods and Theory develops foundational techniques (such as influence functions) to quantify data impact; Training Data Selection and Reduction focuses on pruning or prioritizing examples to improve efficiency; Fairness-Constrained Learning addresses scenarios where models must satisfy demographic parity or similar criteria; Constrained Optimization and Learning covers algorithmic frameworks that incorporate explicit constraints during training; and Application Domains illustrates how these ideas manifest in areas like medical imaging, astronomy, and conversational AI.

Representative works such as Dataset Pruning[2] and EraseDiff[1] exemplify data selection strategies, while studies like Training fairness-constrained classifiers[14] and Neural networks for constrained[5] highlight the interplay between constraints and learning dynamics.

A particularly active line of work explores how influence functions—originally designed for unconstrained settings—can be adapted when constraints are present, raising questions about computational tractability and theoretical guarantees. The Directional Influence Function[0] sits squarely within this branch, extending classical influence analysis to handle directional constraints and offering a principled way to assess data impact under such restrictions. This contrasts with nearby efforts like Right for Better Reasons[12], which emphasizes interpretability and causal reasoning in constrained contexts, and Understanding instance-level impact[13], which investigates per-example contributions more broadly.

Meanwhile, works in fairness-constrained learning (e.g., Training fairness-constrained classifiers[14]) and constrained optimization (e.g., Learning constraints and optimization[11]) tackle related but distinct challenges—balancing multiple objectives or embedding hard constraints—underscoring ongoing debates about scalability, approximation quality, and the trade-offs between influence estimation accuracy and computational cost.

Claimed Contributions

Directional Influence Function (DIF) for constrained learning

The authors introduce DIF, a novel influence estimation method designed specifically for constrained learning problems. Unlike classical influence functions, which can become unreliable under constraints, DIF uses directional derivatives to quantify how training data affects model solutions while respecting the feasibility requirements imposed by the constraints.

10 retrieved papers
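As background for what DIF generalizes, the classical influence function can be sketched in a few lines for unconstrained linear regression, where exact leave-one-out retraining is cheap enough to compare against. The setup below is our own illustration (synthetic data, squared loss), not the paper's code.

```python
import numpy as np

# Minimal sketch of the classical (unconstrained) influence function on
# synthetic linear regression. Illustrative only -- not the paper's DIF.
rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

theta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # full-data least squares
H = X.T @ X / n                                 # Hessian of the mean loss

# First-order estimate of the parameters after removing sample i:
#   theta_{-i}  ~=  theta_hat + H^{-1} grad_loss_i / n
i = 0
g_i = (X[i] @ theta_hat - y[i]) * X[i]          # gradient of sample i's loss
theta_loo_if = theta_hat + np.linalg.solve(H, g_i) / n

# Exact leave-one-out refit for comparison.
X_, y_ = np.delete(X, i, axis=0), np.delete(y, i)
theta_loo = np.linalg.solve(X_.T @ X_, X_.T @ y_)
```

In the unconstrained case this first-order estimate tracks exact retraining closely; the paper's point is that the same step can leave the feasible set once constraints are present, which is what DIF corrects.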
Variational inequality formulation and sensitivity analysis framework

The authors formalize data attribution for constrained learning by casting optimality conditions as a variational inequality and performing local sensitivity analysis. This VI-based framework enables systematic analysis of how data perturbations affect solutions in the presence of constraints.

10 retrieved papers
Can Refute
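To make the VI framing concrete, the optimality condition it rests on can be checked numerically: a feasible theta* minimizes f over a convex feasible set C exactly when grad f(theta*)^T (v - theta*) >= 0 for every v in C. The sketch below (our toy setup, not the paper's formulation) verifies this on nonnegative least squares solved by projected gradient descent.

```python
import numpy as np

# Illustrative VI check on nonnegative least squares, C = {theta >= 0}.
# Assumed toy setup, not the paper's formulation.
rng = np.random.default_rng(1)
A = rng.normal(size=(40, 4))
y = A @ np.array([1.0, 0.0, 2.0, 0.0]) + 0.05 * rng.normal(size=40)

def grad(theta):
    return A.T @ (A @ theta - y) / len(y)       # gradient of mean squared loss

# Projected gradient descent; projection onto C is clipping at zero.
theta = np.zeros(4)
for _ in range(5000):
    theta = np.maximum(theta - 0.05 * grad(theta), 0.0)

# Sample feasible points v >= 0 and record the VI gap grad^T (v - theta).
g = grad(theta)
vi_gaps = [g @ (v - theta) for v in np.abs(rng.normal(size=(200, 4)))]
```

At the solution every sampled gap is (numerically) nonnegative. Perturbing the training data perturbs this VI, and a DIF-style estimator follows from local sensitivity analysis of exactly such a condition.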
Efficient quadratic programming computation of DIF

The authors show that computing DIF reduces to solving a quadratic program, providing an efficient computational method. They also establish that DIF generalizes classical influence functions, recovering them as a special case when no constraints are active.

10 retrieved papers
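As a hedged sketch of the claimed reduction: under a single linear equality constraint, the quadratic program collapses to one linear KKT solve, which is enough to see the mechanics. The toy problem below (our construction, not the paper's code) estimates the effect of deleting one sample from an equality-constrained least-squares fit and compares it with exact leave-one-out retraining; unlike the classical IF step, the estimate stays exactly on the constraint.

```python
import numpy as np

# Toy sketch (assumed setup): influence of removing one sample in least
# squares subject to sum(theta) = 1, via the KKT system. With only equality
# constraints the QP reduces to this single linear solve.
rng = np.random.default_rng(2)
n, dim = 60, 4
X = rng.normal(size=(n, dim))
y = X @ np.array([1.0, -1.0, 0.5, 2.0]) + 0.1 * rng.normal(size=n)
c, t = np.ones(dim), 1.0                        # constraint: c^T theta = t

def constrained_fit(Xa, ya):
    """Minimize mean squared loss subject to c^T theta = t (KKT solve)."""
    H = Xa.T @ Xa / len(ya)
    K = np.block([[H, c[:, None]], [c[None, :], np.zeros((1, 1))]])
    rhs = np.concatenate([Xa.T @ ya / len(ya), [t]])
    return np.linalg.solve(K, rhs)[:dim]

theta = constrained_fit(X, y)

# Influence step: solve the same KKT system with sample i's loss gradient on
# the right-hand side; the step is tangent to the constraint (c^T step = 0).
i = 0
g_i = (X[i] @ theta - y[i]) * X[i]
K = np.block([[X.T @ X / n, c[:, None]], [c[None, :], np.zeros((1, 1))]])
step = np.linalg.solve(K, np.concatenate([g_i, [0.0]]))[:dim]
theta_if = theta + step / n                     # influence estimate

theta_exact = constrained_fit(np.delete(X, i, axis=0), np.delete(y, i))
```

Feasibility is preserved by construction (the step has zero inner product with c), and the estimate lands closer to the exact refit than the original fit does, consistent with the contribution's claim that the classical IF is recovered when no constraint binds the perturbation direction.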

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: Directional Influence Function (DIF) for constrained learning

Contribution: Variational inequality formulation and sensitivity analysis framework

Contribution: Efficient quadratic programming computation of DIF