SPICE: Submodular Penalized Information–Conflict Selection for Efficient Large Language Model Training

ICLR 2026 Conference Submission (Anonymous Authors)
Keywords: Data selection; Submodular; Log-determinant Fisher information; Instruction tuning
Abstract:

Information-based data selection for instruction tuning is compelling: maximizing the log-determinant of the Fisher information yields a monotone submodular objective, enabling greedy algorithms to achieve a (1 - 1/e) approximation under a cardinality budget. In practice, however, we identify that alleviating gradient conflicts (misalignment between per-sample gradients) is a key factor in slowing the decay of marginal log-determinant information gains, thereby preventing significant loss of information. We formalize this via an ε-decomposition that quantifies the deviation from ideal submodularity as a function of conflict statistics, yielding data-dependent approximation factors that tighten as conflicts diminish. Guided by this analysis, we propose SPICE, a conflict-aware selector that maximizes information while penalizing misalignment, and that supports early stopping and proxy models for efficiency. Empirically, SPICE selects subsets with higher log-determinant information than existing criteria, and these informational gains translate into performance improvements: across 8 benchmarks with LLaMA2-7B and Qwen2-7B, SPICE uses only 10% of the data yet matches or exceeds 6 methods, including full-data tuning, at substantially lower training cost. Code is available at https://anonymous.4open.science/r/SPICE-6DF7/README.md.
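The greedy log-determinant selection the abstract describes can be sketched as follows. This is a minimal illustration on synthetic gradient features, not the paper's implementation; the pool size, dimension, budget, and the regularizer `delta` are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 40, 8, 5                       # pool size, gradient dim, selection budget
G = rng.normal(size=(n, d))              # synthetic per-sample gradient features

def logdet_info(S, delta=1e-2):
    """f(S) = log det(delta*I + sum_{i in S} g_i g_i^T): monotone submodular."""
    F = delta * np.eye(d)
    for i in S:
        F += np.outer(G[i], G[i])
    return np.linalg.slogdet(F)[1]

def greedy_select(k):
    """Standard greedy maximization: (1 - 1/e)-approximation under a cardinality budget."""
    S = []
    for _ in range(k):
        base = logdet_info(S)
        gains = [(-np.inf if i in S else logdet_info(S + [i]) - base)
                 for i in range(n)]
        S.append(int(np.argmax(gains)))
    return S

S = greedy_select(k)
print("selected:", S, "f(S) =", round(float(logdet_info(S)), 3))
```

Because f is submodular, the sequence of greedy marginal gains is nonincreasing; conflict-heavy pools make these gains decay faster, which is the phenomenon the abstract targets.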

Disclaimer
This report is AI-GENERATED using large language models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes SPICE, a conflict-aware data selector that maximizes Fisher information while penalizing gradient misalignment during instruction tuning. It resides in the 'Gradient-Based Influence and Information Metrics' leaf, which contains five papers total. This leaf sits within the broader 'Model-Intrinsic Quality Assessment' branch, indicating a moderately populated research direction focused on using internal model states to assess data utility. The taxonomy shows this is an active but not overcrowded area, with sibling leaves addressing uncertainty metrics and training trajectory analysis.

The taxonomy reveals neighboring work in 'Uncertainty and Consistency Metrics' (three papers) and 'Training Trajectory and Weight Dynamics' (two papers), both under the same parent branch. The paper's focus on gradient conflicts distinguishes it from uncertainty-based approaches like self-consistency probing, while its information-theoretic framing connects to influence function methods in sibling papers. The broader 'Selection Criteria and Quality Metrics' branch also includes external scoring methods and diversity metrics, suggesting the field balances model-intrinsic signals with heuristic or coverage-based strategies.

Among four candidates examined across three contributions, no clear refutations emerged. The ε-decomposition framework linking conflicts to information decay examined two candidates with no refutable overlap. The SPICE algorithm itself examined zero candidates, and the data-dependent approximation guarantees examined two candidates without finding prior work that directly anticipates this formulation. Given the limited search scope—only four candidates total—these statistics suggest the specific combination of conflict-aware selection and submodularity analysis may be relatively unexplored, though the small sample size precludes strong conclusions about absolute novelty.

Based on top-four semantic matches, the work appears to occupy a distinct niche within gradient-based selection methods. The conflict-aware framing and ε-decomposition analysis do not appear in the examined candidates, though the broader use of Fisher information and influence functions is well-established in the leaf's sibling papers. The analysis covers a narrow slice of the literature; a more exhaustive search might reveal closer precedents in optimization theory or active learning domains outside the instruction tuning context.

Taxonomy

- Core-task Taxonomy Papers: 50
- Claimed Contributions: 3
- Contribution Candidate Papers Compared: 4
- Refutable Papers: 0

Research Landscape Overview

Core task: Data selection for efficient instruction tuning of large language models. The field has organized itself around several complementary dimensions. At the highest level, researchers distinguish between Selection Criteria and Quality Metrics (which define what makes an instruction example valuable), Selection Algorithms and Frameworks (which operationalize these criteria at scale), and Specialized Selection Contexts (addressing domain-specific or multimodal settings). Parallel branches examine Robustness and Security in Data Selection (guarding against adversarial or poisoned instructions), Efficient Training Techniques and Optimization (reducing computational overhead), and Evaluation and Benchmarking (measuring the impact of selection strategies). Application-Specific Instruction Tuning captures work tailored to particular downstream tasks or modalities.

Within Selection Criteria and Quality Metrics, a dense cluster of methods relies on Model-Intrinsic Quality Assessment, using gradient-based influence and information metrics to score examples by their expected training utility, as seen in works like LESS[17] and In2Core[10]. A particularly active line of research focuses on gradient-based influence and information metrics, where methods estimate how much each instruction contributes to model updates or generalization. SPICE[0] falls squarely within this branch, leveraging influence functions to prioritize high-impact examples during instruction tuning. Nearby works such as LESS[17] and Uncertainty-Aware Influence[33] similarly exploit gradient information but differ in their treatment of uncertainty or approximation strategies. In contrast, SCAR[3] and Balanced Learning Selection[36] emphasize balancing coverage across diverse skills or concepts, highlighting a trade-off between influence-driven selection and representativeness.
These gradient-centric approaches must also contend with computational costs and potential vulnerabilities to poisoning attacks, themes explored in the Robustness and Security branch. Overall, SPICE[0] exemplifies the trend toward principled, model-intrinsic scoring mechanisms that aim to distill large instruction pools into compact, high-quality subsets without sacrificing downstream performance.

Claimed Contributions

ε-decomposition framework linking gradient conflicts to the decay of marginal information gains

The authors introduce a novel theoretical framework that decomposes marginal information gains into a modular baseline and a perturbation term. They prove that the perturbation magnitude, which governs how quickly marginal gains decay, is bounded by gradient inner products, thereby quantitatively connecting gradient conflicts to submodular data selection theory.

2 retrieved papers
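The intuition behind this decomposition can be illustrated numerically (this is a toy construction, not the paper's proof): for f(S) = log det(δI + Σ g g^T), the marginal gain of a gradient g_i given S equals its modular singleton gain exactly when g_i has zero inner product with every selected gradient, and the perturbation (the gap below that baseline) appears once inner products with S are nonzero.

```python
import numpy as np

d, delta = 6, 1.0

def marginal_gain(gi, selected):
    """f(S + i) - f(S) for f(S) = log det(delta*I + sum g g^T),
    computed via the rank-one identity log(1 + g_i^T F(S)^{-1} g_i)."""
    F = delta * np.eye(d)
    for g in selected:
        F += np.outer(g, g)
    return float(np.log1p(gi @ np.linalg.solve(F, gi)))

g = np.eye(d)[0]
baseline = marginal_gain(g, [])                    # modular term: singleton gain
orthogonal = [np.eye(d)[j] for j in (1, 2, 3)]     # zero inner products: no conflict
aligned = [g, g + 0.1 * np.eye(d)[1]]              # large inner products with g

print(baseline, marginal_gain(g, orthogonal), marginal_gain(g, aligned))
```

With the orthogonal set the gain stays exactly at the modular baseline; with the aligned set it drops, mirroring the claim that the perturbation term is bounded by gradient inner products.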
SPICE algorithm for conflict-aware data selection

The authors design SPICE (Submodular Penalized Information–Conflict sElection), a conflict-aware greedy selection algorithm that maximizes log-determinant Fisher information while adaptively penalizing gradient conflicts. The method incorporates data-driven early stopping and achieves an efficient selection complexity of O(k|D|d).

0 retrieved papers
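A hedged sketch of the kind of conflict-penalized greedy loop this description suggests. The penalty form (mean of negative inner products with the selected set) and the weight `lam` are illustrative assumptions, not the paper's exact scoring rule.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 30, 8, 5
delta, lam = 1e-2, 0.5                   # regularizer and penalty weight (assumed)
G = rng.normal(size=(n, d))              # synthetic per-sample gradients

def conflict_penalized_greedy(G, k, lam):
    S, F = [], delta * np.eye(d)
    for _ in range(k):
        best, best_score = -1, -np.inf
        for i in range(n):
            if i in S:
                continue
            # log-det marginal gain via the rank-one determinant lemma
            info = np.log1p(G[i] @ np.linalg.solve(F, G[i]))
            # illustrative conflict penalty: negative inner products with S
            conflict = np.mean([max(0.0, -(G[i] @ G[j])) for j in S]) if S else 0.0
            score = info - lam * conflict
            if score > best_score:
                best, best_score = i, score
        S.append(best)
        F += np.outer(G[best], G[best])   # rank-one Fisher update
    return S

print("selected:", conflict_penalized_greedy(G, k, lam))
```

Maintaining F^{-1} with Sherman–Morrison rank-one updates instead of a fresh solve per candidate would bring each score to O(d) work, which is what makes a complexity near the stated O(k|D|d) plausible.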
Data-dependent approximation guarantees via curvature control

The authors establish formal bounds showing that controlling gradient conflicts (via the ε-decomposition) reduces submodular curvature, thereby improving greedy approximation guarantees beyond the standard (1 - 1/e) factor. This provides tighter, data-dependent approximation factors that improve as conflicts diminish.

2 retrieved papers
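For reference, the classical curvature-dependent refinement of the greedy bound that this contribution builds on (Conforti and Cornuéjols, 1984) can be stated as:

```latex
% Total curvature of a monotone submodular f on ground set \mathcal{D}, f(\emptyset) = 0:
c \;=\; 1 \;-\; \min_{j \in \mathcal{D}}
        \frac{f(\mathcal{D}) - f(\mathcal{D} \setminus \{j\})}{f(\{j\})}
\;\in\; [0, 1]

% Greedy solution S_g under a cardinality budget k versus the optimum S^\star:
f(S_g) \;\ge\; \frac{1}{c}\bigl(1 - e^{-c}\bigr)\, f(S^\star)
```

At c = 1 this recovers the standard (1 - 1/e) factor, and as c → 0 it tends to 1; the paper's claim is that reducing gradient conflicts lowers the effective curvature, which is what makes the resulting guarantee data-dependent and tighter.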

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

ε-decomposition framework linking gradient conflicts to the decay of marginal information gains

The authors introduce a novel theoretical framework that decomposes marginal information gains into a modular baseline and a perturbation term. They prove that the perturbation magnitude, which governs how quickly marginal gains decay, is bounded by gradient inner products, thereby quantitatively connecting gradient conflicts to submodular data selection theory.

Contribution

SPICE algorithm for conflict-aware data selection

The authors design SPICE (Submodular Penalized Information–Conflict sElection), a conflict-aware greedy selection algorithm that maximizes log-determinant Fisher information while adaptively penalizing gradient conflicts. The method incorporates data-driven early stopping and achieves an efficient selection complexity of O(k|D|d).

Contribution

Data-dependent approximation guarantees via curvature control

The authors establish formal bounds showing that controlling gradient conflicts (via the ε-decomposition) reduces submodular curvature, thereby improving greedy approximation guarantees beyond the standard (1 - 1/e) factor. This provides tighter, data-dependent approximation factors that improve as conflicts diminish.