Abstract:

Foundation Vision-Language Models (VLMs) excel across benchmarks yet remain vulnerable to adversarial attacks. While adversarial fine-tuning improves robustness, attaining a desirable clean–robust performance trade-off typically requires costly hyperparameter searches with multiple retraining runs. A promising alternative is to merge task vectors (i.e., parameter displacements from pre-trained models) to balance accuracy and robustness without retraining. However, we find that naive task-vector merging produces a near-linear trade-off, as it equally weights all coordinates and fails to distinguish weights that aid both objectives from those that create conflicts. To overcome this limitation, we propose a prediction stability-aware merging framework that composes task vectors from off-the-shelf naturally and robustly fine-tuned VLMs. Our key insight is that prediction stability serves as a proxy for cross-objective compatibility, enabling us to favor perturbation-invariant parameters while attenuating those with high cross-objective impact. Specifically, we estimate per-parameter stability from gradients under both objectives, building complementary masks that retain jointly stable coordinates while suppressing counterpart-sensitive ones. We further refine these masks along adversarial parameter trajectories, with steps weighted by a prediction-sensitivity index. Our theoretical analysis shows that the masks provably contract first-order cross-objective interference, and the prediction criticality index tracks curvature, biasing the merge toward flatter minima and better generalization. Extensive experiments across benchmarks and scenarios demonstrate our method consistently achieves superior clean–robust trade-offs over prior approaches, with the learned balance transferring effectively to downstream tasks.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a prediction stability-aware framework (PISTOLE) for merging task vectors from naturally and robustly fine-tuned vision-language models, aiming to balance clean accuracy and adversarial robustness without retraining. Within the taxonomy, it resides in the 'Accuracy-Robustness Trade-off Optimization' leaf under 'Task Vector Merging Methodologies'. This leaf contains only two papers total, including the original work, indicating a relatively sparse and emerging research direction. The sibling paper addresses similar accuracy-robustness concerns but through different merging strategies, suggesting this specific sub-area is still in early development.

The taxonomy reveals that the broader 'Task Vector Merging Methodologies' branch encompasses three leaves: the original paper's leaf (accuracy-robustness focus), 'General Multi-Task Vector Merging' (three papers on multi-task learning without robustness emphasis), and 'Domain-Specific Applications' (two papers on robotics and unlearning). Neighboring branches include 'Multimodal Fusion Architectures' (six papers across emotion recognition, medical diagnosis, and instance recognition) and a survey paper. The original work diverges from general multi-task merging by explicitly targeting adversarial robustness, and from fusion architectures by operating through parameter-level task-vector arithmetic rather than learned fusion mechanisms.

Among thirteen candidates examined across three contributions, none were found to clearly refute the proposed methods. The PISTOLE framework examined two candidates with zero refutable matches; the gradient-informed stability masks examined nine candidates with zero refutable matches; and the theoretical analysis examined two candidates with zero refutable matches. This limited search scope—thirteen papers from semantic search and citation expansion—suggests that within the examined literature, the specific combination of prediction stability proxies, complementary masking, and cross-objective interference analysis appears relatively unexplored. However, the small candidate pool means substantial prior work may exist beyond this search radius.

Given the sparse taxonomy leaf (two papers total) and the absence of refutable candidates among thirteen examined works, the contributions appear to occupy a relatively novel position within the limited search scope. The framework's emphasis on prediction stability as a proxy for cross-objective compatibility distinguishes it from sibling work, though the small scale of the literature search (thirteen candidates) and the emerging nature of this sub-area mean that a more exhaustive review could reveal additional overlapping methods or theoretical foundations.

Taxonomy

Core-task Taxonomy Papers: 11
Claimed Contributions: 3
Contribution Candidate Papers Compared: 13
Refutable Papers: 0

Research Landscape Overview

Core task: Balancing accuracy and robustness in vision-language models through task vector merging. The field centers on combining separately fine-tuned model parameters to achieve multiple objectives simultaneously, rather than training from scratch for each new capability. The taxonomy reveals three main branches: Task Vector Merging Methodologies, which explores algorithmic strategies for combining task-specific weight deltas; Multimodal Fusion Architectures, which addresses how to integrate information across vision and language modalities; and Surveys and Comprehensive Reviews, which synthesize emerging trends and open challenges.

Within Task Vector Merging Methodologies, a particularly active sub-area focuses on Accuracy-Robustness Trade-off Optimization, where methods like AdaMerging[3] and Task Simplex Arithmetic[5] seek principled ways to interpolate between high-accuracy and high-robustness solutions. Meanwhile, works such as Merging Multimodal Models[4] and MBFusion[6] illustrate how fusion architectures can be designed or adapted to support merging operations, and Model Merging Survey[8] provides a broader perspective on the landscape.

A central tension across these branches is whether to prioritize task-specific performance or generalization under distribution shift, and how to navigate this trade-off without expensive retraining. Some lines of work, such as Remedy[1] and DisTaC[9], emphasize stability and robustness by carefully controlling the merging coefficients or disentangling task-specific features. Others, like MergeVLA[10], extend merging ideas to embodied or vision-language-action settings, broadening the scope beyond static image-text tasks.

The original paper, Stability-Aware Task Vector[0], sits squarely within the Accuracy-Robustness Trade-off Optimization cluster, sharing methodological kinship with DisTaC[9] in its attention to stability during merging. Compared to AdaMerging[3], which adapts merging weights in a data-driven manner, Stability-Aware Task Vector[0] appears to place greater emphasis on ensuring that the merged model remains robust across diverse test conditions, reflecting an ongoing effort to make task vector arithmetic both effective and reliable in real-world deployments.
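For readers unfamiliar with task-vector arithmetic, the following minimal sketch shows the naive, uniformly weighted merge that the abstract criticizes for yielding a near-linear trade-off: every coordinate receives the same coefficient. All parameter names and values here are illustrative, not taken from any of the papers above.

```python
import numpy as np

def task_vector(finetuned, pretrained):
    """Task vector = parameter displacement from the pre-trained model."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def naive_merge(pretrained, tv_clean, tv_robust, alpha):
    """Uniformly weighted merge: the same alpha is applied to every
    coordinate, which is what produces the near-linear clean-robust
    trade-off as alpha sweeps from 0 to 1."""
    return {k: pretrained[k] + alpha * tv_clean[k] + (1.0 - alpha) * tv_robust[k]
            for k in pretrained}

# Toy two-parameter example (hypothetical values).
pre = {"w": np.array([0.0, 0.0])}
clean = {"w": np.array([1.0, 0.2])}    # naturally fine-tuned weights
robust = {"w": np.array([0.1, 1.0])}   # adversarially fine-tuned weights

tv_c = task_vector(clean, pre)
tv_r = task_vector(robust, pre)
merged = naive_merge(pre, tv_c, tv_r, alpha=0.5)
```

Sweeping `alpha` traces a straight line between the two endpoints in parameter space, which is exactly the behavior per-coordinate masking is meant to improve on.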

Claimed Contributions

Prediction stability-aware task vector merging framework (PISTOLE)

The authors introduce PISTOLE, a novel framework that merges task vectors from naturally and adversarially fine-tuned vision-language models without retraining. The method uses gradient-informed stability masks to selectively combine parameters that are stable under both objectives while suppressing those that create conflicts, achieving superior clean-robust trade-offs.

2 retrieved papers
Complementary gradient-informed stability masks with adversarial parameter trajectories

The method constructs complementary masks based on per-parameter stability estimated from gradient magnitudes under natural and robust objectives. These masks are refined by tracing adversarial parameter trajectories, with steps weighted by a prediction-sensitivity index to capture local loss-parameter geometry.
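A minimal sketch of the masking idea described above, assuming gradient magnitude serves as the stability proxy; the threshold, names, and mask rule are illustrative simplifications, not the paper's exact procedure.

```python
import numpy as np

def complementary_masks(grad_nat, grad_rob, tau=0.1):
    """Illustrative sketch: treat a small per-coordinate gradient magnitude
    as a proxy for prediction stability. A coordinate of the natural task
    vector is kept only where the *robust* objective is insensitive to it,
    and vice versa, so counterpart-sensitive weights are suppressed."""
    mask_nat = (np.abs(grad_rob) < tau).astype(float)  # safe w.r.t. robustness
    mask_rob = (np.abs(grad_nat) < tau).astype(float)  # safe w.r.t. clean acc.
    return mask_nat, mask_rob

def masked_merge(theta_pre, tv_nat, tv_rob, mask_nat, mask_rob, alpha=0.5):
    """Merge with per-coordinate masks instead of one global coefficient."""
    return theta_pre + alpha * mask_nat * tv_nat + (1.0 - alpha) * mask_rob * tv_rob

# Hypothetical gradient magnitudes under each objective for three weights:
# weight 0 is jointly stable, weight 1 is robust-sensitive, weight 2 is
# clean-sensitive.
g_nat = np.array([0.01, 0.90, 0.02])
g_rob = np.array([0.02, 0.03, 0.80])
m_nat, m_rob = complementary_masks(g_nat, g_rob)
```

The jointly stable coordinate survives in both masks, while each counterpart-sensitive coordinate is zeroed out of exactly one task vector.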

9 retrieved papers
Theoretical analysis of cross-objective interference contraction and curvature tracking

The authors provide theoretical guarantees demonstrating that their complementary masks provably reduce first-order interference between conflicting objectives. They also show that their prediction criticality index tracks Hessian trace (curvature), steering the merge toward flatter regions that generalize better.
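To make these two claims concrete, one plausible formalization is sketched below; the notation is illustrative and may differ from the paper's.

```latex
% First-order cross-objective interference: the masked update from one
% objective should have a small inner product with the gradient of the other.
% m_nat, m_rob are the complementary masks; tau_nat, tau_rob the task vectors.
I(\theta) =
\big\langle m_{\mathrm{nat}} \odot \tau_{\mathrm{nat}},\,
            \nabla_\theta \mathcal{L}_{\mathrm{rob}}(\theta) \big\rangle
+ \big\langle m_{\mathrm{rob}} \odot \tau_{\mathrm{rob}},\,
              \nabla_\theta \mathcal{L}_{\mathrm{nat}}(\theta) \big\rangle

% "Contraction" would then mean the masks shrink this term relative to
% naive merging: |I_{\mathrm{masked}}| \le \rho \, |I_{\mathrm{naive}}|
% for some \rho < 1.

% Curvature tracking: for an isotropic Gaussian perturbation
% \epsilon \sim \mathcal{N}(0, I), a second-order expansion gives
\mathbb{E}_{\epsilon}\!\left[\mathcal{L}(\theta + \sigma\epsilon)\right]
\approx \mathcal{L}(\theta) + \frac{\sigma^2}{2}\,\mathrm{tr}\!\big(H(\theta)\big),

% so an index correlated with tr(H) can bias the merge toward flatter
% minima, which are commonly associated with better generalization.
```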

2 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Prediction stability-aware task vector merging framework (PISTOLE)

The authors introduce PISTOLE, a novel framework that merges task vectors from naturally and adversarially fine-tuned vision-language models without retraining. The method uses gradient-informed stability masks to selectively combine parameters that are stable under both objectives while suppressing those that create conflicts, achieving superior clean-robust trade-offs.

Contribution

Complementary gradient-informed stability masks with adversarial parameter trajectories

The method constructs complementary masks based on per-parameter stability estimated from gradient magnitudes under natural and robust objectives. These masks are refined by tracing adversarial parameter trajectories, with steps weighted by a prediction-sensitivity index to capture local loss-parameter geometry.

Contribution

Theoretical analysis of cross-objective interference contraction and curvature tracking

The authors provide theoretical guarantees demonstrating that their complementary masks provably reduce first-order interference between conflicting objectives. They also show that their prediction criticality index tracks Hessian trace (curvature), steering the merge toward flatter regions that generalize better.