Abstract:

Foundation Vision-Language Models (VLMs) excel across benchmarks yet remain vulnerable to adversarial attacks. While adversarial fine-tuning improves robustness, attaining a desirable clean–robust performance trade-off typically requires costly hyperparameter searches with multiple retraining runs. A promising alternative is to merge task vectors (i.e., parameter displacements from pre-trained models) to balance accuracy and robustness without retraining. However, we find that naive task-vector merging produces a near-linear trade-off, as it equally weights all coordinates and fails to distinguish weights that aid both objectives from those that create conflicts. To overcome this limitation, we propose a prediction stability-aware merging framework that composes task vectors from off-the-shelf naturally and robustly fine-tuned VLMs. Our key insight is that prediction stability serves as a proxy for cross-objective compatibility, enabling us to favor perturbation-invariant parameters while attenuating those with high cross-objective impact. Specifically, we estimate per-parameter stability from gradients under both objectives, building complementary masks that retain jointly stable coordinates while suppressing counterpart-sensitive ones. We further refine these masks along adversarial parameter trajectories, with steps weighted by a prediction-sensitivity index. Our theoretical analysis shows that the masks provably contract first-order cross-objective interference, and the prediction criticality index tracks curvature, biasing the merge toward flatter minima and better generalization. Extensive experiments across benchmarks and scenarios demonstrate our method consistently achieves superior clean–robust trade-offs over prior approaches, with the learned balance transferring effectively to downstream tasks.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a prediction stability-aware framework (PISTOLE) for merging task vectors from naturally and robustly fine-tuned vision-language models, aiming to balance clean accuracy and adversarial robustness without retraining. Within the taxonomy, it resides in the 'Accuracy-Robustness Trade-off Optimization' leaf under 'Task Vector Merging Methodologies'. This leaf contains only two papers total, including the original work, indicating a relatively sparse and emerging research direction. The sibling paper addresses similar accuracy-robustness concerns but through different merging strategies, suggesting this specific sub-area is still in early development.

The taxonomy reveals that the broader 'Task Vector Merging Methodologies' branch encompasses three leaves: the original paper's leaf (accuracy-robustness focus), 'General Multi-Task Vector Merging' (three papers on multi-task learning without robustness emphasis), and 'Domain-Specific Applications' (two papers on robotics and unlearning). Neighboring branches include 'Multimodal Fusion Architectures' (six papers across emotion recognition, medical diagnosis, and instance recognition) and a survey paper. The original work diverges from general multi-task merging by explicitly targeting adversarial robustness, and from fusion architectures by operating through parameter-level task-vector arithmetic rather than learned fusion mechanisms.

Among thirteen candidates examined across three contributions, none were found to clearly refute the proposed methods. The PISTOLE framework examined two candidates with zero refutable matches; the gradient-informed stability masks examined nine candidates with zero refutable matches; and the theoretical analysis examined two candidates with zero refutable matches. This limited search scope—thirteen papers from semantic search and citation expansion—suggests that within the examined literature, the specific combination of prediction stability proxies, complementary masking, and cross-objective interference analysis appears relatively unexplored. However, the small candidate pool means substantial prior work may exist beyond this search radius.

Given the sparse taxonomy leaf (two papers total) and the absence of refutable candidates among thirteen examined works, the contributions appear to occupy a relatively novel position within the limited search scope. The framework's emphasis on prediction stability as a proxy for cross-objective compatibility distinguishes it from sibling work, though the small scale of the literature search (thirteen candidates) and the emerging nature of this sub-area mean that a more exhaustive review could reveal additional overlapping methods or theoretical foundations.

Taxonomy

Core-task Taxonomy Papers: 11
Claimed Contributions: 3
Contribution Candidate Papers Compared: 13
Refutable Papers: 0

Research Landscape Overview

Core task: Balancing accuracy and robustness in vision-language models through task vector merging. The field centers on combining separately fine-tuned model parameters to achieve multiple objectives simultaneously, rather than training from scratch for each new capability. The taxonomy reveals three main branches: Task Vector Merging Methodologies, which explores algorithmic strategies for combining task-specific weight deltas; Multimodal Fusion Architectures, which addresses how to integrate information across vision and language modalities; and Surveys and Comprehensive Reviews, which synthesize emerging trends and open challenges.

Within Task Vector Merging Methodologies, a particularly active sub-area focuses on Accuracy-Robustness Trade-off Optimization, where methods like AdaMerging[3] and Task Simplex Arithmetic[5] seek principled ways to interpolate between high-accuracy and high-robustness solutions. Meanwhile, works such as Merging Multimodal Models[4] and MBFusion[6] illustrate how fusion architectures can be designed or adapted to support merging operations, and Model Merging Survey[8] provides a broader perspective on the landscape.

A central tension across these branches is whether to prioritize task-specific performance or generalization under distribution shift, and how to navigate this trade-off without expensive retraining. Some lines of work, such as Remedy[1] and DisTaC[9], emphasize stability and robustness by carefully controlling the merging coefficients or disentangling task-specific features. Others, like MergeVLA[10], extend merging ideas to embodied or vision-language-action settings, broadening the scope beyond static image-text tasks.

The original paper, Stability-Aware Task Vector[0], sits squarely within the Accuracy-Robustness Trade-off Optimization cluster, sharing methodological kinship with DisTaC[9] in its attention to stability during merging. Compared to AdaMerging[3], which adapts merging weights in a data-driven manner, Stability-Aware Task Vector[0] appears to place greater emphasis on ensuring that the merged model remains robust across diverse test conditions, reflecting an ongoing effort to make task vector arithmetic both effective and reliable in real-world deployments.
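For readers unfamiliar with task-vector arithmetic, the following minimal sketch shows the naive, uniformly weighted merge that the abstract criticizes for yielding a near-linear trade-off: every coordinate receives the same coefficient. All parameter names and values here are illustrative, not taken from any of the papers above.

```python
import numpy as np

def task_vector(finetuned, pretrained):
    """Task vector = parameter displacement from the pre-trained model."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def naive_merge(pretrained, tv_clean, tv_robust, alpha):
    """Uniformly weighted merge: the same alpha is applied to every
    coordinate, which is what produces the near-linear clean-robust
    trade-off as alpha sweeps from 0 to 1."""
    return {k: pretrained[k] + alpha * tv_clean[k] + (1.0 - alpha) * tv_robust[k]
            for k in pretrained}

# Toy two-parameter example (hypothetical values).
pre = {"w": np.array([0.0, 0.0])}
clean = {"w": np.array([1.0, 0.2])}    # naturally fine-tuned weights
robust = {"w": np.array([0.1, 1.0])}   # adversarially fine-tuned weights

tv_c = task_vector(clean, pre)
tv_r = task_vector(robust, pre)
merged = naive_merge(pre, tv_c, tv_r, alpha=0.5)
```

Sweeping `alpha` traces a straight line between the two endpoints in parameter space, which is exactly the behavior per-coordinate masking is meant to improve on.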

Claimed Contributions

Prediction stability-aware task vector merging framework (PISTOLE)

The authors introduce PISTOLE, a novel framework that merges task vectors from naturally and adversarially fine-tuned vision-language models without retraining. The method uses gradient-informed stability masks to selectively combine parameters that are stable under both objectives while suppressing those that create conflicts, achieving superior clean-robust trade-offs.

2 retrieved papers
Complementary gradient-informed stability masks with adversarial parameter trajectories

The method constructs complementary masks based on per-parameter stability estimated from gradient magnitudes under natural and robust objectives. These masks are refined by tracing adversarial parameter trajectories, with steps weighted by a prediction-sensitivity index to capture local loss-parameter geometry.
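A minimal sketch of the masking idea described above, assuming gradient magnitude serves as the stability proxy; the threshold, names, and mask rule are illustrative simplifications, not the paper's exact procedure.

```python
import numpy as np

def complementary_masks(grad_nat, grad_rob, tau=0.1):
    """Illustrative sketch: treat a small per-coordinate gradient magnitude
    as a proxy for prediction stability. A coordinate of the natural task
    vector is kept only where the *robust* objective is insensitive to it,
    and vice versa, so counterpart-sensitive weights are suppressed."""
    mask_nat = (np.abs(grad_rob) < tau).astype(float)  # safe w.r.t. robustness
    mask_rob = (np.abs(grad_nat) < tau).astype(float)  # safe w.r.t. clean acc.
    return mask_nat, mask_rob

def masked_merge(theta_pre, tv_nat, tv_rob, mask_nat, mask_rob, alpha=0.5):
    """Merge with per-coordinate masks instead of one global coefficient."""
    return theta_pre + alpha * mask_nat * tv_nat + (1.0 - alpha) * mask_rob * tv_rob

# Hypothetical gradient magnitudes under each objective for three weights:
# weight 0 is jointly stable, weight 1 is robust-sensitive, weight 2 is
# clean-sensitive.
g_nat = np.array([0.01, 0.90, 0.02])
g_rob = np.array([0.02, 0.03, 0.80])
m_nat, m_rob = complementary_masks(g_nat, g_rob)
```

The jointly stable coordinate survives in both masks, while each counterpart-sensitive coordinate is zeroed out of exactly one task vector.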

9 retrieved papers
Theoretical analysis of cross-objective interference contraction and curvature tracking

The authors provide theoretical guarantees demonstrating that their complementary masks provably reduce first-order interference between conflicting objectives. They also show that their prediction criticality index tracks Hessian trace (curvature), steering the merge toward flatter regions that generalize better.
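To make these two claims concrete, one plausible formalization is sketched below; the notation is illustrative and may differ from the paper's.

```latex
% First-order cross-objective interference: the masked update from one
% objective should have a small inner product with the gradient of the other.
% m_nat, m_rob are the complementary masks; tau_nat, tau_rob the task vectors.
I(\theta) =
\big\langle m_{\mathrm{nat}} \odot \tau_{\mathrm{nat}},\,
            \nabla_\theta \mathcal{L}_{\mathrm{rob}}(\theta) \big\rangle
+ \big\langle m_{\mathrm{rob}} \odot \tau_{\mathrm{rob}},\,
              \nabla_\theta \mathcal{L}_{\mathrm{nat}}(\theta) \big\rangle

% "Contraction" would then mean the masks shrink this term relative to
% naive merging: |I_{\mathrm{masked}}| \le \rho \, |I_{\mathrm{naive}}|
% for some \rho < 1.

% Curvature tracking: for an isotropic Gaussian perturbation
% \epsilon \sim \mathcal{N}(0, I), a second-order expansion gives
\mathbb{E}_{\epsilon}\!\left[\mathcal{L}(\theta + \sigma\epsilon)\right]
\approx \mathcal{L}(\theta) + \frac{\sigma^2}{2}\,\mathrm{tr}\!\big(H(\theta)\big),

% so an index correlated with tr(H) can bias the merge toward flatter
% minima, which are commonly associated with better generalization.
```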

2 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Prediction stability-aware task vector merging framework (PISTOLE)

The authors introduce PISTOLE, a novel framework that merges task vectors from naturally and adversarially fine-tuned vision-language models without retraining. The method uses gradient-informed stability masks to selectively combine parameters that are stable under both objectives while suppressing those that create conflicts, achieving superior clean-robust trade-offs.

Contribution

Complementary gradient-informed stability masks with adversarial parameter trajectories

The method constructs complementary masks based on per-parameter stability estimated from gradient magnitudes under natural and robust objectives. These masks are refined by tracing adversarial parameter trajectories, with steps weighted by a prediction-sensitivity index to capture local loss-parameter geometry.

Contribution

Theoretical analysis of cross-objective interference contraction and curvature tracking

The authors provide theoretical guarantees demonstrating that their complementary masks provably reduce first-order interference between conflicting objectives. They also show that their prediction criticality index tracks Hessian trace (curvature), steering the merge toward flatter regions that generalize better.