Abstract:

Model merging plays a crucial role in consolidating multiple specialized models into a single, unified model, especially in the era of large language models (LLMs). Recent research has primarily focused on strategies that enhance merging performance given already-trained models, while the impact of the training paradigm, such as supervised fine-tuning (SFT) or reinforcement learning (RL), on the effectiveness of model merging remains underexplored. In this study, we systematically explore the merging behavior of RL-trained LLMs compared to those trained with traditional SFT. Through comprehensive evaluations across five representative tasks, we find that RL significantly reduces task conflicts and results in less performance degradation after merging, making RL-trained models particularly well-suited for this process. To unearth the reasons behind RL's superior suitability for model merging, we conduct extensive empirical experiments and theoretical analyses. Our findings highlight three key factors: (1) On-policy training data in RL keep gradient updates small in magnitude, reducing the risk of overwriting the model's existing knowledge of other tasks. (2) The RL optimization objective, which favors "enough is as good as a feast", progressively reduces the magnitude of parameter updates as the model converges, thereby alleviating inter-task conflicts. (3) Joint optimization over positive and negative examples in RL steers the model toward an unbiased task-specific parameter subspace, ensuring robust performance while further preventing parameter conflicts.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper investigates how training paradigms—specifically supervised fine-tuning versus reinforcement learning—affect model merging effectiveness in large language models. It positions itself within the Parameter-Level Conflict Characterization leaf of the taxonomy, which contains only three papers total. This leaf focuses on analyzing interference patterns at the weight or neuron level to understand conflict origins. The sparse population suggests this is a relatively underexplored research direction, particularly regarding how training methodology influences mergeability rather than post-hoc merging techniques themselves.

The taxonomy reveals a field heavily weighted toward merging techniques (training-free and training-dependent branches contain numerous papers) rather than foundational analysis of what makes models mergeable. The paper's neighboring leaves examine representation bias and distribution gaps, while sibling papers like Localizing Task Information and Spark of Neuron analyze where task knowledge resides and neuron-level activation patterns. This work diverges by examining training-time factors rather than post-training parameter analysis, connecting to the broader Training-Dependent Merging Approaches branch through its focus on how models are prepared for merging.

Among thirty candidates examined across three contributions, none were identified as clearly refuting the work. The systematic comparison of SFT versus RL paradigms examined ten candidates with zero refutable overlaps, as did the three-factor theoretical analysis and the demonstration of reduced task conflicts. This suggests the specific angle—training paradigm impact on mergeability—has limited direct prior work within the search scope. However, the analysis explicitly notes this represents a limited literature search via top-K semantic matching, not an exhaustive field survey.

The contribution appears relatively novel within the examined scope, particularly in shifting focus from merging algorithms to training methodology. The sparse Parameter-Level Conflict Characterization leaf and absence of refuting candidates among thirty examined papers suggest this training-paradigm perspective fills a gap. However, the limited search scope and the field's rapid evolution mean comprehensive novelty assessment would require broader examination beyond semantic similarity matching.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: mitigating task conflicts in model merging for large language models. The field addresses how to combine multiple fine-tuned models without catastrophic interference, and organizes itself into several major branches.

Conflict Detection and Analysis focuses on identifying and characterizing parameter-level disagreements, with works like Localizing Task Information[16] and Spark of Neuron[21] examining where task-specific knowledge resides. Training-Free Merging Techniques, exemplified by TIES Merging[4] and Task Arithmetic[13], seek efficient combination strategies that avoid retraining, while Training-Dependent Merging Approaches such as AdaMerging[5] and Led Merging[2] optimize merge coefficients through additional learning. Post-Merging Refinement Techniques like Representation Surgery[23] adjust merged models after combination, and Domain-Specific Merging Applications explore targeted use cases. Continual and Federated Merging handles sequential or distributed scenarios, Security and Robustness Considerations address vulnerabilities like Merge Hijacking[6], and Alternative Knowledge Integration Paradigms such as Knowledge Grafting[3] explore fundamentally different composition strategies.

A particularly active tension exists between training-free efficiency and training-dependent accuracy, with many studies exploring whether lightweight post-hoc methods can match optimization-based approaches. Security concerns have also emerged, as works like Neutralizing Backdoors[12] and Safety Aware Subspace[10] reveal that merging can inadvertently propagate or amplify harmful behaviors. Within the Conflict Detection and Analysis branch, Enough is Good[0] sits alongside parameter-level characterization efforts, examining how much conflict resolution is actually necessary for effective merging.
Compared to neighbors like Localizing Task Information[16], which maps where task knowledge concentrates, and Spark of Neuron[21], which analyzes neuron-level activation patterns, Enough is Good[0] appears to question the sufficiency of existing conflict mitigation strategies, potentially offering a more pragmatic perspective on when elaborate conflict resolution yields diminishing returns versus simpler merging baselines.
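The training-free methods named above (Task Arithmetic[13], TIES Merging[4]) share a common primitive: each fine-tuned model is reduced to a task vector, its weight delta from the shared base, and the merged model is the base plus a scaled sum of those vectors. A minimal sketch of that primitive, with toy weights and a hypothetical scaling coefficient `lam` (the real methods add sign-resolution and trimming steps not shown here):

```python
import numpy as np

def task_vector(base, finetuned):
    """Task vector: the parameter delta a fine-tuned model adds to the base."""
    return {k: finetuned[k] - base[k] for k in base}

def merge_task_arithmetic(base, finetuned_models, lam=0.3):
    """Task-arithmetic merge: base + lam * sum of per-task vectors."""
    vectors = [task_vector(base, ft) for ft in finetuned_models]
    return {k: base[k] + lam * sum(v[k] for v in vectors) for k in base}

# Toy example: one 2-parameter "layer", two fine-tuned variants.
base = {"w": np.array([1.0, 1.0])}
ft_a = {"w": np.array([1.5, 1.0])}   # task A mostly moves the first weight
ft_b = {"w": np.array([1.0, 0.5])}   # task B mostly moves the second weight
merged = merge_task_arithmetic(base, [ft_a, ft_b], lam=1.0)
print(merged["w"])  # base + (ft_a - base) + (ft_b - base) = [1.5, 0.5]
```

When the per-task deltas touch disjoint coordinates, as here, the sum preserves both specializations; the paper's question is why RL training tends to produce deltas closer to this disjoint regime than SFT does.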

Claimed Contributions

Systematic comparison of SFT and RL paradigms for model merging

The authors conduct comprehensive experiments across five representative tasks to systematically compare how models trained with supervised fine-tuning versus reinforcement learning behave when merged. They demonstrate that RL-trained models consistently preserve performance better after merging, regardless of merging methods, RL algorithms, or base models used.

10 retrieved papers
Three-factor theoretical and empirical analysis of RL superiority

The authors identify and analyze three key mechanisms explaining why RL mitigates task conflicts: on-policy data reduces gradient magnitudes, RL optimization objectives naturally attenuate parameter updates as models converge (the "enough is as good as a feast" principle), and joint optimization over positive and negative examples leads to more unbiased task-specific parameter updates.

10 retrieved papers
Demonstration that RL reduces task conflicts in model merging

Through performance landscape visualization and conflict norm analysis, the authors show that RL-trained models exhibit significantly lower cross-task parameter interference compared to SFT models. They demonstrate that parameter updates from RL are more task-orthogonal and less disruptive when merged, while SFT updates tend to be more entangled across tasks.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Systematic comparison of SFT and RL paradigms for model merging

The authors conduct comprehensive experiments across five representative tasks to systematically compare how models trained with supervised fine-tuning versus reinforcement learning behave when merged. They demonstrate that RL-trained models consistently preserve performance better after merging, regardless of merging methods, RL algorithms, or base models used.

Contribution

Three-factor theoretical and empirical analysis of RL superiority

The authors identify and analyze three key mechanisms explaining why RL mitigates task conflicts: on-policy data reduces gradient magnitudes, RL optimization objectives naturally attenuate parameter updates as models converge (the "enough is as good as a feast" principle), and joint optimization over positive and negative examples leads to more unbiased task-specific parameter updates.
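The second mechanism can be made concrete with a toy sketch. Assuming a group-relative RL objective (GRPO-style advantage normalization; the report does not name the specific algorithm), each sampled response's advantage is its reward minus the group mean: once the policy solves nearly every prompt, rewards within a group become uniform, advantages collapse toward zero, and the effective update magnitude shrinks on its own.

```python
import numpy as np

def group_advantages(rewards, eps=1e-8):
    """GRPO-style normalization: reward minus group mean, scaled by group std."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Early training: mixed success within a sampled group -> large advantages.
early = group_advantages([1, 0, 0, 1])
# Near convergence: the policy solves every sample -> zero advantage, no update.
late = group_advantages([1, 1, 1, 1])

print(np.abs(early).max())  # large magnitude drives the gradient early on
print(np.abs(late).max())   # 0.0: converged groups contribute no parameter update
```

This is the "enough is as good as a feast" behavior in miniature: SFT's cross-entropy loss keeps pushing probabilities toward 1 even on solved examples, whereas a saturated reward signal stops moving the parameters.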

Contribution

Demonstration that RL reduces task conflicts in model merging

Through performance landscape visualization and conflict norm analysis, the authors show that RL-trained models exhibit significantly lower cross-task parameter interference compared to SFT models. They demonstrate that parameter updates from RL are more task-orthogonal and less disruptive when merged, while SFT updates tend to be more entangled across tasks.
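"Conflict norm" and "task-orthogonality" are not defined in this summary, but one common proxy (an assumption here, not necessarily the paper's exact metric) is the cosine similarity between per-task parameter deltas: near-zero cosine means the updates occupy roughly orthogonal subspaces and interfere little when summed.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two flattened parameter-delta vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def interference_norm(u, v):
    """Excess of ||u + v||^2 over ||u||^2 + ||v||^2, i.e. 2<u, v>:
    zero iff the two deltas are orthogonal and thus merge without conflict."""
    return float(np.linalg.norm(u + v) ** 2
                 - np.linalg.norm(u) ** 2 - np.linalg.norm(v) ** 2)

# Hypothetical deltas: "RL-like" updates concentrated on disjoint coordinates
# versus "SFT-like" updates spread across the same shared coordinates.
rl_a, rl_b = np.array([0.3, 0.0, 0.0]), np.array([0.0, 0.0, 0.2])
sft_a, sft_b = np.array([0.9, 0.5, 0.1]), np.array([0.7, 0.6, 0.2])

print(cosine(rl_a, rl_b))             # 0.0: orthogonal, no interference
print(cosine(sft_a, sft_b))           # near 1: entangled across coordinates
print(interference_norm(rl_a, rl_b))  # ~0.0 up to float error
```

Under this proxy, the contribution's claim amounts to: RL deltas look like `rl_a`/`rl_b` (small, disjoint), while SFT deltas look like `sft_a`/`sft_b` (large, overlapping), so summing the former degrades per-task performance far less.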

Enough is as good as a feast: A Comprehensive Analysis of How Reinforcement Learning Mitigates Task Conflicts in LLMs | Novelty Validation