Not All Clients Are Equal: Collaborative Model Personalization on Heterogeneous Multi-Modal Clients

ICLR 2026 Conference Submission. Anonymous Authors.
Keywords: Collaborative Learning, Federated Learning, Continual Learning, Multi-modal Learning, Personalization, Distributed Learning
Abstract:

As AI becomes more personal (e.g., agentic AI), there is a growing need to personalize models for diverse use cases. Personalized federated learning (PFL) lets each client collaboratively leverage other clients' knowledge to better adapt to its task of interest, without exposing private data. Despite this potential, existing PFL methods remain confined to simplified scenarios in which data and models are identical across clients. To move toward realistic scenarios, we propose FedMosaic, a method that jointly addresses data and model heterogeneity through a task-relevance-aware model aggregation strategy that reduces parameter interference, and a dimension-invariant module that enables knowledge sharing across heterogeneous architectures without large computational cost. To mimic real-world task diversity, we propose a multi-modal PFL benchmark spanning 40 distinct tasks with distribution shifts over time. Empirical results show that FedMosaic outperforms state-of-the-art PFL methods in both personalization and generalization under these challenging, realistic scenarios.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes a paper's tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces FedMosaic, a method for personalized federated learning that jointly addresses data and model heterogeneity through task-relevance-aware aggregation and a dimension-invariant module. It resides in the Parameter-Level Heterogeneous Aggregation leaf, which contains four papers including this one. This leaf focuses on aggregating heterogeneous models through parameter factorization or modular networks, distinguishing itself from the six-paper Heterogeneous Model Aggregation via Knowledge Distillation leaf. The relatively small cluster suggests this specific approach to parameter-level aggregation represents an emerging but not yet crowded research direction within the broader model heterogeneity landscape.

The taxonomy reveals that FedMosaic's neighboring research directions include knowledge distillation-based aggregation (six papers), low-rank adaptation methods like LoRA (three papers), and mixture-of-experts approaches (two papers). These branches collectively address model heterogeneity but differ in mechanism: knowledge distillation operates at the feature level, LoRA focuses on parameter-efficient tuning, and mixture-of-experts employs sparse activation. FedMosaic's parameter-level approach sits between these strategies, sharing the goal of enabling cross-architecture collaboration while maintaining distinct technical foundations. The taxonomy's scope notes clarify that parameter-level methods explicitly exclude knowledge distillation, positioning FedMosaic in a complementary rather than overlapping space.

Among the twenty-nine candidates examined through semantic search, no papers were identified as clearly refuting any of FedMosaic's three contributions. The FedMosaic method itself was compared against ten candidates with zero refutable matches; the DRAKE benchmark against nine candidates with zero refutations; and the PQ-LoRA module against ten candidates with zero refutations. This limited search scope suggests that, within the twenty-nine most semantically similar papers, the specific combination of task-relevance-aware aggregation, dimension-invariant modules, and multi-modal benchmarking appears relatively unexplored. However, the analysis does not claim exhaustive coverage of the broader federated learning literature beyond these candidates.

Based on the examined literature subset, FedMosaic appears to occupy a distinct position within parameter-level heterogeneous aggregation, particularly in its joint treatment of data and model heterogeneity with multi-modal considerations. The absence of refuting papers among the twenty-nine candidates suggests novelty within this search scope, though the analysis acknowledges limitations inherent to top-K semantic matching. The taxonomy context indicates this work contributes to an active but not saturated research direction, with neighboring methods pursuing complementary rather than directly competing strategies.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 29
Refutable Papers: 0

Research Landscape Overview

Core task: personalized federated learning with heterogeneous data and models. The field addresses the challenge of training collaborative models across distributed clients that differ in both their local data distributions and their computational capabilities or architectural preferences.

The taxonomy reveals several complementary strategies:
- Model Heterogeneity and Architecture Personalization enables clients to maintain distinct network structures or parameter configurations while still benefiting from federation.
- Data Heterogeneity Mitigation and Personalization tackles non-IID data through techniques such as local adaptation layers or representation alignment.
- Meta-Learning for Personalization leverages frameworks such as MAML[9] to learn quickly adaptable initializations.
- Client Clustering and Grouping identifies subpopulations with similar characteristics to enable more targeted aggregation.
- Adaptive Personalization and Multi-Objective Optimization balances global knowledge sharing with local customization.
- Specialized Personalization Scenarios explores domain-specific settings such as privacy constraints or IoT deployments.
- Surveys and Theoretical Frameworks provide overarching perspectives on the landscape.

A particularly active line of work within Model Heterogeneity involves parameter-level heterogeneous aggregation, where clients with different architectures or modalities exchange knowledge through carefully designed alignment mechanisms. Heterogeneous Multimodal Clients[0] sits squarely in this branch, addressing scenarios where clients process fundamentally different input types, such as text, images, or sensor data, yet seek to build personalized models that benefit from cross-modal insights. This contrasts with approaches like pFedClub[46] or Factorized FL[42], which also handle architectural diversity but typically assume a shared input modality and focus on factorizing or clustering model components.

Meanwhile, works such as Ferrari[3] emphasize resource-aware personalization by dynamically adjusting model complexity per client, highlighting a trade-off between expressiveness and efficiency that Heterogeneous Multimodal Clients[0] must also navigate when integrating diverse data streams. The central open question remains how to transfer knowledge effectively across heterogeneous architectures and modalities without prohibitive communication or computational overhead.

Claimed Contributions

FedMosaic method for heterogeneous personalized federated learning

FedMosaic is a personalized federated learning method that simultaneously handles data heterogeneity (clients working on different tasks) and model heterogeneity (clients using different architectures). It comprises two main components: RELA for task-relevance-aware aggregation and PQ-LoRA for dimension-invariant knowledge sharing across heterogeneous models.

10 retrieved papers
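RELA's exact scoring rule is not specified in this summary. As an illustration only, the sketch below weights each peer's flattened parameter update by its cosine similarity to the client's own update, a stand-in relevance proxy rather than the paper's formulation, then averages with softmax weights:

```python
import numpy as np

def relevance_weighted_aggregate(own_update, peer_updates, temperature=1.0):
    """Illustrative task-relevance-aware aggregation: score each peer's
    flattened parameter update by cosine similarity to the client's own
    update (a stand-in for RELA's relevance measure), softmax the scores,
    and return the weighted average of peer updates."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    scores = np.array([cos(own_update, u) for u in peer_updates])
    weights = np.exp(scores / temperature)
    weights /= weights.sum()
    return sum(w * u for w, u in zip(weights, peer_updates))

# A peer whose update points the same way as ours receives more weight,
# which limits interference from unrelated tasks.
own = np.array([1.0, 0.0])
peers = [np.array([1.0, 0.1]), np.array([-1.0, 0.0])]
agg = relevance_weighted_aggregate(own, peers)
```

The softmax temperature controls how sharply aggregation concentrates on the most relevant peers; at very low temperature this effectively selects only the single most similar client.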
DRAKE benchmark for multi-modal personalized federated learning

DRAKE is a comprehensive benchmark for multi-modal federated learning that assigns each client a distinct multi-modal task (such as visual question answering or visual reasoning) and incorporates temporal distribution shifts to mimic real-world task diversity and evolving data distributions.

9 retrieved papers
PQ-LoRA for cross-architecture knowledge sharing

PQ-LoRA introduces dimension-invariant modules (matrices P and Q) within the LoRA framework whose dimensions depend only on the low-rank size rather than model-specific hidden dimensions, enabling parameter sharing and knowledge transfer across heterogeneous model architectures with different dimensions and depths.

10 retrieved papers
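The key property stated above, that P and Q have shapes depending only on the low-rank size r, can be sketched in a few lines. The placement of the shared factors (here a B P Q A composition) and the initialization scheme are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
r = 8  # shared low-rank size

class PQLoRALayer:
    """Illustrative sketch: A and B are client-specific (their shapes depend
    on the local hidden dimension d), while P and Q are r x r and therefore
    shareable across architectures with different hidden dimensions."""
    def __init__(self, d, r, P=None, Q=None):
        self.A = rng.normal(0.0, 0.02, size=(r, d))  # down-projection (client-specific)
        self.B = np.zeros((d, r))                    # up-projection (client-specific)
        self.P = P if P is not None else np.eye(r)   # dimension-invariant (shared)
        self.Q = Q if Q is not None else np.eye(r)   # dimension-invariant (shared)

    def delta(self, x):
        # x: (batch, d); apply the low-rank update (B P Q A) to the input
        return x @ (self.B @ self.P @ self.Q @ self.A).T

# Two clients with different hidden sizes share the same P and Q modules.
P_shared, Q_shared = np.eye(r), np.eye(r)
small = PQLoRALayer(d=256, r=r, P=P_shared, Q=Q_shared)
large = PQLoRALayer(d=1024, r=r, P=P_shared, Q=Q_shared)
assert small.P.shape == large.P.shape == (r, r)  # shareable regardless of d
```

Because only the r x r factors are exchanged, communication cost for the shared modules is independent of each client's hidden dimension.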

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

FedMosaic method for heterogeneous personalized federated learning

FedMosaic is a personalized federated learning method that simultaneously handles data heterogeneity (clients working on different tasks) and model heterogeneity (clients using different architectures). It comprises two main components: RELA for task-relevance-aware aggregation and PQ-LoRA for dimension-invariant knowledge sharing across heterogeneous models.

Contribution

DRAKE benchmark for multi-modal personalized federated learning

DRAKE is a comprehensive benchmark for multi-modal federated learning that assigns each client a distinct multi-modal task (such as visual question answering or visual reasoning) and incorporates temporal distribution shifts to mimic real-world task diversity and evolving data distributions.

Contribution

PQ-LoRA for cross-architecture knowledge sharing

PQ-LoRA introduces dimension-invariant modules (matrices P and Q) within the LoRA framework whose dimensions depend only on the low-rank size rather than model-specific hidden dimensions, enabling parameter sharing and knowledge transfer across heterogeneous model architectures with different dimensions and depths.