BioX-Bridge: Model Bridging for Unsupervised Cross-Modal Knowledge Transfer across Biosignals
Overview
Overall Novelty Assessment
The paper proposes BioX-Bridge, a framework for unsupervised cross-modal knowledge transfer across biosignals, addressing the challenge of leveraging knowledge from one modality to train models for another without paired labels. Within the taxonomy, it resides in the 'Unsupervised Cross-Modal Transfer' leaf under 'Transfer Learning and Domain Adaptation', alongside four sibling papers. This leaf is moderately populated, suggesting an active but not overcrowded research direction. The taxonomy contains 50 papers across approximately 36 topics, indicating that unsupervised cross-modal transfer represents a focused subfield within the broader landscape of biosignal analysis and multimodal learning.
The taxonomy reveals several neighboring research directions that contextualize this work. The sibling leaf 'Supervised and Semi-Supervised Transfer' explores methods with labeled target data, while 'Domain Adaptation for Biosignals' addresses distribution shifts within single modalities. Adjacent branches include 'Cross-Modal Representation Learning and Alignment', which emphasizes contrastive learning and autoencoder-based alignment, and 'Foundation Models and Pretraining for Biosignals', which investigates large-scale pretraining strategies. BioX-Bridge diverges from contrastive alignment methods by focusing on parameter-efficient transfer mechanisms, and from foundation model approaches by targeting lightweight deployment scenarios where computational overhead is critical.
Among the 20 candidates examined across the three contributions, the analysis reveals mixed novelty signals. For the core BioX-Bridge framework (Contribution 1), 10 candidates were examined with no clear refutations, suggesting the overall approach is relatively novel. The two-stage bridge position selection strategy (Contribution 2) was not directly evaluated against prior work. For the parameter-efficient transfer claim (Contribution 3), however, 2 of the 10 candidates examined appear to refute it, indicating that parameter-reduction techniques in cross-modal transfer have precedent. Because the search scope was limited (20 candidates from semantic search), these findings reflect top-K matches rather than exhaustive coverage, and additional related work may exist beyond this sample.
Based on the limited literature search, the work appears to offer incremental contributions in parameter-efficient cross-modal transfer, though the specific combination of techniques and application to biosignals may provide practical value. The analysis covers top-20 semantic matches and does not exhaustively survey all relevant prior work in knowledge distillation, adapter methods, or biosignal-specific transfer learning. A more comprehensive search might reveal additional overlapping methods or clarify the distinctiveness of the proposed bridge position selection strategy.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce a new framework that trains a lightweight bridge network to align intermediate representations between biosignal foundation models from different modalities, enabling knowledge transfer without requiring labeled data from the new modality. This approach reduces computational and memory overhead compared to traditional knowledge distillation methods.
The authors develop a two-stage strategy that first selects the bridge input position by evaluating representation quality through linear probing, then selects the output position by measuring representation similarity using linear CKA. They also design a prototype network combining learnable prototypes with low-rank approximation to enable parameter-efficient projection between high-dimensional representation spaces.
The authors demonstrate through extensive experiments across multiple biosignal modalities, tasks, and datasets that their method achieves comparable or superior performance to existing knowledge distillation approaches while drastically reducing the number of trainable parameters, making it more practical for resource-constrained settings.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[14] Cross-domain MLP and CNN transfer learning for biological signal processing: EEG and EMG
[20] Unsupervised Transfer Learning Across Different Data Modalities for Bearing's Speed Identification
[29] Unsupervised Domain Adaptation by Causal Learning for Biometric Signal-based HCI
[42] Unsupervised Spike Depth Estimation via Cross-modality Cross-domain Knowledge Transfer
Contribution Analysis
Detailed comparisons for each claimed contribution
BioX-Bridge framework for unsupervised cross-modal knowledge transfer
The authors introduce a new framework that trains a lightweight bridge network to align intermediate representations between biosignal foundation models from different modalities, enabling knowledge transfer without requiring labeled data from the new modality. This approach reduces computational and memory overhead compared to traditional knowledge distillation methods.
[25] CiTrus: Squeezing Extra Performance out of Low-data Bio-signal Transfer Learning
[32] Transformer-based self-supervised multimodal representation learning for wearable emotion recognition
[51] Transformer-based Self-supervised Representation Learning for Emotion Recognition Using Bio-signal Feature Fusion
[52] PhysioSync: Temporal and Cross-Modal Contrastive Learning Inspired by Physiological Synchronization for EEG-Based Emotion Recognition
[53] Toward Foundational Model for Sleep Analysis Using a Multimodal Hybrid Self-Supervised Learning Framework
[54] Self-supervised transfer learning of physiological representations from free-living wearable data
[55] Non-contact detection of mental fatigue from facial expressions and heart signals: A self-supervised-based multimodal fusion method
[56] Beyond just vision: A review on self-supervised representation learning on multimodal and temporal data
[57] A self-supervised multimodal framework for 1D physiological data fusion in remote health monitoring
[58] Application of Multimodal Self-Supervised Architectures for Daily Life Affect Recognition
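The bridging idea behind this contribution can be illustrated with a minimal sketch on synthetic data (the encoders, bridge design, and training objective below are placeholders, not BioX-Bridge's actual components): a lightweight linear bridge is fit to map intermediate representations from a frozen target-modality model into the representation space of a frozen source-modality model, using only unlabeled data from both modalities.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for intermediate representations of the same unlabeled recordings,
# as produced by two frozen unimodal encoders (synthetic data).
n, d_src, d_tgt = 256, 48, 64
Z_src = rng.normal(size=(n, d_src))                  # source-modality features
mix = rng.normal(size=(d_src, d_tgt))
Z_tgt = Z_src @ mix + 0.01 * rng.normal(size=(n, d_tgt))  # target-modality features

# Lightweight linear bridge: project target features into the source space so
# the source model's downstream layers and head can be reused. Fitting the
# bridge needs no target-modality labels, only aligned unlabeled features.
W, *_ = np.linalg.lstsq(Z_tgt, Z_src, rcond=None)    # bridge weights (d_tgt x d_src)

mse = float(np.mean((Z_tgt @ W - Z_src) ** 2))
print(f"alignment MSE: {mse:.5f}")
```

A closed-form least-squares fit stands in here for the trained bridge network; the point is that only the small bridge is optimized while both foundation models stay frozen.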
Two-stage bridge position selection strategy and prototype network architecture
The authors develop a two-stage strategy that first selects the bridge input position by evaluating representation quality through linear probing, then selects the output position by measuring representation similarity using linear CKA. They also design a prototype network combining learnable prototypes with low-rank approximation to enable parameter-efficient projection between high-dimensional representation spaces.
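The similarity measure used in the second stage, linear CKA, has a simple closed form: for column-centered representation matrices X and Y, CKA = ||YᵀX||_F² / (||XᵀX||_F ||YᵀY||_F). The sketch below (the helper name `linear_cka` and the synthetic matrices are ours, not from the paper) shows the two properties that make it useful for position selection: it is invariant to orthogonal transforms of the features and near zero for unrelated representations.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between representation matrices of shape (n_samples, dim)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(0)
A = rng.normal(size=(500, 16))
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # random orthogonal matrix
B = A @ Q                                        # same representation, rotated
C = rng.normal(size=(500, 16))                   # unrelated representation

print(round(linear_cka(A, B), 3))  # 1.0: invariant to orthogonal transforms
print(round(linear_cka(A, C), 3))  # near 0 for independent features
```

In the selection strategy described above, such a score would be computed between candidate layers of the two models to pick the bridge output position with the most similar representation.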
Parameter-efficient cross-modal transfer with 88-99% reduction in trainable parameters
The authors demonstrate through extensive experiments across multiple biosignal modalities, tasks, and datasets that their method achieves comparable or superior performance to existing knowledge distillation approaches while drastically reducing the number of trainable parameters, making it more practical for resource-constrained settings.
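The claimed 88-99% reduction is consistent with simple parameter accounting for a low-rank projection (the dimensions and rank below are illustrative assumptions, not figures from the paper): replacing a dense d_in × d_out projection with a rank-r factorization shrinks the cost from d_in·d_out to (d_in + d_out)·r.

```python
# Illustrative parameter accounting; d_in, d_out, and r are assumed values,
# not the dimensions used in the paper.
d_in, d_out, r = 1024, 768, 32

dense = d_in * d_out            # full projection matrix
low_rank = (d_in + d_out) * r   # rank-r factorization W ~= A @ B

reduction = 1 - low_rank / dense
print(f"dense: {dense:,}  low-rank: {low_rank:,}  reduction: {reduction:.1%}")
```

With these assumed sizes the reduction is roughly 93%, inside the reported 88-99% range; the exact figure depends on the chosen rank, the number of prototypes, and the layer widths of the two foundation models.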