Can we generate portable representations for clinical time series data using LLMs?

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Machine Learning for Healthcare, ICU Time Series, LLMs, Representation Learning
Abstract:

Deploying clinical ML is slow and brittle: models that work at one hospital often degrade under distribution shift at the next. In this work, we study a simple question: can large language models (LLMs) create portable patient embeddings, i.e., representations of patients that enable a downstream predictor built at one hospital to be reused elsewhere with minimal-to-no retraining or fine-tuning? To do so, we map irregular ICU time series onto concise natural-language summaries using a frozen LLM, then embed each summary with a frozen text-embedding model to obtain a fixed-length vector that can serve as input to a variety of downstream predictors. Across three cohorts (MIMIC-IV, HiRID, PPICU) and multiple clinically grounded forecasting and classification tasks, we find that our approach is simple, easy to use, and surprisingly competitive in distribution with grid imputation, self-supervised representation learning, and time-series foundation models, while exhibiting smaller relative performance drops when transferring to new hospitals. We study how performance varies with prompt design and find that structured prompts are crucial for reducing the variance of the predictive models without altering mean accuracy. We also find that these portable representations improve few-shot learning and do not increase demographic recoverability of age or sex relative to baselines, suggesting little additional privacy risk. Our work points to the potential of LLMs as tools for the scalable deployment of production-grade predictive models by reducing engineering overhead.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a summarize-then-embed pipeline (Record2Vec) using frozen large language models to create portable patient representations from irregular ICU time series. It sits within the 'Contrastive and Self-Supervised Representation Learning for Clinical Time Series' leaf, which contains only three papers total. This is a relatively sparse research direction within the broader taxonomy of 41 papers across the field, suggesting the specific approach of leveraging frozen LLMs for clinical time series portability is not yet heavily explored in the literature examined.

The taxonomy reveals neighboring work in clinical vocabulary embeddings, multi-modal foundation models, and federated phenotyping from unstructured text. The paper diverges from these by focusing on time series rather than static codes or clinical notes, and by using frozen pretrained models rather than training embeddings from scratch. It connects to the broader domain adaptation branch through its emphasis on cross-hospital transferability, but differs by creating portable input representations rather than adapting trained models post-hoc.

Among 19 candidates examined across three contributions, only one refutable pair was identified for the summarize-then-embed pipeline itself. The deployment-first framing examined 10 candidates with none clearly refuting the contribution, while the multi-site evaluation examined 8 candidates with no refutations found. This limited search scope suggests the specific combination of natural language summarization and frozen embeddings for clinical time series portability has minimal direct overlap in the examined literature, though the search was not exhaustive.

Based on top-19 semantic matches and citation expansion, the work appears to occupy a relatively novel position combining frozen LLM summarization with cross-hospital portability objectives. The sparse taxonomy leaf and low refutation rate suggest limited prior work on this specific approach, though the analysis cannot rule out relevant work outside the examined candidate set.

Taxonomy

Core-task taxonomy papers: 41
Claimed contributions: 3
Contribution candidate papers compared: 19
Refutable papers: 1

Research Landscape Overview

Core task: portable patient representation learning for cross-hospital clinical prediction. The field addresses the challenge of building clinical models that generalize across institutions despite heterogeneous data sources, privacy constraints, and distributional shifts. The taxonomy reveals five major branches:

- Federated Learning Architectures for Privacy-Preserving Cross-Institutional Collaboration encompasses methods such as Federated Semi-supervised Healthcare[1] and the Unitrans Federated Framework[3], which enable collaborative model training without sharing raw patient data.
- Representation Learning and Embedding Methods for Cross-Institutional Portability focuses on learning transferable patient embeddings through techniques including contrastive learning, as in Contrastive Pediatric Ventilation[2], and unified embedding spaces such as Unified Clinical Embeddings[7].
- Domain Adaptation and Transfer Learning for Cross-Hospital Generalization tackles distributional shifts using approaches such as AdaDiag Domain Adaptation[23] and Adversarial Glucose Transfer[39].
- Task-Specific Cross-Hospital Prediction Models with Validation Studies demonstrates practical applications across diverse clinical scenarios, from GRU Ileus Surveillance[11] to Sarcopenic Obesity Prediction[13].
- Data Harmonization and Preprocessing Pipelines for Multi-Institutional Research, exemplified by the EHR Harmonization Pipeline[37], addresses the foundational challenge of standardizing heterogeneous clinical data.

A central tension emerges between privacy-preserving federated approaches and representation-learning methods that require richer data sharing for effective embedding construction. Within the representation-learning branch, contrastive and self-supervised techniques have gained traction for learning robust features from clinical time series without extensive labels.
Portable Clinical LLMs[0] sits within this contrastive learning cluster, emphasizing self-supervised pretraining to capture temporal patterns that transfer across hospitals. This approach contrasts with Self-supervised Surgical Events[25], which focuses on surgical workflow rather than longitudinal patient trajectories, and aligns closely with Contrastive Pediatric Ventilation[2] in leveraging temporal contrasts for cross-institutional robustness. The interplay between learning portable representations and maintaining patient privacy remains an active research frontier, with works exploring whether foundation models like Multi-modal Foundation Models[4] can bridge these competing demands through large-scale pretraining.

Claimed Contributions

Deployment-first framing focusing on portable input representations for healthcare

The authors propose a new perspective that treats portable input representations, rather than models themselves, as the primary transferable object across hospitals. This framing aims to reduce site-specific engineering overhead and calibration cycles when deploying clinical ML systems.

10 retrieved papers
Record2Vec: summarize-then-embed pipeline using frozen language models

The authors introduce Record2Vec, a method that uses a frozen LLM to generate clinical summaries from irregular ICU time series data, then embeds these summaries with a frozen text encoder to produce fixed-length vectors. These vectors serve as portable inputs for downstream predictors without requiring model architecture modifications.

1 retrieved paper
Can Refute
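
The summarize-then-embed pipeline described above can be sketched in a few lines. This is a minimal illustration of the structure only: the paper uses a frozen LLM for summarization and a frozen text-embedding model, both of which are replaced here by simple stand-ins (a template-based summarizer and a hashed bag-of-words encoder) so the sketch runs end to end. All function names and the toy data are hypothetical, not taken from the paper.

```python
import hashlib
import numpy as np

def summarize(record):
    # Stand-in for frozen-LLM summarization: render an irregular time
    # series (list of (hour, variable, value) tuples) as plain text.
    lines = [f"At hour {t}, {var} was {val}." for t, var, val in sorted(record)]
    return " ".join(lines)

def embed(text, dim=64):
    # Stand-in for a frozen text encoder: hashed bag-of-words into a
    # fixed-length, L2-normalized vector, so every summary maps to the
    # same dimensionality regardless of its length.
    vec = np.zeros(dim)
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# Two toy ICU stays with irregular sampling. A downstream predictor
# trained on such vectors at one hospital can consume vectors produced
# at another, since the representation, not the model, is portable.
stay_a = [(1, "heart_rate", 88), (4, "lactate", 2.1)]
stay_b = [(2, "heart_rate", 121), (3, "lactate", 4.8), (7, "map", 55)]
X = np.stack([embed(summarize(s)) for s in (stay_a, stay_b)])
print(X.shape)  # fixed-length vectors despite differing series lengths
```

The key property being illustrated is that variable-length, irregularly sampled inputs all map to vectors of the same dimension, which is what lets a single downstream predictor be shared across sites.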
Multi-site evaluation demonstrating portability, data-efficiency, and privacy preservation

The authors perform comprehensive experiments across three ICU cohorts (MIMIC-IV, HiRID, PPICU) and multiple prediction tasks, demonstrating that their approach achieves competitive in-distribution performance while exhibiting better cross-site transfer, improved few-shot learning, and comparable or reduced demographic leakage compared to baseline methods.

8 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Deployment-first framing focusing on portable input representations for healthcare

The authors propose a new perspective that treats portable input representations, rather than models themselves, as the primary transferable object across hospitals. This framing aims to reduce site-specific engineering overhead and calibration cycles when deploying clinical ML systems.

Contribution

Record2Vec: summarize-then-embed pipeline using frozen language models

The authors introduce Record2Vec, a method that uses a frozen LLM to generate clinical summaries from irregular ICU time series data, then embeds these summaries with a frozen text encoder to produce fixed-length vectors. These vectors serve as portable inputs for downstream predictors without requiring model architecture modifications.

Contribution

Multi-site evaluation demonstrating portability, data-efficiency, and privacy preservation

The authors perform comprehensive experiments across three ICU cohorts (MIMIC-IV, HiRID, PPICU) and multiple prediction tasks, demonstrating that their approach achieves competitive in-distribution performance while exhibiting better cross-site transfer, improved few-shot learning, and comparable or reduced demographic leakage compared to baseline methods.