CHAMMI-75: pre-training multi-channel models with heterogeneous microscopy images

ICLR 2026 Conference Submission (Anonymous Authors)
Keywords: microscopy, representation learning, multi-channel imaging, self-supervised learning, biology
Abstract:

Quantifying cell morphology from images with machine learning has proven to be a powerful tool for studying the response of cells to treatments. However, the models used to quantify cellular morphology are typically trained on a single microscopy imaging type under controlled experimental conditions. This results in specialized models that cannot be reused across biological studies, either because the technical specifications do not match (e.g., different numbers of channels) or because the target experimental conditions are out of distribution. Here, we present CHAMMI-75, a dataset of 2.8M heterogeneous, multi-channel microscopy images from 75 diverse biological studies. We curated this resource from publicly available sources to investigate cellular morphology models that are channel-adaptive and can process any microscopy image type. Our experiments show that training with CHAMMI-75 improves performance on multi-channel bioimaging tasks, opening the way to the next generation of cellular morphology models for biological studies.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholarly search engine). It analyzes a paper's tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. The results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces CHAMMI-75, a large-scale dataset of 2.8M multi-channel microscopy images from 75 biological studies, and demonstrates its utility for pre-training channel-adaptive models. It resides in the Channel-Adaptive Vision Transformers leaf, which contains four papers total. This is a relatively sparse research direction within the broader Self-Supervised Pre-Training Architectures branch, suggesting the problem of handling variable channel configurations in microscopy remains under-explored compared to supervised segmentation methods or phenotypic profiling pipelines.

The taxonomy reveals neighboring work in Masked Autoencoding for Multi-Channel Images and Contrastive and Paired-Cell Learning, both exploring self-supervised objectives but without explicit channel-adaptive mechanisms. Supervised Segmentation and Detection methods like Cellpose Multi-Modality assume fixed channel counts, while Phenotypic Profiling and Drug Discovery Applications focus on extracting biological features rather than architectural flexibility. CHAMMI-75 bridges these areas by providing a foundation for models that generalize across imaging protocols, diverging from task-specific supervised approaches and complementing self-supervised methods that lack channel adaptability.

Of the 30 candidates examined (10 per contribution), the dataset contribution yielded one refutable candidate, indicating that some prior multi-channel microscopy collections exist but may differ in scale or diversity. The benchmarking contribution and the systematic experimental evaluation each yielded zero refutations among their 10 candidates, suggesting these aspects are less directly addressed in prior literature. Because the search is limited to the top-30 semantic matches rather than exhaustive coverage, additional relevant datasets or evaluation frameworks may exist beyond this analysis window.

Based on the top-30 semantic search results, CHAMMI-75 appears to occupy a moderately novel position by combining large-scale heterogeneous data curation with channel-adaptive pre-training experiments. The dataset contribution has some overlap with existing resources, while the benchmarking and evaluation aspects show less direct prior work within the examined candidates. The sparse Channel-Adaptive Vision Transformers leaf suggests this research direction is still emerging, though the analysis does not cover all possible related work in computer vision or biomedical imaging.

Taxonomy

Core-task Taxonomy Papers: 30
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 1

Research Landscape Overview

Core task: pre-training multi-channel models for cellular morphology analysis.

The field encompasses diverse approaches to learning from multi-channel microscopy images, organized into several major branches. Self-Supervised Pre-Training Architectures explores how to leverage unlabeled data through channel-adaptive vision transformers and masked reconstruction strategies. Supervised Segmentation and Detection methods rely on annotated datasets to delineate cellular structures, while Generative Modeling and Perturbation Prediction aims to synthesize realistic morphologies or predict cellular responses to interventions. Phenotypic Profiling and Drug Discovery Applications emphasizes extracting biologically meaningful features for compound screening, and Specialized Imaging Modalities and Reconstruction addresses technical challenges in advanced microscopy. Clinical and Pathological Applications and Specialized Biological Contexts round out the taxonomy by targeting domain-specific problems such as cancer diagnostics or developmental biology.

Within Self-Supervised Pre-Training Architectures, a small cluster of recent works investigates how vision transformers can handle variable numbers of imaging channels without retraining. CHAMMI-75[0] sits squarely in this Channel-Adaptive Vision Transformers niche, alongside Scaling Channel Adaptive[2] and Isolated Channel ViT[3], all exploring flexible encoder designs that generalize across different staining protocols. These methods contrast with earlier supervised pipelines like Cellpose Multi-Modality[14] or phenotypic profiling tools such as PhenoProfiler[7], which typically assume fixed channel configurations. A key open question is whether channel-adaptive pre-training can match or exceed task-specific supervised models when fine-tuned on downstream segmentation or profiling tasks, and how best to balance architectural flexibility with computational efficiency.

CHAMMI-75[0] emphasizes scalable pre-training on diverse datasets, positioning itself as a foundation-model approach compared to the more narrowly scoped architectures in Isolated Channel ViT[3].
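As a rough illustration of the shared idea behind these channel-adaptive encoders (a minimal sketch, not the method of any specific paper: `patchify`, `embed_channels`, and `W_embed` are hypothetical names introduced here), a single patch-projection matrix applied per channel lets one encoder accept any number of input channels:

```python
import numpy as np

def patchify(channel, patch=16):
    """Split a single (H, W) channel into flattened non-overlapping patches."""
    H, W = channel.shape
    rows, cols = H // patch, W // patch
    p = channel[:rows * patch, :cols * patch].reshape(rows, patch, cols, patch)
    return p.transpose(0, 2, 1, 3).reshape(rows * cols, patch * patch)

def embed_channels(image, W_embed, patch=16):
    """Channel-adaptive stem sketch: every channel is patchified and projected
    with the SAME weight matrix, so the model accepts any number of channels
    (1, 5, 14, ...) without retraining the embedding layer."""
    tokens = [patchify(c, patch) @ W_embed for c in image]  # image: (C, H, W)
    return np.stack(tokens)                                 # (C, patches, d)

rng = np.random.default_rng(0)
W_embed = rng.normal(size=(16 * 16, 32))       # shared projection, d_model=32
for n_channels in (1, 5, 14):                  # mimic heterogeneous studies
    img = rng.normal(size=(n_channels, 64, 64))
    assert embed_channels(img, W_embed).shape == (n_channels, 16, 32)
```

The key design choice this sketch highlights is weight sharing across channels: because no dimension of `W_embed` depends on the channel count, images from studies with different staining protocols can flow through the same stem.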

Claimed Contributions

CHAMMI-75 dataset of heterogeneous multi-channel microscopy images

The authors curated CHAMMI-75, a dataset containing 2.8 million multi-channel microscopy images from 75 diverse biological studies. This resource integrates heterogeneous sources with varying numbers of channels (1-7+), organisms, cell lines, and microscopy modalities to enable training of channel-adaptive cellular morphology models.

Retrieved papers: 10 (verdict: can refute)
New benchmarks for multi-channel model evaluation

The authors created two new evaluation benchmarks (CellPHIE with 14-channel images and RBC-MC for cross-domain generalization) alongside adopting existing ones. These benchmarks test model performance on novel channel configurations and imaging modalities not seen during pre-training.

Retrieved papers: 10
Systematic experimental evaluation of CHAMMI-75 for pre-training

The authors performed comprehensive scaling experiments comparing bag-of-channels versus multi-channel attention approaches, different SSL algorithms, and model sizes. Their results demonstrate that pre-training with CHAMMI-75 improves performance across diverse biological tasks and enables strong generalization to novel channel combinations.

Retrieved papers: 10
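The bag-of-channels versus multi-channel-attention comparison can be pictured with a toy sketch (assumptions: `encode` is a mean-pooling stand-in for a shared transformer encoder, and all shapes are illustrative, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_patches = 32, 16

def encode(tokens):
    """Stand-in for a shared encoder: here just mean pooling over tokens."""
    return tokens.mean(axis=0)

# Toy per-channel token sequences for one 5-channel image.
channel_tokens = [rng.normal(size=(n_patches, d_model)) for _ in range(5)]

# Bag-of-channels: encode each channel independently, then pool the results.
# Channels never interact inside the encoder.
bag_feature = np.mean([encode(t) for t in channel_tokens], axis=0)

# Multi-channel attention (sketch): concatenate all channel tokens into one
# long sequence, so a real attention encoder could mix information across
# channels at every layer.
joint_feature = encode(np.concatenate(channel_tokens, axis=0))

assert bag_feature.shape == joint_feature.shape == (d_model,)
```

The trade-off the sketch makes visible: bag-of-channels scales linearly in the channel count and trivially handles unseen channel combinations, while the joint-sequence route lets attention model cross-channel interactions at the cost of sequence length growing with the number of channels.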

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution 1: CHAMMI-75 dataset of heterogeneous multi-channel microscopy images
Contribution 2: New benchmarks for multi-channel model evaluation (CellPHIE, RBC-MC)
Contribution 3: Systematic experimental evaluation of CHAMMI-75 for pre-training