Count Bridges enable Modeling and Deconvolving Transcriptomics

ICLR 2026 Conference Submission

Anonymous Authors

OpenReview Score: 5.6

ordinal data, diffusion, Schrödinger bridge, flow matching, single cell genomics, spatial transcriptomics

Abstract:

Many modern biological assays, including RNA sequencing, yield integer-valued counts that reflect the number of RNA molecules detected. These measurements are often not at the desired resolution: while the unit of interest is typically a single cell, many RNA sequencing and imaging technologies produce counts aggregated over sets of cells. Although recent generative frameworks such as diffusion and flow matching have been extended to non-Euclidean and discrete settings, it remains unclear how best to model integer-valued data or how to systematically deconvolve aggregated observations. We introduce Count Bridges, a stochastic bridge process on the integers that provides an exact, tractable analogue of diffusion-style models for count data, with closed-form conditionals for efficient training and sampling. We extend this framework to enable direct training from aggregated measurements via an Expectation-Maximization-style approach that treats unit-level counts as latent variables. We demonstrate state-of-the-art performance on integer distribution matching benchmarks, comparing against flow matching and discrete flow matching baselines across various metrics. We then apply Count Bridges to two large-scale problems in biology: modeling single-cell gene expression data at the nucleotide resolution, with applications to deconvolving bulk RNA-seq, and resolving multicellular spatial transcriptomic spots into single-cell count profiles. Our methods offer a principled foundation for generative modeling and deconvolution of biological count data across scales and modalities.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces Count Bridges, a stochastic bridge process on integers designed for generative modeling of count data, with applications to RNA sequencing. It resides in the 'Variational and Flow-Based Single-Cell Models' leaf, which contains four papers total (including this one). This leaf sits within the broader 'Deep Generative Models for Single-Cell Count Data' branch, indicating a moderately active research direction focused on deep learning architectures for discrete transcriptomic data. The taxonomy shows this is a specialized but not overcrowded area, with sibling leaves addressing GAN-based augmentation and semi-supervised approaches.

The taxonomy reveals neighboring research directions that contextualize this work. The parent branch includes GAN-based single-cell augmentation and semi-supervised models, while sibling branches address spatial transcriptomics deconvolution and background noise removal. The broader taxonomy shows parallel efforts in bulk deconvolution (using likelihood-based or deep generative semi-profiling methods) and statistical count models (zero-inflated, temporal, Bayesian nonparametric). Count Bridges diverges from these by focusing on diffusion-style bridge processes for integer data rather than variational autoencoders, GANs, or classical statistical frameworks, positioning it at the intersection of modern generative modeling and discrete data structures.

Among thirty candidates examined, none clearly refute the three core contributions: the Count Bridges framework (ten candidates, zero refutable), the EM-based deconvolution approach (ten candidates, zero refutable), and the biological transcriptomics applications (ten candidates, zero refutable). This suggests that within the limited search scope, the combination of bridge processes on integers, EM-style aggregation handling, and transcriptomics deconvolution appears relatively novel. However, the search examined only top-K semantic matches and citations, not an exhaustive survey of diffusion models, discrete generative methods, or deconvolution literature. The sibling papers in the same leaf (three others) focus on variational and flow-based methods but do not appear to employ bridge processes or EM-based aggregation strategies.

Based on the limited literature search, the work appears to occupy a distinct methodological niche within single-cell generative modeling. The taxonomy structure and contribution-level statistics suggest novelty in the specific combination of techniques, though the analysis covers only a subset of potentially relevant prior work. A broader search across discrete diffusion models, integer-valued stochastic processes, and deconvolution methods outside the top-thirty semantic matches could reveal additional overlaps or precedents not captured here.

Taxonomy

Research Landscape Overview

Core task: Generative modeling and deconvolution of integer-valued biological count data.

This field addresses the challenge of modeling discrete count observations that arise throughout biology, from single-cell RNA sequencing to bulk tissue deconvolution. The taxonomy reveals five main branches: generative models for single-cell and spatial transcriptomics, which develop deep learning architectures tailored to the discrete and often zero-inflated nature of transcript counts; bulk transcriptome deconvolution methods that infer cell-type proportions from mixed tissue samples using single-cell references; statistical models for integer-valued count data, encompassing classical and modern approaches to discrete distributions; generative models for non-transcriptomic biological applications, such as flow cytometry and DNA binding data; and domain-specific applications that leverage count models in clinical or specialized biological contexts. Works like CellBender[11] exemplify efforts to remove technical noise from single-cell counts, while Bulk Deconvolution Methods[17] survey strategies for decomposing bulk signals into constituent cell types.

Within the single-cell generative modeling branch, a particularly active line of research explores variational and flow-based architectures that respect the integer nature of count data. Count Bridges[0] sits squarely in this cluster, employing diffusion-inspired techniques to generate realistic single-cell count profiles while preserving discrete structure. This contrasts with earlier variational autoencoders like Semisupervised Autoencoder[16] and more recent flow-based approaches such as cellFlow[10] and Flow Matching Bioinformatics[21], which navigate trade-offs between computational efficiency, biological fidelity, and the ability to handle zero inflation.

A key open question across these methods is how to balance expressive generative capacity with interpretability and scalability to large-scale spatial transcriptomics datasets, as seen in works like Dependency Aware Spatial[1]. Count Bridges[0] addresses this by leveraging bridge processes that naturally accommodate integer constraints, positioning it as a methodological advance within the variational and flow-based single-cell models cluster.

Claimed Contributions

Count Bridges: stochastic bridge process on integers for count data

10 retrieved papers

The authors propose Count Bridges, a new generative modeling framework using Poisson birth-death dynamics on integer-valued data. This approach provides closed-form conditionals that enable exact sampling and efficient training while preserving the ordinal structure of counts.
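As a rough illustration of what a closed-form bridge conditional on the integers can look like, the sketch below uses a deliberately simplified minimal-jump Poisson bridge. This simplification is an assumption made here for illustration and is not the authors' birth-death construction. It relies on a standard fact: for a Poisson process conditioned on its endpoints, the jump times are independent uniforms, so the number of unit jumps completed by time t is Binomial(|x1 - x0|, t). The function name `sample_bridge_state` is hypothetical.

```python
import random

def sample_bridge_state(x0, x1, t, rng=random):
    """Sample X_t given X_0 = x0 and X_1 = x1 for a minimal-jump Poisson bridge.

    Assumption (not from the paper): the path moves monotonically from x0 to
    x1 in |x1 - x0| unit jumps, each occurring at an independent Uniform(0, 1)
    time. The number of jumps completed by time t is then Binomial(|x1 - x0|, t).
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    n_jumps = abs(x1 - x0)
    step = 1 if x1 >= x0 else -1
    # Draw the binomial as a sum of Bernoulli(t) indicators, one per jump.
    completed = sum(rng.random() < t for _ in range(n_jumps))
    return x0 + step * completed
```

Because the intermediate state is sampled directly from the endpoints, no path simulation is needed, which is the practical appeal of closed-form conditionals. A full birth-death bridge would also allow extra birth and death pairs that cancel out; this sketch omits them.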

EM-based deconvolution framework for aggregated measurements

10 retrieved papers

The authors develop an extension of Count Bridges that enables training directly from aggregated observations by treating individual unit-level counts as latent variables within an EM algorithm framework, allowing systematic deconvolution of aggregated count data.
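To make the idea of treating unit-level counts as latent variables concrete, here is a minimal EM sketch for a toy Poisson model, not the paper's neural bridge model. The assumptions are ours: each group j contains a known number `n[j][k]` of units of type k, each type-k unit emits an independent Poisson(mu[k]) count, and only the group total `y[j]` is observed. The function name `em_poisson_rates` and all parameter names are hypothetical.

```python
def em_poisson_rates(y, n, iters=200):
    """Estimate per-unit Poisson rates mu[k] from aggregated counts.

    y[j]    : observed total count of group j
    n[j][k] : number of type-k units in group j (assumed known)
    """
    num_types = len(n[0])
    mu = [1.0] * num_types
    # Total number of units of each type across all groups.
    units_per_type = [sum(row[k] for row in n) for k in range(num_types)]
    for _ in range(iters):
        expected = [0.0] * num_types
        for y_j, row in zip(y, n):
            group_rate = sum(c * m for c, m in zip(row, mu))
            if group_rate == 0.0:
                continue
            for k in range(num_types):
                # E-step: given the total, the latent counts are multinomial,
                # so type k's expected share is y_j * (n_jk * mu_k) / group_rate.
                expected[k] += y_j * row[k] * mu[k] / group_rate
        # M-step: Poisson MLE is the expected count per unit of each type.
        mu = [e / u if u > 0 else 0.0 for e, u in zip(expected, units_per_type)]
    return mu
```

For example, with `y = [6, 10, 7]` and `n = [[3, 0], [0, 2], [1, 1]]`, the totals are exactly consistent with rates of 2 and 5 per unit, and the iteration converges to those values.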

Applications to biological transcriptomics deconvolution problems

10 retrieved papers

The authors demonstrate Count Bridges on two real-world biological applications: nucleotide-resolution modeling of single-cell gene expression for bulk RNA-seq deconvolution, and resolving spatial transcriptomic measurements into single-cell profiles.
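One elementary fact helps show what resolving an aggregate into single-cell profiles means in a Poisson-type model: if per-cell counts are independent Poisson variables, then conditional on their sum they are multinomial with probabilities proportional to the rates. The sketch below samples one such split. It is a toy illustration under that assumption and not the paper's procedure; `split_aggregate` and the `rates` input are hypothetical.

```python
import random
from collections import Counter

def split_aggregate(total, rates, rng=random):
    """Split an aggregate count among cells, given hypothetical per-cell rates.

    Each of the `total` molecules is assigned to a cell with probability
    proportional to that cell's rate (a multinomial draw).
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    # One categorical draw per molecule; k=0 yields an empty assignment.
    assigned = rng.choices(range(len(rates)), weights=rates, k=total)
    counts = Counter(assigned)
    return [counts.get(i, 0) for i in range(len(rates))]
```

Every sampled profile is integer-valued and sums exactly to the observed spot count, which a continuous relaxation would not guarantee.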

Core Task Comparisons

Comparisons with papers in the same taxonomy category

[10] cellFlow: a generative flow-based model for single-cell count data

A Palma, T Richter, H Zhang, A Dittadi (2024)

[21] Flow matching for generative modeling in bioinformatics and computational biology

A Morehead, L Atanackovic, A Hegde, Y Wang (n.d.)

[22] COUNT BRIDGES ENABLE MODELING AND DECONVOLVING TRANSCRIPTOMIC DATA

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Count Bridges: stochastic bridge process on integers for count data

[33] Poisson representation: a bridge between discrete and continuous models of stochastic gene regulatory networks

Cannot Refute

[34] Adversarial Schrödinger bridge matching

Cannot Refute

[35] CIR bridge for modeling of fish migration on sub-hourly scale

Cannot Refute

[36] Non-negative diffusion bridge of the McKean-Vlasov type: analysis of singular diffusion and application to fish migration

Cannot Refute

[37] Discrete Diffusion Schrödinger Bridge Matching for Graph Transformation

Cannot Refute

[38] Bridging the Discrete-Continuous Gap: Unified Multimodal Generation via Coupled Manifold Discrete Absorbing Diffusion

Cannot Refute

[39] Time-reversible bridges of data with machine learning

Cannot Refute

[40] Diffusion bridge with misspecification: theory construction and application to high-resolution fish count data

Cannot Refute

[41] Modelling breakage-fusion-bridge cycles as a stochastic paper folding process

Cannot Refute

[42] A Generalized Stochastic Process for Count Data

Cannot Refute

Contribution

EM-based deconvolution framework for aggregated measurements

[23] EMixed: probabilistic multi-omics cellular deconvolution of bulk omics data

Cannot Refute

[24] Parameter estimation for grouped data using EM and MCEM algorithms

Cannot Refute

[25] Understanding urban mobility patterns with a probabilistic tensor factorization framework

Cannot Refute

[26] An expectation-maximization algorithm for logistic regression based on individual-level predictors and aggregate-level response

Cannot Refute

[27] Maximum pseudo likelihood estimation in network tomography

Cannot Refute

[28] Optimization models for estimating transit network origin-destination flows with big transit data

Cannot Refute

[29] Multiple target counting and localization using variational Bayesian EM algorithm in wireless sensor networks

Cannot Refute

[30] A Robust Functional EM Algorithm for Incomplete Panel Count Data

Cannot Refute

[31] A Functional EM Algorithm for Panel Count Data with Missing Counts

Cannot Refute

[32] A disaggregate negative binomial regression procedure for count data analysis

Cannot Refute

Contribution

Applications to biological transcriptomics deconvolution problems

[43] Profiling cell identity and tissue architecture with single-cell and spatial transcriptomics

Cannot Refute

[44] Exploring tissue architecture using spatial transcriptomics

Cannot Refute

[45] Cell-type deconvolution methods for spatial transcriptomics

Cannot Refute

[46] Cell type and gene expression deconvolution with BayesPrism enables Bayesian integrative analysis across bulk and single-cell RNA sequencing in oncology

Cannot Refute

[47] SpatialDWLS: accurate deconvolution of spatial transcriptomic data

Cannot Refute

[48] Spatially Informed Cell Type Deconvolution for Spatial Transcriptomics

Cannot Refute

[49] Deciphering the tumor immune microenvironment: single-cell and spatial transcriptomic insights into cervical cancer fibroblasts

Cannot Refute

[50] Precise gene expression deconvolution in spatial transcriptomics with STged

Cannot Refute

[51] Spatially informed clustering, integration, and deconvolution of spatial transcriptomics with GraphST

Cannot Refute

[52] Sparse deconvolution of cell type medleys in spatial transcriptomics

Cannot Refute
