Count Bridges enable Modeling and Deconvolving Transcriptomics

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: ordinal data, diffusion, Schrödinger bridge, flow matching, single-cell genomics, spatial transcriptomics
Abstract:

Many modern biological assays, including RNA sequencing, yield integer-valued counts that reflect the number of RNA molecules detected. These measurements are often not at the desired resolution: while the unit of interest is typically a single cell, many RNA sequencing and imaging technologies produce counts aggregated over sets of cells. Although recent generative frameworks such as diffusion and flow matching have been extended to non-Euclidean and discrete settings, it remains unclear how best to model integer-valued data or how to systematically deconvolve aggregated observations. We introduce Count Bridges, a stochastic bridge process on the integers that provides an exact, tractable analogue of diffusion-style models for count data, with closed-form conditionals for efficient training and sampling. We extend this framework to enable direct training from aggregated measurements via an Expectation-Maximization-style approach that treats unit-level counts as latent variables. We demonstrate state-of-the-art performance on integer distribution matching benchmarks, comparing against flow matching and discrete flow matching baselines across various metrics. We then apply Count Bridges to two large-scale problems in biology: modeling single-cell gene expression data at nucleotide resolution, with applications to deconvolving bulk RNA-seq, and resolving multicellular spatial transcriptomic spots into single-cell count profiles. Our methods offer a principled foundation for generative modeling and deconvolution of biological count data across scales and modalities.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces Count Bridges, a stochastic bridge process on integers designed for generative modeling of count data, with applications to RNA sequencing. It resides in the 'Variational and Flow-Based Single-Cell Models' leaf, which contains four papers total (including this one). This leaf sits within the broader 'Deep Generative Models for Single-Cell Count Data' branch, indicating a moderately active research direction focused on deep learning architectures for discrete transcriptomic data. The taxonomy shows this is a specialized but not overcrowded area, with sibling leaves addressing GAN-based augmentation and semi-supervised approaches.

The taxonomy reveals neighboring research directions that contextualize this work. The parent branch includes GAN-based single-cell augmentation and semi-supervised models, while sibling branches address spatial transcriptomics deconvolution and background noise removal. The broader taxonomy shows parallel efforts in bulk deconvolution (using likelihood-based or deep generative semi-profiling methods) and statistical count models (zero-inflated, temporal, Bayesian nonparametric). Count Bridges diverges from these by focusing on diffusion-style bridge processes for integer data rather than variational autoencoders, GANs, or classical statistical frameworks, positioning it at the intersection of modern generative modeling and discrete data structures.

Among thirty candidates examined, none clearly refute the three core contributions: the Count Bridges framework (ten candidates, zero refutable), the EM-based deconvolution approach (ten candidates, zero refutable), and the biological transcriptomics applications (ten candidates, zero refutable). This suggests that within the limited search scope, the combination of bridge processes on integers, EM-style aggregation handling, and transcriptomics deconvolution appears relatively novel. However, the search examined only top-K semantic matches and citations, not an exhaustive survey of diffusion models, discrete generative methods, or deconvolution literature. The sibling papers in the same leaf (three others) focus on variational and flow-based methods but do not appear to employ bridge processes or EM-based aggregation strategies.

Based on the limited literature search, the work appears to occupy a distinct methodological niche within single-cell generative modeling. The taxonomy structure and contribution-level statistics suggest novelty in the specific combination of techniques, though the analysis covers only a subset of potentially relevant prior work. A broader search across discrete diffusion models, integer-valued stochastic processes, and deconvolution methods outside the top-thirty semantic matches could reveal additional overlaps or precedents not captured here.

Taxonomy

Core-task Taxonomy Papers: 22
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: Generative modeling and deconvolution of integer-valued biological count data. This field addresses the challenge of modeling discrete count observations that arise throughout biology, from single-cell RNA sequencing to bulk tissue deconvolution. The taxonomy reveals five main branches: generative models for single-cell and spatial transcriptomics, which develop deep learning architectures tailored to the discrete and often zero-inflated nature of transcript counts; bulk transcriptome deconvolution methods that infer cell-type proportions from mixed tissue samples using single-cell references; statistical models for integer-valued count data, encompassing classical and modern approaches to discrete distributions; generative models for non-transcriptomic biological applications, such as flow cytometry and DNA binding data; and domain-specific applications that leverage count models in clinical or specialized biological contexts. Works like CellBender[11] exemplify efforts to remove technical noise from single-cell counts, while Bulk Deconvolution Methods[17] survey strategies for decomposing bulk signals into constituent cell types.

Within the single-cell generative modeling branch, a particularly active line of research explores variational and flow-based architectures that respect the integer nature of count data. Count Bridges[0] sits squarely in this cluster, employing diffusion-inspired techniques to generate realistic single-cell count profiles while preserving discrete structure. This contrasts with earlier variational autoencoders like Semisupervised Autoencoder[16] and more recent flow-based approaches such as cellFlow[10] and Flow Matching Bioinformatics[21], which navigate trade-offs between computational efficiency, biological fidelity, and the ability to handle zero inflation.

A key open question across these methods is how to balance expressive generative capacity with interpretability and scalability to large-scale spatial transcriptomics datasets, as seen in works like Dependency Aware Spatial[1]. Count Bridges[0] addresses this by leveraging bridge processes that naturally accommodate integer constraints, positioning it as a methodological advance within the variational and flow-based single-cell models cluster.

Claimed Contributions

Count Bridges: stochastic bridge process on integers for count data

The authors propose Count Bridges, a new generative modeling framework using Poisson birth-death dynamics on integer-valued data. This approach provides closed-form conditionals that enable exact sampling and efficient training while preserving the ordinal structure of counts.

10 retrieved papers
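
To make "closed-form conditionals" concrete, the following is a minimal sketch of one textbook birth-death bridge on the integers. It is not taken from the paper. It assumes births and deaths are independent Poisson processes with a common rate `lam` on [0, 1], so that the net change is Skellam-distributed. Conditioned on the endpoints, the number of paired birth/death events follows a Bessel-type law, and the events that have occurred by time `t` are binomially thinned. The names `sample_slack`, `sample_bridge`, `lam`, and `m_max` are hypothetical, and the sketch does not enforce non-negativity of intermediate states.

```python
import math
import numpy as np

def sample_slack(d, lam, rng, m_max=200):
    """Sample M = min(births, deaths) at t=1 given births - deaths = d.

    For independent Poisson(lam) births and deaths,
    P(M = m | difference = d) is proportional to
    lam**(2m + |d|) / (m! * (m + |d|)!). The support is truncated
    at m_max (an assumption that is adequate for moderate lam).
    """
    a = abs(int(d))
    m = np.arange(m_max + 1)
    log_w = np.array([
        (2 * k + a) * math.log(lam) - math.lgamma(k + 1) - math.lgamma(k + a + 1)
        for k in m
    ])
    w = np.exp(log_w - log_w.max())
    return int(rng.choice(m, p=w / w.sum()))

def sample_bridge(x0, x1, t, lam, rng):
    """Sample X_t given X_0 = x0 and X_1 = x1 for 0 <= t <= 1."""
    d = int(x1) - int(x0)
    m = sample_slack(d, lam, rng)
    births = m + max(d, 0)   # total births on [0, 1]
    deaths = m + max(-d, 0)  # total deaths on [0, 1]
    # Given the totals, each event time is uniform on [0, 1], so the
    # number that happened by time t is Binomial(total, t).
    b_t = rng.binomial(births, t)
    d_t = rng.binomial(deaths, t)
    return int(x0) + int(b_t) - int(d_t)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print([sample_bridge(3, 12, 0.5, 1.0, rng) for _ in range(5)])
```

Because every draw is an integer and the endpoints are matched exactly at `t = 0` and `t = 1`, a sketch like this illustrates why no rounding or dequantization step is needed in a count-valued bridge.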
EM-based deconvolution framework for aggregated measurements

The authors develop an extension of Count Bridges that enables training directly from aggregated observations by treating individual unit-level counts as latent variables within an EM algorithm framework, allowing systematic deconvolution of aggregated count data.

10 retrieved papers
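
The latent-variable idea can be illustrated with a toy stand-in that is much simpler than the paper's model. The sketch below assumes each aggregated observation `y[s]` is the sum of Poisson counts from cells of `K` types, with a known number of cells `n[s, k]` of each type in spot `s` and an unknown per-cell rate `mu[k]`. These assumptions, and the name `em_poisson_deconvolve`, are illustrative only. The E-step uses the standard fact that independent Poisson counts conditioned on their sum are multinomial; the M-step re-estimates each rate from the expected allocations.

```python
import numpy as np

def em_poisson_deconvolve(y, n, n_iter=300):
    """Toy EM for per-type Poisson rates from aggregated counts.

    y : (S,) aggregated counts per spot
    n : (S, K) known number of cells of each type per spot (assumed given;
        every type is assumed to appear in at least one spot)
    Returns mu : (K,) estimated per-cell rate for each type.
    """
    y = np.asarray(y, dtype=float)
    n = np.asarray(n, dtype=float)
    # Start from a single pooled rate shared by all types.
    mu = np.full(n.shape[1], max(y.sum() / n.sum(), 1e-8))
    for _ in range(n_iter):
        # E-step: split each spot's total across types in proportion to
        # the expected contribution n[s, k] * mu[k] (multinomial mean).
        contrib = n * mu
        total = np.maximum(contrib.sum(axis=1, keepdims=True), 1e-12)
        expected = y[:, None] * contrib / total
        # M-step: Poisson MLE given the expected latent counts.
        mu = expected.sum(axis=0) / n.sum(axis=0)
    return mu

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = rng.integers(1, 6, size=(1000, 2))
    y = rng.poisson(n @ np.array([2.0, 10.0]))
    print(em_poisson_deconvolve(y, n))
```

In the framework described above, the unit-level counts play the role of `expected` here, but they would be handled by a learned generative model rather than a closed-form multinomial split.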
Applications to biological transcriptomics deconvolution problems

The authors demonstrate Count Bridges on two real-world biological applications: nucleotide-resolution modeling of single-cell gene expression for bulk RNA-seq deconvolution, and resolving spatial transcriptomic measurements into single-cell profiles.

10 retrieved papers
