Abstract:

We introduce PairFlow, a lightweight preprocessing step for training Discrete Flow Models (DFMs) to achieve few-step sampling without requiring a pretrained teacher. DFMs have recently emerged as a new class of generative models for discrete data, offering strong performance; however, they suffer from slow sampling due to their iterative nature. Existing acceleration methods largely depend on finetuning, which introduces substantial additional training overhead. PairFlow addresses this issue with a lightweight preprocessing step. Inspired by ReFlow and its extension to DFMs, we train DFMs from coupled samples of the source and target distributions, without requiring any pretrained teacher. At the core of our approach is a closed-form inversion for DFMs, which allows efficient construction of paired source–target samples. Despite its extremely low cost, requiring at most 1.7% of the compute needed for full model training, PairFlow matches or even surpasses the performance of two-stage training involving finetuning. Furthermore, models trained with our framework provide stronger base models for subsequent distillation, yielding further acceleration after finetuning. Experiments on molecular data as well as binary and RGB images demonstrate the broad applicability and effectiveness of our approach.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces PairFlow, a lightweight preprocessing method that trains discrete flow models from coupled source-target samples to enable few-step generation without a pretrained teacher. It resides in the 'Closed-Form and Lightweight Coupling Methods' leaf, which contains only two papers total: PairFlow itself and ReDi. This leaf sits within the broader 'Acceleration via Source-Target Coupling Strategies' branch, which also includes optimal transport-based and model-aligned coupling approaches. The small number of sibling papers suggests this is a relatively sparse research direction focused specifically on computationally inexpensive coupling strategies.

The taxonomy reveals that PairFlow's immediate neighbors include optimal transport methods that minimize geometric distances and model-aligned techniques that optimize for learning objectives. These sibling branches contain single papers each, indicating that acceleration via coupling is an emerging area with multiple competing paradigms. The broader taxonomy also shows domain-specific applications (graphs, language, biology) and foundational discrete flow frameworks, but PairFlow's position in the acceleration branch distinguishes it from pure formulation work. The scope notes clarify that this leaf excludes both geometric optimal transport and model-aligned methods, focusing narrowly on closed-form inversions and lightweight preprocessing.

Among the three contributions analyzed, the core PairFlow preprocessing approach examined ten candidates and found one potentially refuting prior work, suggesting some overlap with existing lightweight coupling ideas. The closed-form inversion contribution examined three candidates with no clear refutations, indicating this technical component may be more novel. The backward velocity field for pair discovery examined ten candidates without refutation, also appearing relatively fresh. Given the limited search scope of twenty-three total candidates examined across all contributions, these statistics suggest moderate novelty with some prior work in the preprocessing domain but less overlap in the specific technical mechanisms.

Based on the top-23 semantic matches examined, PairFlow appears to occupy a sparsely populated niche within discrete flow acceleration. The single sibling paper and limited refutations across most contributions suggest the work introduces distinct technical ideas, though the preprocessing concept itself has some precedent. The analysis does not cover exhaustive literature review or broader distillation methods, so the assessment reflects novelty within the examined coupling-focused subset of the field.

Taxonomy

Core-task Taxonomy Papers: 20
Claimed Contributions: 3
Contribution Candidate Papers Compared: 23
Refutable Paper: 1

Research Landscape Overview

Core task: Accelerating discrete flow models through source-target coupling.

The field encompasses diverse approaches to modeling and accelerating flows over discrete state spaces, with the taxonomy revealing several major branches. Core frameworks establish foundational formulations for discrete flow models, including methods like Discrete Flow Matching[1] and Integer Discrete Flows[5] that define how probability mass evolves. Acceleration strategies focus on coupling techniques that link source and target distributions more efficiently, ranging from closed-form lightweight methods to model-aligned approaches such as Model-Aligned Coupling[3]. Domain-specific applications adapt these ideas to particular settings, while other branches address coupled physical systems like Pebble Bed Reactor[4] simulations, data-driven network modeling including EV Charging Fusion[10], and specialized biomedical or geophysical domains such as Tumor Angiogenesis Model[17] and Geophysics for Archaeology[16]. This structure reflects both methodological innovation in coupling strategies and the breadth of practical contexts where discrete flows arise.

Recent work has concentrated on making discrete flow generation faster and more sample-efficient, with several contrasting lines emerging. Lightweight coupling methods seek closed-form or computationally inexpensive ways to bridge source and target, as exemplified by ReDi[2] and PairFlow[0], which aim to reduce the number of sampling steps without heavy optimization overhead. In contrast, approaches like Sinkhorn Couplings[6] and model-aligned techniques invest more computation upfront to obtain tighter couplings that can yield better sample quality. PairFlow[0] sits squarely within the closed-form and lightweight coupling branch, sharing ReDi[2]'s emphasis on efficiency but differing in how it constructs the pairing between distributions. Compared to Model-Aligned Coupling[3], PairFlow[0] trades off some alignment precision for speed, reflecting an ongoing tension in the field between computational cost and the fidelity of the learned coupling.

Claimed Contributions

PairFlow: Lightweight preprocessing for few-step discrete flow generation

The authors propose PairFlow, a training framework that enables few-step sampling in discrete flow models by constructing paired source-target samples during a lightweight preprocessing phase. This approach eliminates the need for pretrained teacher models and achieves acceleration without finetuning, requiring only up to 1.7% of the compute needed for full model training.

10 retrieved papers (one potentially refuting)
Closed-form inversion for discrete flow models

The authors derive closed-form expressions for both forward and backward velocity fields in discrete flow models. These closed-form velocities, determined by Hamming distance, enable efficient simulation of probability paths and construction of source-target pairs without requiring iterative sampling from a pretrained model.
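The report does not reproduce the paper's formulas, but the standard factorized mixture path from Discrete Flow Matching gives a feel for what a closed-form forward velocity enables. The sketch below is an illustrative toy (hypothetical vocabulary size, linear schedule, assumed per-token independence), not the paper's implementation: it Euler-simulates the per-token jump process whose rate transports a source sequence onto a target sequence.

```python
import numpy as np

rng = np.random.default_rng(0)
V, seq_len = 8, 32  # toy vocabulary size and sequence length

x0 = rng.integers(0, V, seq_len)  # source sample
x1 = rng.integers(0, V, seq_len)  # target sample

def simulate_forward(x0, x1, steps, rng):
    """Euler-simulate the jump process of the factorized mixture path
    p_t(x^i) = (1 - t) * delta(x^i, x0^i) + t * delta(x^i, x1^i):
    a token not yet at its target value jumps to it with rate
    1 / (1 - t) under the linear schedule."""
    x = x0.copy()
    for k in range(steps):
        # Euler jump probability h / (1 - t) simplifies to 1 / (steps - k),
        # which reaches 1 on the final step and completes the transport
        p = 1.0 / (steps - k)
        jump = (x != x1) & (rng.random(len(x)) < p)
        x[jump] = x1[jump]
    return x

x_end = simulate_forward(x0, x1, steps=100, rng=rng)
print("Hamming(x_end, x1) =", int((x_end != x1).sum()))  # → 0
```

Because the path factorizes over tokens, the whole simulation is a handful of vectorized operations per step, which is the sense in which such closed-form velocities make pair construction cheap.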

3 retrieved papers
Backward velocity field for efficient pair discovery

The authors introduce a closed-form backward velocity field that inverts data samples toward the source distribution. This backward simulation guarantees coverage of all data points and produces source-target pairs with lower Hamming distances, promoting straighter probability paths during training.
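The report gives no detail on the backward simulation itself, so the following is only a toy stand-in (explicitly not the paper's method): it mimics the claimed effect by pairing each data sample with its Hamming-nearest source sample and comparing against the independent coupling. All sizes and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
V, seq_len, N = 4, 16, 64  # toy vocabulary, sequence length, set size

sources = rng.integers(0, V, (N, seq_len))
data    = rng.integers(0, V, (N, seq_len))

def hamming(a, b):
    """Per-row Hamming distance between equal-shape token arrays."""
    return (a != b).sum(axis=-1)

# Independent coupling: pair data sample i with source sample i
indep = hamming(sources, data).mean()

# Toy stand-in for a Hamming-aware coupling: pair each data point with
# its Hamming-nearest source (the paper instead inverts data samples
# through a closed-form backward velocity field)
dists = (data[:, None, :] != sources[None, :, :]).sum(-1)  # N x N
nearest = hamming(sources[dists.argmin(axis=1)], data).mean()

print(f"independent: {indep:.2f}, nearest: {nearest:.2f}")
```

Lower average Hamming distance between paired endpoints is what "straighter probability paths" means here: fewer tokens need to change along each path, so fewer sampling steps suffice.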

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

PairFlow: Lightweight preprocessing for few-step discrete flow generation

The authors propose PairFlow, a training framework that enables few-step sampling in discrete flow models by constructing paired source-target samples during a lightweight preprocessing phase. This approach eliminates the need for pretrained teacher models and achieves acceleration without finetuning, requiring only up to 1.7% of the compute needed for full model training.

Contribution

Closed-form inversion for discrete flow models

The authors derive closed-form expressions for both forward and backward velocity fields in discrete flow models. These closed-form velocities, determined by Hamming distance, enable efficient simulation of probability paths and construction of source-target pairs without requiring iterative sampling from a pretrained model.

Contribution

Backward velocity field for efficient pair discovery

The authors introduce a closed-form backward velocity field that inverts data samples toward the source distribution. This backward simulation guarantees coverage of all data points and produces source-target pairs with lower Hamming distances, promoting straighter probability paths during training.