CMT: Mid-Training for Efficient Learning of Consistency, Mean Flow, and Flow-Map Models
Overview
Claimed Contributions
The authors propose CMT, a novel three-stage training pipeline that adds a compact mid-training phase between diffusion pre-training and flow map post-training. This stage trains a model to map points along solver trajectories directly to clean samples, providing a trajectory-consistent initialization that improves stability and convergence without requiring heuristics like stop-gradients or custom time weighting.
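The mid-training idea above can be sketched in a few lines. This is a toy illustration, not the authors' code: the straight-line (rectified-flow-style) path stands in for the pretrained solver's trajectory, and the function names and shapes are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def solver_trajectory(x0, noise, ts):
    """Stand-in for a pretrained diffusion solver's deterministic
    trajectory: a straight-line path from the clean sample x0 (t=0)
    to its paired noise endpoint (t=1)."""
    return [(1.0 - t) * x0 + t * noise for t in ts]

def cmt_midtrain_loss(f, x0, noise, ts):
    """CMT-style mid-training objective: every point x_t along the
    solver trajectory should map directly to the clean sample x0."""
    points = solver_trajectory(x0, noise, ts)
    return float(np.mean([(f(x_t, t) - x0) ** 2 for x_t, t in zip(points, ts)]))

x0 = rng.normal(size=4)      # a "clean" data sample
noise = rng.normal(size=4)   # its paired noise endpoint
ts = [0.25, 0.5, 0.75]       # intermediate solver times

# The identity map leaves trajectory points where they are -> nonzero loss;
# an oracle that always returns the clean endpoint achieves zero loss.
loss_identity = cmt_midtrain_loss(lambda x_t, t: x_t, x0, noise, ts)
loss_oracle = cmt_midtrain_loss(lambda x_t, t: x0, x0, noise, ts)
print(loss_identity, loss_oracle)
```

In practice the lambdas would be a neural network initialized from the diffusion pre-training stage, trained by gradient descent on this loss; the sketch only shows what the regression target is.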
The authors introduce a unified view connecting existing flow map formulations (Consistency Models and Mean Flow) through a reverse-time generative perspective. This reinterpretation clarifies the oracle objectives and motivates the design of CMT's training losses for both the special (Ψ_{t→0}) and general (Ψ_{t→s}) flow maps.
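The relationship between the general map Ψ_{t→s} and the special map Ψ_{t→0} can be made concrete on a toy ODE where the flow map is known in closed form (our example, not the paper's):

```python
import numpy as np

def Psi(x, t, s):
    """Exact general flow map Psi_{t->s} for the toy linear ODE
    dx/dt = -x: integrating from time t to time s gives
    x(s) = x(t) * exp(t - s)."""
    return x * np.exp(t - s)

x_t = 1.7
t, u, s = 0.9, 0.5, 0.0

# Self-consistency (semigroup) property that flow-map objectives enforce:
# jumping t -> s directly must agree with stopping over at any u in between.
direct = Psi(x_t, t, s)
composed = Psi(Psi(x_t, t, u), u, s)
assert np.isclose(direct, composed)

# The consistency-model case is the special map Psi_{t->0}: fix s = 0.
one_step_sample = Psi(x_t, t, 0.0)
```

A learned flow map has no closed form, so training losses penalize violations of exactly this composition property along model or solver trajectories.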
The authors provide theoretical analysis demonstrating that CMT initialization yields gradient bias of O(ε + Δt²), significantly lower than that of diffusion-based initialization, which incurs additional bias terms from forward noising and posterior-mean mismatch. This formalizes why CMT provides a more robust starting point for flow map training.
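In symbols, the claimed comparison can be sketched as follows (the notation is ours, not the paper's: ĝ denotes the flow-map training gradient under each initialization, g* the oracle gradient, ε the pretrained model's approximation error, Δt the discretization step, and the B terms are placeholders for the extra bias sources named above):

```latex
\underbrace{\bigl\|\mathbb{E}[\hat g_{\mathrm{CMT}}] - g^{\star}\bigr\|}_{\text{CMT init}}
  = O\!\left(\varepsilon + \Delta t^{2}\right),
\qquad
\underbrace{\bigl\|\mathbb{E}[\hat g_{\mathrm{diff}}] - g^{\star}\bigr\|}_{\text{diffusion init}}
  = O\!\left(\varepsilon + \Delta t^{2}\right)
  + \underbrace{B_{\mathrm{noise}}}_{\text{forward noising}}
  + \underbrace{B_{\mathrm{post}}}_{\text{posterior-mean mismatch}}
```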
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[35] Pre-Training and Fine-Tuning Generative Flow Networks
Contribution Analysis
Detailed comparisons for each claimed contribution
Consistency Mid-Training (CMT) framework
The authors propose CMT, a novel three-stage training pipeline that adds a compact mid-training phase between diffusion pre-training and flow map post-training. This stage trains a model to map points along solver trajectories directly to clean samples, providing a trajectory-consistent initialization that improves stability and convergence without requiring heuristics like stop-gradients or custom time weighting.
[51] Adversarial Diffusion Distillation
[52] Score Identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation
[53] Di[M]O: Distilling Masked Diffusion Models into One-Step Generator
[54] Knowledge Diffusion for Distillation
[55] Distilling Diffusion Models into Conditional GANs
[56] Composition and Control with Distilled Energy Diffusion Models and Sequential Monte Carlo
[57] Continual Learning of Diffusion Models with Generative Distillation
[58] Diff-Instruct: A Universal Approach for Transferring Knowledge from Pre-trained Diffusion Models
[59] One-Step Diffusion Distillation via Deep Equilibrium Models
[60] DreamTeacher: Pretraining Image Backbones with Deep Generative Models
Unified formulation of flow map objectives
The authors introduce a unified view connecting existing flow map formulations (Consistency Models and Mean Flow) through a reverse-time generative perspective. This reinterpretation clarifies the oracle objectives and motivates the design of CMT's training losses for both the special (Ψ_{t→0}) and general (Ψ_{t→s}) flow maps.
[9] Flow-Anchored Consistency Models
[18] Modular MeanFlow: Towards Stable and Scalable One-Step Generative Modeling
[20] Flow Map Matching
[61] FlowPolicy: Enabling Fast and Robust 3D Flow-Based Policy via Consistency Flow Matching for Robot Manipulation
[62] High-Order Flow Matching: Unified Framework and Sharp Statistical Rates
[64] Efficient Image Restoration via Latent Consistency Flow Matching
[65] Towards a Unified Framework for Consistency Generative Modeling
[66] Inverse Flow and Consistency Models
[67] UniConFlow: A Unified Constrained Generalization Framework for Certified Motion Planning with Flow Matching Models
Theoretical analysis of gradient bias reduction
The authors provide theoretical analysis demonstrating that CMT initialization yields gradient bias of O(ε + Δt²), significantly lower than that of diffusion-based initialization, which incurs additional bias terms from forward noising and posterior-mean mismatch. This formalizes why CMT provides a more robust starting point for flow map training.