VADv2: End-to-End Autonomous Driving via Probabilistic Planning

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Autonomous Driving, End-to-end, Probabilistic Planning, Closed-Loop, Planning Vocabulary
Abstract:

Learning a human-like driving policy from large-scale driving demonstrations is promising, but the uncertainty and non-deterministic nature of planning make it challenging. Existing learning-based planning methods follow a deterministic paradigm to directly regress the action, failing to cope with the uncertainty problem. In this work, we propose a probabilistic planning model for end-to-end autonomous driving, termed VADv2. We resort to a probabilistic field function to model the mapping from the action space to the probabilistic distribution. Since the planning action space is a high-dimensional continuous spatiotemporal space and hard to tackle, we first discretize the planning action space to a large planning vocabulary and then tokenize the planning vocabulary into planning tokens. Planning tokens interact with scene tokens and output the probabilistic distribution of action. Mass driving demonstrations are leveraged to supervise the distribution. VADv2 achieves state-of-the-art closed-loop performance on the CARLA Town05 benchmark, significantly outperforming all existing methods. We also provide comprehensive evaluations on the NAVSIM dataset and a large-scale 3DGS-based benchmark, demonstrating its effectiveness in real-world applications. Code will be released to facilitate future research.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes VADv2, an end-to-end autonomous driving system that models planning as a probabilistic field function over discretized action tokens. It resides in the 'Probabilistic Action Distribution Models' leaf under 'End-to-End Learning Architectures', which contains only two papers total. This sparse population suggests the specific approach of tokenizing continuous spatiotemporal action spaces into discrete vocabularies for probabilistic planning remains relatively unexplored. The taxonomy reveals this is one focused direction within a broader landscape of end-to-end methods, contrasting with neighboring leaves like 'Generative Planning Models' and 'Multimodal Foundation Model Integration' that pursue different architectural strategies.

The taxonomy structure shows VADv2 sits adjacent to several related but distinct research directions. Neighboring leaves include 'Generative Planning Models' using diffusion or GANs for trajectory distributions, 'Deterministic End-to-End Models' that regress actions without probabilistic modeling, and 'Uncertainty-Aware Representation Learning' focusing on world models. The broader 'Modular Planning Frameworks' branch offers an alternative philosophy with explicit perception-prediction-planning separation. VADv2's positioning emphasizes learning probabilistic distributions directly from demonstrations within a unified architecture, diverging from both deterministic regression approaches and modular systems that maintain structured intermediate representations for interpretability.

Among the thirty candidates examined, the 'VADv2 end-to-end driving model with action space tokenization' contribution shows the most substantial prior work overlap, with three refutable candidates identified from ten examined. The other two contributions—the probabilistic planning paradigm and benchmark performance claims—found no clear refutations among their respective ten-candidate searches. This pattern suggests the core architectural innovation of action space tokenization has more direct precedents in the limited search scope, while the broader framing as probabilistic planning and the empirical results appear less directly challenged. The analysis explicitly covers top-K semantic matches and citation expansion, not an exhaustive literature review.

Based on the limited thirty-candidate search, VADv2 appears to occupy a sparsely populated research direction with one sibling paper in its taxonomy leaf. The tokenization approach shows measurable prior work within the examined scope, while the probabilistic planning framing and performance claims lack clear refutations among candidates reviewed. The taxonomy context reveals this work contributes to ongoing debates between monolithic learned models and hybrid systems, though the search scope cannot definitively assess novelty across all related end-to-end or modular planning literature.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 3

Research Landscape Overview

Core task: end-to-end autonomous driving via probabilistic planning. The field encompasses diverse strategies for handling uncertainty in autonomous vehicle decision-making, organized into several major branches. End-to-End Learning Architectures leverage neural networks to map sensor inputs directly to control outputs, often incorporating probabilistic action distributions to capture multimodal driving behaviors. Modular Planning Frameworks decompose the problem into perception, prediction, and planning stages, enabling explicit reasoning about uncertainty at each step. Optimization-Based Planning Under Uncertainty employs techniques like model predictive control with chance constraints, while Sampling-Based Motion Planning uses methods such as probabilistic roadmaps and RRT variants to explore feasible trajectories. Reinforcement Learning Approaches frame driving as sequential decision-making under uncertainty, and Game-Theoretic and Competitive Planning models interactions with other agents. Specialized Application Domains address contexts like racing or intersections, and Survey and Review Literature synthesizes progress across these areas.

Recent work reveals contrasting philosophies in balancing model complexity, interpretability, and real-time performance. End-to-end methods like VADv2[0] and VADv2 Vectorized[1] emphasize learning probabilistic action distributions directly from data, capturing multimodal futures without explicit intermediate representations. This contrasts with modular approaches such as Interaction-Aware Probabilistic[3] or Bridging Past Future[2], which maintain structured prediction and planning stages to ensure interpretability and safety guarantees. Meanwhile, optimization-based methods balance computational efficiency with uncertainty quantification, and reinforcement learning works like Coupled RL Risk[5] explore risk-sensitive policies.

VADv2[0] sits within the Probabilistic Action Distribution Models cluster of end-to-end architectures, sharing with VADv2 Vectorized[1] an emphasis on learning compact representations of action uncertainty, yet differing in how scene context is encoded and how multimodal outputs are generated. This positioning reflects ongoing debates about whether probabilistic planning is best achieved through monolithic learned models or through hybrid systems that preserve explicit reasoning about uncertainty.

Claimed Contributions

Probabilistic planning paradigm for end-to-end autonomous driving

The authors introduce a probabilistic planning approach that models the planning policy as a scene-conditioned nonstationary stochastic process p(a|o), using a probabilistic field function to map actions to probability distributions. This addresses the uncertainty inherent in planning by learning from large-scale driving demonstrations, unlike deterministic methods that directly regress actions.
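To make the contrast with deterministic regression concrete, the following is a minimal sketch of a policy p(a|o) over a small discrete set of candidate actions. It assumes the model emits one unnormalized score per candidate; the softmax normalization, the negative log-likelihood loss, and the three example actions are illustrative assumptions, not details confirmed by this report.

```python
import math
import random

def softmax(logits):
    """Turn unnormalized scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_action(probs, rng):
    """Draw one action index from the categorical distribution p(a|o)."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def nll_loss(probs, demo_index):
    """Negative log-likelihood of the demonstrated action (assumed loss)."""
    return -math.log(probs[demo_index])

# Hypothetical scores for three candidate actions in one scene,
# e.g. "keep lane", "nudge left", "nudge right".
logits = [2.0, 1.9, -3.0]
probs = softmax(logits)

rng = random.Random(0)
action = sample_action(probs, rng)    # stochastic choice at inference
loss = nll_loss(probs, demo_index=0)  # supervision from one demonstration
```

Two nearly equal scores keep both actions plausible, whereas a regressor trained on the same demonstrations would tend to average them into a single output.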

10 retrieved papers

VADv2 end-to-end driving model with action space tokenization

The authors present VADv2, which discretizes the high-dimensional continuous planning action space into a planning vocabulary, tokenizes both sensor data and planning actions, uses Transformer-based interaction between planning tokens and scene tokens, and samples actions from the learned probability distribution for vehicle control.
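The discretization step can be sketched as follows. This is only an illustration under stated assumptions: the vocabulary is built by farthest point sampling over demonstration trajectories, and a continuous trajectory is mapped to its nearest vocabulary entry. The function names, the distance metric, and the toy demonstrations are hypothetical and are not taken from the paper.

```python
import math

def traj_distance(a, b):
    """Sum of Euclidean distances between matching waypoints."""
    return sum(math.dist(p, q) for p, q in zip(a, b))

def build_vocabulary(trajectories, size):
    """Pick `size` mutually distant trajectories by farthest point sampling."""
    vocab = [trajectories[0]]
    # Distance from each trajectory to its closest vocabulary entry so far.
    nearest = [traj_distance(t, vocab[0]) for t in trajectories]
    while len(vocab) < min(size, len(trajectories)):
        idx = max(range(len(trajectories)), key=nearest.__getitem__)
        vocab.append(trajectories[idx])
        nearest = [min(d, traj_distance(t, trajectories[idx]))
                   for d, t in zip(nearest, trajectories)]
    return vocab

def tokenize(trajectory, vocab):
    """Return the index of the vocabulary entry closest to `trajectory`."""
    return min(range(len(vocab)),
               key=lambda i: traj_distance(trajectory, vocab[i]))

# Toy demonstrations: each trajectory is a list of (x, y) waypoints.
demos = [
    [(0.0, 1.0), (0.0, 2.0)],    # straight
    [(0.0, 1.1), (0.0, 2.1)],    # straight, slightly faster
    [(0.5, 1.0), (1.5, 1.5)],    # right turn
    [(-0.5, 1.0), (-1.5, 1.5)],  # left turn
]
vocab = build_vocabulary(demos, size=3)
token = tokenize([(0.4, 1.0), (1.4, 1.6)], vocab)
```

In this toy setup the near-duplicate straight trajectory is left out of the vocabulary, and the query trajectory maps to the right-turn entry. A learned model would then assign a probability to every vocabulary index instead of regressing waypoints directly.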

10 retrieved papers (Can Refute)

State-of-the-art planning performance across multiple benchmarks

The authors demonstrate that VADv2 achieves state-of-the-art results on CARLA Town05, NAVSIM, and a 3DGS-based benchmark in both closed-loop and open-loop evaluation settings, with validation through extensive simulations and real-world deployment showing effectiveness and stability.

10 retrieved papers
