AirV2X: Unified Air-Ground Vehicle-to-Everything Collaboration

ICLR 2026 Conference Withdrawn Submission
Authors: Xiangbo Gao, Yuheng Wu, Fengze Yang, Xuewen Luo, Keshu Wu, Xinghao Chen, Yuping Wang, Chenxi Liu, Yang Zhou, Zhengzhong Tu
Keywords: Autonomous Driving, Collaborative Perception, Low-Altitude Economy
Abstract:

While multi-vehicular collaborative driving demonstrates clear advantages over single-vehicle autonomy, traditional infrastructure-based V2X systems remain constrained by substantial deployment costs and the creation of "uncovered danger zones" in rural and suburban areas. We present AirV2X-Perception, a large-scale dataset that leverages Unmanned Aerial Vehicles (UAVs) as a flexible alternative or complement to fixed Road-Side Units (RSUs). Drones offer unique advantages over ground-based perception: complementary bird's-eye views that reduce occlusions, dynamic positioning that enables hovering, patrolling, and escorting navigation modes, and significantly lower deployment costs than fixed infrastructure. Our dataset comprises 6.73 hours of drone-assisted driving scenarios across urban, suburban, and rural environments with varied weather and lighting conditions. The AirV2X-Perception dataset facilitates the development and standardized evaluation of Vehicle-to-Drone (V2D) algorithms, addressing a critical gap in the rapidly expanding field of aerial-assisted autonomous driving systems. The dataset and development kits are open-sourced at https://anonymous.4open.science/r/AirV2X-Perception-BBA7.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces AirV2X-Perception, a large-scale dataset for drone-assisted V2X collaborative perception, comprising 6.73 hours of multi-environment driving scenarios. It resides in the Multi-Agent Perception Fusion Frameworks leaf, which contains six papers including the original work. This leaf sits within the broader Collaborative Perception Architectures and Algorithms branch, indicating a moderately populated research direction focused on integrating UAV and vehicle sensor data for enhanced situational awareness.

The taxonomy reveals neighboring work in Communication-Efficient Collaborative Perception (two papers on bandwidth optimization) and adjacent branches addressing UAV Deployment, Network Infrastructure, and Integrated Sensing. The paper's emphasis on drone navigation strategies (hovering, patrolling, escorting) connects to trajectory optimization research, while its dataset contribution bridges perception fusion and deployment planning. The taxonomy's scope notes clarify that perception-specific bandwidth optimization belongs elsewhere, positioning this work at the intersection of sensing and deployment concerns.

Among thirty candidates examined, none clearly refute the three core contributions: the dataset itself (ten candidates examined, zero refutable), the three navigation strategies (ten examined, zero refutable), and the benchmark evaluation framework (ten examined, zero refutable). This suggests that within the limited search scope, the combination of a drone-centric V2X dataset with explicit navigation modes and standardized benchmarks appears relatively underexplored. However, the analysis covers top-K semantic matches rather than exhaustive field coverage, leaving open the possibility of related work outside this candidate pool.

Based on the limited literature search, the work appears to occupy a niche intersection of dataset provision, navigation strategy design, and benchmark standardization for aerial-assisted V2X perception. The taxonomy structure indicates this is an active but not overcrowded area, with the dataset's scale and multi-environment coverage potentially distinguishing it from existing simulation platforms and smaller-scale collections. The absence of refutable candidates among thirty examined suggests novelty within the search scope, though broader field coverage would strengthen this assessment.

Taxonomy

Core-task taxonomy papers: 50
Claimed contributions: 3
Contribution candidate papers compared: 30
Refutable papers: 0

Research Landscape Overview

Core task: drone-assisted vehicle-to-everything collaborative perception. This emerging field integrates unmanned aerial vehicles (UAVs) with ground-based vehicular networks to enhance situational awareness and safety through shared sensing data. The taxonomy organizes the landscape into seven main branches:

- Collaborative Perception Architectures and Algorithms: how multi-agent systems fuse heterogeneous sensor streams, with works like Where2comm[1] and UVCPNet[8] exploring spatial attention and feature-level fusion strategies.
- Network Infrastructure and Communication Protocols: the underlying connectivity challenges, including MAC protocols and V2X standards.
- UAV Deployment and Trajectory Optimization: dynamic positioning to maximize coverage and minimize latency, as in Latency Analysis of Drone-Assisted[5] and various reinforcement-learning-based path planners.
- Integrated Sensing and Communication: merging radar and communication functions to improve spectral efficiency.
- Edge Computing and Data Management: offloading and distributed learning frameworks.
- Application-Specific Systems: real-world use cases ranging from traffic monitoring to emergency response.
- Surveys and Vision Papers: broader contextual reviews such as A Survey on UAV-Enabled[6].

Within this landscape, a particularly active line of work centers on multi-agent perception fusion frameworks, where the challenge is to efficiently aggregate observations from both aerial and ground agents under bandwidth and latency constraints. AirV2X[0] situates itself squarely in this branch, emphasizing collaborative perception architectures that leverage UAV vantage points to overcome occlusions and extend sensing range for connected vehicles. Compared to neighboring works like Horus[28], which focuses on hierarchical fusion pipelines, and Multi-agent Collaborative Perception for[30], which explores decentralized coordination strategies, AirV2X[0] appears to prioritize the unique geometric and communication trade-offs introduced by aerial mobility. Meanwhile, simulation platforms such as V2X-Sim[36] provide testbeds for evaluating these fusion algorithms, and broader surveys like Unmanned aerial vehicle-aided intelligent[2] contextualize the role of UAVs across diverse intelligent transportation scenarios. Open questions remain around scalability, real-time fusion under dynamic topologies, and the interplay between trajectory optimization and perception quality.
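To make the bandwidth-constrained fusion problem concrete, the sketch below (a minimal NumPy illustration; the shapes and thresholds are invented) shows the spatial-confidence idea popularized by Where2comm[1]: each agent transmits only the BEV feature cells whose detection confidence clears a threshold, and the receiver fuses the incoming features with its own by element-wise maximum. This illustrates the general technique, not any specific paper's implementation.

```python
import numpy as np

def sparse_share(features, confidence, threshold=0.5):
    """Keep only the BEV cells whose confidence exceeds the threshold.

    features:   (C, H, W) BEV feature map of one agent
    confidence: (H, W) per-cell detection confidence in [0, 1]
    Returns the masked feature map and the fraction of cells transmitted.
    """
    mask = confidence > threshold           # (H, W) boolean communication mask
    shared = features * mask[None, :, :]    # zero out low-confidence cells
    bandwidth = mask.mean()                 # fraction of the map actually sent
    return shared, bandwidth

def fuse(ego_features, received):
    """Fuse ego features with (already spatially aligned) features from other
    agents by element-wise maximum, a common intermediate-fusion choice."""
    stacked = np.stack([ego_features, *received], axis=0)
    return stacked.max(axis=0)

# Toy example: a drone shares 64-channel 100x100 BEV features with a vehicle.
rng = np.random.default_rng(0)
drone_feat = rng.normal(size=(64, 100, 100))
drone_conf = rng.random((100, 100))
vehicle_feat = rng.normal(size=(64, 100, 100))

shared, used = sparse_share(drone_feat, drone_conf, threshold=0.7)
fused = fuse(vehicle_feat, [shared])
print(f"transmitted {used:.1%} of BEV cells; fused shape {fused.shape}")
```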

Claimed Contributions

AirV2X-Perception dataset for drone-assisted V2X collaborative perception

The authors introduce a large-scale simulated dataset comprising 6.73 hours of drone-assisted driving scenarios across diverse environments, weather conditions, and lighting. The dataset integrates vehicles, roadside units, and drones with multiple sensors to facilitate development and evaluation of Vehicle-to-Drone algorithms.

Candidate papers retrieved: 10
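The report does not reproduce the dataset's actual record format, so the following is a hypothetical sketch of what one synchronized multi-agent frame in such a dataset could look like. Every field name below is an assumption for illustration, not the AirV2X-Perception devkit API.

```python
from dataclasses import dataclass, field

# Hypothetical schema for one synchronized frame; names are invented.
@dataclass
class AgentRecord:
    agent_id: str            # e.g. "ego_vehicle", "rsu_03", "drone_01"
    agent_type: str          # "vehicle" | "rsu" | "drone"
    pose: list               # 6-DoF world pose [x, y, z, roll, pitch, yaw]
    lidar_path: str          # path to this agent's point cloud file
    camera_paths: dict = field(default_factory=dict)  # camera name -> image

@dataclass
class Frame:
    timestamp: float
    weather: str             # e.g. "clear", "rain", "fog"
    time_of_day: str         # e.g. "noon", "night"
    agents: list             # AgentRecords for vehicles, RSUs, and drones
    boxes_3d: list           # ground truth [x, y, z, l, w, h, yaw, class]

# Example instantiation with made-up paths and values.
frame = Frame(
    timestamp=12.4,
    weather="rain",
    time_of_day="night",
    agents=[
        AgentRecord("ego_vehicle", "vehicle", [0, 0, 0, 0, 0, 0],
                    "lidar/ego_000124.bin", {"front": "cam/ego_front_000124.png"}),
        AgentRecord("drone_01", "drone", [5.0, -2.0, 30.0, 0, 0, 1.57],
                    "lidar/drone01_000124.bin", {"bev": "cam/drone01_bev_000124.png"}),
    ],
    boxes_3d=[[12.1, 3.4, 0.8, 4.5, 1.9, 1.6, 0.1, "car"]],
)
```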
Three distinct drone navigation strategies for V2X perception

The authors design and implement three drone navigation strategies (hover, patrol, and escort) to provide comprehensive evaluation of V2D algorithms under different operational modes, each with distinct advantages for real-world deployment scenarios.

Candidate papers retrieved: 10
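As a rough illustration of how the three modes differ operationally, here is a minimal sketch of each as a waypoint generator. The offsets, altitude, and update rules are invented for illustration; they are not the controllers used in the paper.

```python
import math

def hover(anchor_xy, altitude=30.0):
    """Hold a fixed position over a point of interest, e.g. an intersection."""
    while True:
        yield (*anchor_xy, altitude)

def patrol(waypoints_xy, altitude=30.0):
    """Cycle through a fixed route, covering a stretch of road repeatedly."""
    i = 0
    while True:
        yield (*waypoints_xy[i % len(waypoints_xy)], altitude)
        i += 1

def escort(ego_pose_stream, offset=(-8.0, 0.0), altitude=30.0):
    """Follow the ego vehicle, keeping an offset in its heading frame."""
    for x, y, yaw in ego_pose_stream:
        dx = offset[0] * math.cos(yaw) - offset[1] * math.sin(yaw)
        dy = offset[0] * math.sin(yaw) + offset[1] * math.cos(yaw)
        yield (x + dx, y + dy, altitude)

# Toy ego trajectory driving along +x; the drone trails 8 m behind it.
ego = [(t * 1.0, 0.0, 0.0) for t in range(3)]
for waypoint in escort(iter(ego)):
    print(waypoint)
```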
Comprehensive benchmark evaluation of V2X collaborative perception algorithms

The authors provide systematic benchmark evaluations of six representative collaborative perception algorithms across multiple dimensions including 3D object detection, BEV semantic segmentation, computational efficiency, and robustness to environmental conditions and sensor degradation.

Candidate papers retrieved: 10
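The report does not restate the exact evaluation protocol. As one concrete reference point, BEV semantic segmentation is commonly scored with per-class intersection-over-union (IoU), sketched below on a toy grid; the class count and maps are made up for illustration.

```python
import numpy as np

def bev_iou(pred, gt, num_classes):
    """Per-class intersection-over-union for BEV semantic segmentation.

    pred, gt: (H, W) integer class maps. Returns per-class IoUs; a class
    absent from both maps yields NaN and is skipped by the nan-mean below.
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        ious.append(inter / union if union > 0 else np.nan)
    return ious

# Toy example with 3 classes on a 4x4 BEV grid.
pred = np.array([[0, 0, 1, 1], [0, 2, 1, 1], [2, 2, 2, 0], [0, 0, 0, 0]])
gt   = np.array([[0, 0, 1, 1], [0, 2, 2, 1], [2, 2, 2, 0], [0, 1, 0, 0]])
print("per-class IoU:", [f"{v:.2f}" for v in bev_iou(pred, gt, 3)])
print("mIoU:", np.nanmean(bev_iou(pred, gt, 3)))
```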

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Each of the three claimed contributions was compared against its ten retrieved candidate papers; as noted in the Overall Novelty Assessment, none of the thirty candidates was judged refutable:

1. AirV2X-Perception dataset for drone-assisted V2X collaborative perception
2. Three distinct drone navigation strategies for V2X perception
3. Comprehensive benchmark evaluation of V2X collaborative perception algorithms