Cyber-Zero: Training Cybersecurity Agents without Runtime
Overview
Overall Novelty Assessment
The paper introduces Cyber-Zero, a runtime-free framework that synthesizes agent trajectories from CTF writeups to train cybersecurity LLMs. According to the taxonomy, this work occupies the 'Runtime-Free Trajectory Synthesis' leaf under 'Offensive Security Agent Training', where it is currently the sole paper. This positioning suggests the paper addresses a relatively sparse research direction within the broader offensive security landscape, which includes more populated areas like runtime-based penetration testing with multiple sibling approaches.
The taxonomy reveals that Cyber-Zero's nearest neighbors are runtime-based methods in sibling leaves: 'Sequence Modeling for Penetration Testing' (Pentraformer) and 'Reasoning-Optimized Penetration Testing' (Pentest-R1). These approaches require executable environments or simulators, whereas Cyber-Zero explicitly avoids runtime interaction. The broader 'Offensive Security Agent Training' branch also includes 'AI System Red-Teaming', which targets AI safety vulnerabilities rather than traditional network penetration. The defensive counterpart branch ('Offline RL for Cybersecurity Defense') addresses policy learning from historical logs but focuses on protection rather than offensive trajectory synthesis.
Among the 21 candidates examined, none clearly refutes any of the three core contributions. The CYBER-ZERO framework itself was compared against 1 candidate, with no refutation found. The synthesized trajectory dataset and the ENIGMA+ agent scaffold were each compared against 10 candidates, all of which were classified as non-refuting or unclear. The search was limited to top-K semantic matches and citation expansion. Within that examined literature, no prior work directly overlaps with the combination of runtime-free synthesis, persona-driven simulation, and the use of CTF writeups as source material for cybersecurity agent training.
Based on the 21-candidate search, the work appears to occupy a novel position at the intersection of trajectory synthesis and offensive security training. However, the analysis does not exhaustively cover domain-specific venues or gray literature from cybersecurity competitions. The taxonomy structure indicates that this is an emerging direction with sparse prior work, but because the search scope was limited, additional related efforts in specialized CTF or security venues may exist beyond the examined set.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors present CYBER-ZERO, a novel framework that synthesizes high-quality agent trajectories for training cybersecurity LLMs without requiring access to executable runtime environments. It uses persona-driven LLM simulation with dual models (CTF Player and Bash Terminal) to reverse-engineer behaviors from public CTF writeups and generate realistic multi-turn interaction sequences.
The authors build a dataset of 6,188 high-quality CTF writeups spanning 4,610 unique challenges from 543 competitions across six task categories. Trajectories synthesized from these writeups enable training of LLM agents for vulnerability discovery and exploitation tasks without requiring runtime environments.
The authors develop ENIGMA+, an enhanced version of the ENIGMA scaffold that executes evaluation tasks in parallel rather than sequentially. This change reduces evaluation time from 1-3 days to under 5 hours for 300+ CTF challenges while maintaining evaluation quality.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
Contribution Analysis
Detailed comparisons for each claimed contribution
CYBER-ZERO runtime-free trajectory synthesis framework
The authors present CYBER-ZERO, a novel framework that synthesizes high-quality agent trajectories for training cybersecurity LLMs without requiring access to executable runtime environments. It uses persona-driven LLM simulation with dual models (CTF Player and Bash Terminal) to reverse-engineer behaviors from public CTF writeups and generate realistic multi-turn interaction sequences.
[32] Generative AI for Simulating Real World Dynamics Applications and Challenges
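The two-persona simulation described for this contribution can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `synthesize`, `Trajectory`, `stub_player`, and `stub_terminal` are hypothetical, the two "models" are scripted stand-ins rather than LLMs, and the `submit` stopping convention is an assumption made only so the loop terminates. The sketch shows just the alternation between a CTF Player persona that proposes commands and a Bash Terminal persona that imitates their output, with no real command ever executed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical type: a persona maps (writeup, history so far) to its next message.
Turn = Tuple[str, str]  # (role, text)
Persona = Callable[[str, List[Turn]], str]


@dataclass
class Trajectory:
    writeup: str
    turns: List[Turn] = field(default_factory=list)


def synthesize(writeup: str, player: Persona, terminal: Persona,
               max_turns: int = 10) -> Trajectory:
    """Alternate player commands and simulated terminal output.

    No command is executed; the terminal persona only imitates a shell.
    """
    traj = Trajectory(writeup)
    for _ in range(max_turns):
        command = player(writeup, traj.turns)
        traj.turns.append(("player", command))
        # Assumed convention: a 'submit <flag>' command ends the episode.
        if command.startswith("submit "):
            break
        output = terminal(writeup, traj.turns)
        traj.turns.append(("terminal", output))
    return traj


# Scripted stand-ins for the two LLM personas (illustration only).
def stub_player(writeup: str, history: List[Turn]) -> str:
    script = ["ls", "cat flag.txt", "submit flag{example}"]
    played = sum(1 for role, _ in history if role == "player")
    return script[min(played, len(script) - 1)]


def stub_terminal(writeup: str, history: List[Turn]) -> str:
    last_command = history[-1][1]
    canned = {"ls": "flag.txt", "cat flag.txt": "flag{example}"}
    return canned.get(last_command, "")


if __name__ == "__main__":
    result = synthesize("toy writeup text", stub_player, stub_terminal)
    for role, text in result.turns:
        print(f"{role}: {text}")
```

In a real pipeline, each stub would be replaced by a call to a language model prompted with the corresponding persona. How much of the writeup each persona is allowed to see is a design choice that this sketch does not attempt to reproduce.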
Large-scale synthesized cybersecurity trajectory dataset
The authors build a dataset of 6,188 high-quality CTF writeups spanning 4,610 unique challenges from 543 competitions across six task categories. Trajectories synthesized from these writeups enable training of LLM agents for vulnerability discovery and exploitation tasks without requiring runtime environments.
[22] Generative AI-Enhanced Cybersecurity Framework for Enterprise Data Privacy Management
[23] Approach to Forming Vulnerability Datasets for Fine-Tuning AI Agents
[24] AI-enabled Cybersecurity using Synthetic Data
[25] An Ensemble Transformer Approach with Cross-Attention for Automated Code Security Vulnerability Detection and Documentation
[26] Leveraging GANs for synthetic data generation to improve intrusion detection systems
[27] A novel deep synthesis-based insider intrusion detection (DS-IID) model for malicious insiders and AI-generated threats
[28] Evaluating Biased Synthetic Data Effects on Large Language Model-Based Software Vulnerability Detection
[29] DeepBalance: Deep-Learning and Fuzzy Oversampling for Vulnerability Detection
[30] Enhancing Large Language Models for Secure Code Generation: A Dataset-driven Study on Vulnerability Mitigation
[31] A Multimodal Framework for Advanced Cybersecurity Threat Detection Using GAN-Driven Data Synthesis
ENIGMA+ agent scaffold with improved efficiency
The authors develop ENIGMA+, an enhanced version of the ENIGMA scaffold that executes evaluation tasks in parallel rather than sequentially. This change reduces evaluation time from 1-3 days to under 5 hours for 300+ CTF challenges while maintaining evaluation quality.
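The difference between sequential and parallel evaluation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ENIGMA+ itself: `run_challenge` is a hypothetical stand-in that sleeps briefly instead of running an agent, and the use of a thread pool with 8 workers is an arbitrary choice for the example. The source does not say how ENIGMA+ schedules or isolates its tasks.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def run_challenge(name: str) -> Tuple[str, bool]:
    """Hypothetical stand-in for one agent episode on one CTF challenge."""
    time.sleep(0.05)  # placeholder for the real, much longer agent run
    return name, True


def evaluate_sequential(challenges: List[str]) -> List[Tuple[str, bool]]:
    # One challenge at a time: total time grows with the sum of all episodes.
    return [run_challenge(c) for c in challenges]


def evaluate_parallel(challenges: List[str],
                      workers: int = 8) -> List[Tuple[str, bool]]:
    # Independent challenges run concurrently; Executor.map returns results
    # in input order, so the output matches the sequential version.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_challenge, challenges))


if __name__ == "__main__":
    names = [f"chal-{i}" for i in range(16)]
    for fn in (evaluate_sequential, evaluate_parallel):
        start = time.perf_counter()
        results = fn(names)
        elapsed = time.perf_counter() - start
        print(f"{fn.__name__}: {len(results)} results in {elapsed:.2f}s")
```

The speedup in such a scheme depends on the challenges being independent of one another and on the host having enough resources to run several episodes at once. Both are assumptions here.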