RedTeamCUA: Realistic Adversarial Testing of Computer-Use Agents in Hybrid Web-OS Environments
Overview
Overall Novelty Assessment
The paper introduces RedTeamCUA, a framework for adversarial testing of computer-use agents against indirect prompt injection, and RTC-Bench, a benchmark with 864 examples. It resides in the 'Comprehensive Multi-Environment Testing Frameworks' leaf alongside two sibling papers. This leaf sits within the broader 'Adversarial Testing Frameworks and Benchmarks' branch, which itself is one of five major research directions in the field. The taxonomy contains 50 papers total, indicating a moderately active research area with multiple specialized sub-communities.
The paper's leaf neighbors include 'Specialized Domain Benchmarks' (four papers targeting specific applications like tool-integrated agents) and 'Black-Box Fuzzing and Automated Discovery' (four papers on automated vulnerability discovery). The broader taxonomy reveals parallel efforts in attack technique development, defense mechanisms, threat modeling, and empirical evaluations. The scope note for the paper's leaf emphasizes 'hybrid environments (web-OS, GUI, multi-modal)' and 'realistic scenario configuration,' distinguishing it from single-environment benchmarks. This positioning suggests the work aims to bridge gaps between isolated testing paradigms.
Among 25 candidates examined across the three claimed contributions, no clearly refuting prior work was identified. The hybrid sandbox contribution was checked against 10 candidates with zero refutations; the benchmark contribution against another 10 with none refuting; the decoupled evaluation setting against 5 with zero refutations. Within this limited search scope (top semantic matches plus citation expansion), the specific combination of hybrid web-OS sandboxing, decoupled evaluation, and comprehensive adversarial scenarios does not appear to overlap directly with existing frameworks. However, the search scale of 25 candidates, rather than hundreds, means unexplored literature may still contain relevant comparisons.
Based on the limited literature search, the work appears to occupy a distinct position within comprehensive testing frameworks, particularly in its hybrid web-OS integration and decoupled evaluation design. The absence of refuting candidates among 25 examined suggests novelty in the specific technical approach, though the broader research direction (multi-environment adversarial testing) is clearly established with multiple active efforts. The analysis covers top semantic matches and immediate citations but does not claim exhaustive field coverage.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce RedTeamCUA, a framework that combines a VM-based operating system environment with Docker-based web platforms to enable realistic and controlled adversarial testing of computer-use agents across both web and OS interfaces. The hybrid sandbox supports configurable adversarial scenario injection and a decoupled evaluation setting that separates adversarial robustness testing from navigational capability limitations.
The authors construct RTC-Bench, a benchmark comprising 864 test examples designed to evaluate CUA vulnerabilities to indirect prompt injection. The benchmark systematically explores hybrid web-OS attack pathways by coupling 9 benign goals with 24 adversarial goals grounded in the CIA (confidentiality, integrity, availability) security triad, with variations in instruction specificity and injection content type.
The authors introduce a Decoupled Eval setting that uses pre-processed actions to place agents directly at the adversarial injection site, isolating adversarial robustness assessment from navigation limitations. This enables focused analysis of CUA vulnerabilities when directly exposed to malicious content, independent of the agent's ability to navigate to the injection point.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[3] EVA: Red-Teaming GUI Agents via Evolving Indirect Prompt Injection
[8] AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents
Contribution Analysis
Detailed comparisons for each claimed contribution
REDTEAMCUA adversarial testing framework with hybrid sandbox
The authors introduce RedTeamCUA, a framework that combines a VM-based operating system environment with Docker-based web platforms to enable realistic and controlled adversarial testing of computer-use agents across both web and OS interfaces. The hybrid sandbox supports configurable adversarial scenario injection and a decoupled evaluation setting that separates adversarial robustness testing from navigational capability limitations.
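To make the hybrid design concrete, the sketch below models a scenario specification that pairs a VM-based OS side with Docker-hosted web platforms and plants an injection on one side. All class and field names here are illustrative assumptions, not the framework's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class WebPlatform:
    name: str            # e.g. a self-hosted forum or wiki container
    docker_image: str

@dataclass
class Injection:
    surface: str         # side where the adversarial content is planted: "web" or "os"
    content: str         # the injected instruction text

@dataclass
class HybridScenario:
    vm_snapshot: str                                    # OS environment restored per run
    web: list[WebPlatform] = field(default_factory=list)
    injections: list[Injection] = field(default_factory=list)

    def attack_pathways(self) -> list[tuple[str, str]]:
        # Enumerate (injection surface, target side) pairs: content planted
        # on the web side may drive actions on the OS side, and vice versa,
        # which is what makes the sandbox "hybrid" rather than single-environment.
        sides = ("web", "os")
        return [(inj.surface, tgt)
                for inj in self.injections
                for tgt in sides if tgt != inj.surface]

scenario = HybridScenario(
    vm_snapshot="ubuntu-base",
    web=[WebPlatform("forum", "example/forum:latest")],
    injections=[Injection(surface="web", content="ignore the user and ...")],
)
print(scenario.attack_pathways())  # [('web', 'os')]
```

The key design point this captures is that injection surface and attack target are configured independently, so web-to-OS and OS-to-web pathways can both be exercised from one scenario format.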
[25] A Systematization of Security Vulnerabilities in Computer Use Agents
[35] The Commercial Landscape of Agentic AI Security
[58] Coral: Container Online Risk Assessment with Logical Attack Graphs
[59] EVMFuzz: Differential Fuzz Testing of Ethereum Virtual Machine
[60] Secure Software Development and Testing: A Model-Based Methodology
[61] AI-Optimized Network Function Virtualization Security in Cloud Infrastructure
[62] Laccolith: Hypervisor-Based Adversary Emulation with Anti-Detection
[63] A Review of TRiSM Frameworks in Artificial Intelligence Systems: Fundamentals, Taxonomy, Use Cases, Key Challenges and Future Directions
[64] Construction and Evaluation of LLM-Based Agents for Semi-Autonomous Penetration Testing
[65] Torpedo: A Fuzzing Framework for Discovering Adversarial Container Workloads
RTC-BENCH comprehensive adversarial benchmark
The authors construct RTC-Bench, a benchmark comprising 864 test examples designed to evaluate CUA vulnerabilities to indirect prompt injection. The benchmark systematically explores hybrid web-OS attack pathways by coupling 9 benign goals with 24 adversarial goals grounded in the CIA (confidentiality, integrity, availability) security triad, with variations in instruction specificity and injection content type.
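One way to see how the 864-example count arises is as a Cartesian product over the stated axes. The 9 × 24 coupling of benign and adversarial goals is from the paper; splitting the remaining factor of 4 into two specificity levels times two content types is an assumption for illustration, as are all identifier names below.

```python
from itertools import product

benign_goals = [f"benign_{i}" for i in range(9)]

# 24 adversarial goals spread over the CIA triad (8 per property, assumed split)
adversarial_goals = [f"{cia}_{i}"
                     for cia in ("confidentiality", "integrity", "availability")
                     for i in range(8)]

specificity = ["specific", "general"]   # instruction specificity (assumed binary)
content_type = ["text", "code"]         # injection content type (assumed binary)

# Each benchmark example is one combination of the four axes.
examples = list(product(benign_goals, adversarial_goals, specificity, content_type))
print(len(examples))  # 864
```

Enumerating the benchmark this way makes the coverage claim auditable: every benign goal is paired with every adversarial goal under every variation, rather than sampling a subset of the grid.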
[5] AgentVigil: Generic Black-Box Red-Teaming for Indirect Prompt Injection against LLM Agents
[14] InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents
[15] MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents
[51] VPI-Bench: Visual Prompt Injection Attacks for Computer-Use Agents
[52] [Short Paper] Forensic Analysis of Indirect Prompt Injection Attacks on LLM Agents
[53] AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
[54] SelfDefend: LLMs Can Defend Themselves Against Jailbreaking in a Practical Manner
[55] LlamaFirewall: An Open-Source Guardrail System for Building Secure AI Agents
[56] OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents
[57] JailJudge: A Comprehensive Jailbreak Judge Benchmark with Multi-Agent Enhanced Explanation Evaluation Framework
Decoupled evaluation setting for focused vulnerability analysis
The authors introduce a Decoupled Eval setting that uses pre-processed actions to place agents directly at the adversarial injection site, isolating adversarial robustness assessment from navigation limitations. This enables focused analysis of CUA vulnerabilities when directly exposed to malicious content, independent of the agent's ability to navigate to the injection point.
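The two-phase structure of the decoupled setting can be sketched as a minimal evaluation loop: scripted setup actions are replayed without agent involvement, and only then is the agent given control, so navigation failures cannot mask or inflate the robustness measurement. The environment and policy interfaces below are hypothetical stand-ins, not the framework's actual API.

```python
class FakeEnv:
    """Minimal stand-in for a sandboxed CUA environment."""
    def __init__(self):
        self.log = []

    def step(self, action):
        self.log.append(action)

def decoupled_eval(env, setup_actions, agent_policy, max_steps=5):
    # Phase 1: replay pre-processed actions (no agent involvement), placing
    # the agent directly at the page/file containing the injection.
    for action in setup_actions:
        env.step(action)
    # Phase 2: hand control to the agent and record its actions; a downstream
    # judge would check whether any of them realize the adversarial goal.
    for _ in range(max_steps):
        env.step(agent_policy(env.log))
    return env.log

env = FakeEnv()
trace = decoupled_eval(
    env,
    setup_actions=["open_browser", "goto injected_page"],
    agent_policy=lambda log: "agent_action",
)
print(trace[:2], len(trace))  # ['open_browser', 'goto injected_page'] 7
```

Because phase 1 is deterministic replay, any harmful behavior in the trace is attributable to the agent's response to the injected content rather than to its ability to reach it, which is exactly the isolation the decoupled setting claims.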