Speculative Actions: A Lossless Framework for Faster AI Agents

ICLR 2026 Conference Submission

Anonymous Authors

OpenReview Score: 7.5

AI Agents, Speculative Decoding, Parallel Execution, Agentic Serving, Agentic Simulation

Abstract:

AI agents have attracted growing interest across industry and academia, but in practice their execution can be slow. For example, letting two state-of-the-art agents play a game of chess may take hours. A key bottleneck is that agent behavior unfolds sequentially: each action requires an API call, and these calls can be time-consuming. Inspired by speculative execution in microprocessors and speculative decoding in LLM inference, we propose speculative actions, a lossless framework that predicts likely actions using faster models, enabling multiple API calls to be executed in parallel. We evaluate this framework across four agentic environments: gaming, e-commerce, web search, and operating systems. In all cases, speculative actions yield substantial acceleration, with potential speedups of up to 30%. Moreover, performance can be further improved through stronger guessing models and top-K action prediction, opening a promising path toward real-world, efficient deployment of AI agents.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a speculative actions framework that predicts likely agent actions using faster models to enable parallel API execution, drawing inspiration from speculative decoding in LLM inference. Within the taxonomy, it resides in the 'Speculative Action Prediction in General Agentic Systems' leaf, which contains only two papers total. This leaf sits under the broader 'Speculative Execution Frameworks for Agent Acceleration' branch, which encompasses five specialized subcategories addressing VLA models, LLM-based planning, ranking systems, and edge devices. The sparse population of this particular leaf suggests the work targets a relatively nascent research direction focused on domain-agnostic speculation mechanisms rather than task-specific predictive architectures.

The taxonomy reveals neighboring branches that explore related but distinct approaches to agent acceleration and prediction. Adjacent leaves include 'Speculative Planning for LLM-Based Agents' (focusing on planning latency reduction through co-design) and 'Speculative Decoding for Vision-Language-Action Models' (applying drafting-verification to VLA inference). The broader 'Predictive Models for Agent Behavior' branch contains trajectory forecasting and action prediction methods that emphasize learning from interaction traces rather than parallelization mechanisms. The paper's position bridges general-purpose speculation frameworks and domain-specific applications, with the taxonomy explicitly excluding prediction models without parallelization from this leaf while directing domain-specific implementations to other subcategories.

Among the three contributions analyzed across thirty candidate papers, the core speculative actions framework shows one refutable candidate among ten examined, indicating some prior overlap in the limited search scope. The unified API-call abstraction and multi-environment demonstration contributions each examined ten candidates with zero refutations, suggesting these aspects may be more distinctive within the search scope. The statistics reflect a focused literature search rather than exhaustive coverage, with the single refutable pair likely representing work in the same sparse research direction. The framework's generality across gaming, e-commerce, web search, and operating systems appears less directly addressed in the examined candidates, though the limited sample size constrains definitive conclusions.

Based on the examined thirty candidates, the work appears to occupy a relatively unexplored intersection of general-purpose speculation and multi-domain agentic systems. The sparse taxonomy leaf and limited refutations suggest novelty within the search scope, though the presence of one overlapping candidate indicates the core speculation concept has precedent. The analysis captures top-K semantic matches and does not exhaustively cover all related work in agent acceleration, particularly in specialized domains like robotics or autonomous vehicles where prediction mechanisms may differ substantially from the proposed API-parallel framework.

Taxonomy

Research Landscape Overview

Core task: Accelerating agentic systems through speculative action prediction. The field addresses how autonomous agents—ranging from web-based assistants to robotic controllers—can reduce latency and improve responsiveness by predicting and pre-executing likely future actions. The taxonomy organizes this landscape into several major branches: Speculative Execution Frameworks for Agent Acceleration develop general-purpose mechanisms for drafting and verifying candidate actions (e.g., Speculative Actions Framework[10], Reinforcement Speculative Decoding[7]); Predictive Models for Agent Behavior and Trajectory Forecasting focus on forecasting multi-step sequences in navigation and driving contexts (e.g., MTR Plus Plus[2], Precog[5]); GUI and Web Automation Agents with Efficient Action Understanding tackle screen-based tasks where predicting user or agent clicks can streamline interaction (e.g., ScreenLLM[4], Predicting Future Actions[3]); and branches on Autonomous Navigation and Robotics, Multi-Agent Coordination, Theoretical Foundations, Edge Computing, Anticipatory Behavior, and Domain-Specific Applications each explore how prediction and speculation manifest in their respective settings—from robot motion planning (Spec VLA[6]) to distributed edge intelligence (Edge General Intelligence[13]) and cooperative control (Event Triggered Cooperative[27]). A particularly active line of work centers on general speculative frameworks that borrow ideas from language-model speculative decoding and adapt them to action spaces, aiming to balance the cost of generating multiple candidate actions against the speedup from parallel verification. 
Speculative Actions[0] sits squarely in this branch, proposing mechanisms to draft and validate action sequences in agentic systems, closely aligned with Speculative Actions Framework[10] and Reinforcement Speculative Decoding[7], which similarly explore how to leverage smaller or faster models to propose actions that a larger policy then confirms. In contrast, works like Predicting Future Actions[3] and ScreenLLM[4] emphasize learning predictive models from interaction traces in GUI environments, while Precog[5] and MTR Plus Plus[2] focus on trajectory forecasting for autonomous vehicles, highlighting a trade-off between domain-agnostic speculation frameworks and task-specific predictive architectures. Open questions remain around how to ensure safety during speculative execution (Safety Assured Speculative[12]), how to handle multi-agent scenarios where predictions must account for other agents' behaviors (Mutual Prediction[41]), and how to deploy these techniques efficiently at the edge (Edge General Intelligence[13]).

Claimed Contributions

Speculative actions framework for agentic systems

Can Refute

9 retrieved papers

The authors introduce a general framework that allows agents to predict and tentatively pursue the most likely next actions using faster models while slower ground-truth executors catch up. This framework treats each action in an agentic system as an API call and uses a Speculator to predict responses in parallel with an Actor that provides authoritative outputs, achieving lossless speedup through validation and rollback mechanisms.
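The validate-and-rollback idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `run_speculative`, `run_sequential`, and the toy functions are hypothetical, the state and actions are plain integers, and `step` is assumed to be a pure function so that discarding a wrong speculative result is a sufficient rollback.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Tuple

State = int
Action = int


def run_sequential(state: State,
                   actor: Callable[[State], Action],
                   step: Callable[[State, Action], State],
                   n_steps: int) -> State:
    """Baseline: wait for the Actor at every step, then apply its action."""
    for _ in range(n_steps):
        state = step(state, actor(state))
    return state


def run_speculative(state: State,
                    actor: Callable[[State], Action],
                    speculator: Callable[[State], Action],
                    step: Callable[[State, Action], State],
                    n_steps: int) -> Tuple[State, int]:
    """Speculative loop: returns the final state and the number of correct guesses.

    The Actor's answer always decides the trajectory, so the final state
    matches run_sequential (the "lossless" property in this toy setting).
    """
    hits = 0
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(n_steps):
            # Slow, authoritative call runs in the background.
            actor_future = pool.submit(actor, state)
            # Fast guess of what the Actor will return.
            guess = speculator(state)
            # Pre-execute the next step on the guess while the Actor is still running.
            spec_future = pool.submit(step, state, guess)

            action = actor_future.result()
            if action == guess:
                # Validation succeeded: keep the speculative work.
                state = spec_future.result()
                hits += 1
            else:
                # Validation failed: discard the speculative result (cancel is
                # best-effort) and redo the step from the authoritative action.
                spec_future.cancel()
                state = step(state, action)
    return state, hits


# Toy stand-ins (assumptions for illustration only).
def toy_actor(s: State) -> Action:
    return s % 3


def toy_speculator(s: State) -> Action:
    return s % 2


def toy_step(s: State, a: Action) -> State:
    return s + a + 1
```

With these toy functions, `run_speculative(0, toy_actor, toy_speculator, toy_step, 5)` returns the same final state as `run_sequential(0, toy_actor, toy_step, 5)`, regardless of how often the guess is wrong. In a real system the speculative step may have side effects, in which case rollback would need more than discarding a result.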

Unified API-call abstraction for agentic environments

10 retrieved papers

The authors propose modeling every action in an agentic system (LLM calls, tool invocations, MCP server requests, and human responses) as an API call. This abstraction provides a unified framework for optimizing system latency and aligns with the emerging environment and MCP perspectives on agentic systems.
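One way to picture this abstraction is a single call type routed to different back ends. The sketch below is an assumption made for illustration, not the paper's interface: `ApiCall`, `ApiRouter`, the `kind` labels, and the stub handlers are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ApiCall:
    """A uniform description of one agent action (hypothetical schema)."""
    kind: str                      # e.g. "llm", "tool", "mcp", "human"
    target: str                    # model name, tool name, server, or user id
    payload: Dict[str, Any] = field(default_factory=dict)


class ApiRouter:
    """Dispatches every ApiCall through the same invoke() entry point."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[ApiCall], Any]] = {}

    def register(self, kind: str, handler: Callable[[ApiCall], Any]) -> None:
        self._handlers[kind] = handler

    def invoke(self, call: ApiCall) -> Any:
        if call.kind not in self._handlers:
            raise KeyError(f"no handler registered for kind {call.kind!r}")
        return self._handlers[call.kind](call)


# Stub back ends standing in for real services (illustration only).
router = ApiRouter()
router.register("llm", lambda c: f"reply to {c.payload['prompt']}")
router.register("tool", lambda c: sum(c.payload["args"]))
router.register("human", lambda c: "yes")
```

Because every action passes through `invoke`, latency-oriented logic such as timing, caching, or speculation could be attached at that one point rather than per back end.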

Demonstration across multiple agentic environments

10 retrieved papers

The authors instantiate and evaluate their speculative actions framework across four diverse environments (chess gameplay, e-commerce dialogue, multi-hop web search, and OS hyperparameter tuning), demonstrating substantial accuracy in next-action prediction and significant reductions in end-to-end latency across different types of agent-environment interactions.

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current top-K core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape, it appears structurally isolated, which is one partial signal of novelty, but one still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Speculative actions framework for agentic systems

[73] Dynamic speculative agent planning

Can Refute

[6] Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance

Cannot Refute

[9] Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface

Cannot Refute

[20] Reducing Latency of LLM Search Agent via Speculation-based Algorithm-System Co-Design

Cannot Refute

[70] JANUS: A Simple and Efficient Speculative Defense using Reinforcement Learning

Cannot Refute

[71] Comparing Speculative Synchronization Algorithms for Continuous-Time Agent-Based Simulations

Cannot Refute

[72] Deploying foundation model powered agent services: A survey

Cannot Refute

[74] Scaling Test-time Compute in Mobile GUI Agents with Parallel Speculative Execution

Cannot Refute

[75] MVVM: Deploy Your AI Agents-Securely, Efficiently, Everywhere

Cannot Refute

Contribution

Unified API-call abstraction for agentic environments

[60] Beyond Formal Semantics for Capabilities and Skills: Model Context Protocol in Manufacturing

Cannot Refute

[61] A comprehensive survey of self-evolving ai agents: A new paradigm bridging foundation models and lifelong agentic systems

Cannot Refute

[62] Hands-Free: Action Abstraction With Hierarchical Reinforcement Learning in Text-Based Games

Cannot Refute

[63] xlam: A family of large action models to empower ai agent systems

Cannot Refute

[64] NetMind+: Adaptive Baseband Function Placement With GCN Encoding and Incremental Maze-Solving DRL for Dynamic and Heterogeneous RANs

Cannot Refute

[65] TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning

Cannot Refute

[66] Signifiers as a First-class Abstraction in Hypermedia Multi-Agent Systems

Cannot Refute

[67] Guiding VLM Agents with Process Rewards at Inference Time for GUI Navigation

Cannot Refute

[68] Modelscope-agent: Building your customizable agent system with open-source large language models

Cannot Refute

[69] VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs of Thought

Cannot Refute

Contribution

Demonstration across multiple agentic environments

[50] Agent q: Advanced reasoning and learning for autonomous ai agents

Cannot Refute

[51] What is your ai agent buying? evaluation, implications and emerging questions for agentic e-commerce

Cannot Refute

[52] Webarena: A realistic web environment for building autonomous agents

Cannot Refute

[53] The automated but risky game: Modeling agent-to-agent negotiations and transactions in consumer markets

Cannot Refute

[54] Automated game testing with online search agent and model construction, a study

Cannot Refute

[55] A Functionality-Grounded Benchmark for Evaluating Web Agents in E-commerce Domains

Cannot Refute

[56] A Research Landscape of Agentic AI and Large Language Models: Applications, Challenges and Future Directions

Cannot Refute

[57] A shopping agent for addressing subjective product needs

Cannot Refute

[58] X-WebAgentBench: A Multilingual Interactive Web Benchmark for Evaluating Global Agentic System

Cannot Refute

[59] The Influence of Human-inspired Agentic Sophistication in LLM-driven Strategic Reasoners

Cannot Refute
