Kimi-Dev: Agentless Training as Skill Prior for SWE-agents

ICLR 2026 Conference Submission

Anonymous Authors

OpenReview Score: 7.0

Keywords: coder LLM, Agentless, SWE-Agent, Reinforcement Learning

Abstract:

Large Language Models (LLMs) are increasingly applied to software engineering (SWE), with SWE-bench as a key benchmark. Solutions are split into SWE-Agent frameworks with multi-turn interactions and workflow-based Agentless methods with single-turn verifiable steps. We argue these paradigms are not mutually exclusive: reasoning-intensive Agentless training induces skill priors, including localization, code edit, and self-reflection that enable efficient and effective SWE-Agent adaptation. In this work, we first curate the Agentless training recipe and present Kimi-Dev, an open-source SWE LLM achieving 60.4% on SWE-bench Verified, the best among workflow approaches. With additional SFT adaptation on 5k publicly-available trajectories, Kimi-Dev powers SWE-Agents to 48.6% pass@1, on par with that of Claude 3.5 Sonnet (241022 version). These results show that structured skill priors from Agentless training can bridge workflow and agentic frameworks for transferable coding agents.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a training paradigm that combines workflow-based 'Agentless' methods with multi-turn 'SWE-Agent' frameworks, introducing Kimi-Dev as an open-source model achieving strong performance on SWE-bench Verified. Within the taxonomy, this work occupies the 'Workflow-to-Agent Transfer Learning' leaf under 'Agentless Skill Prior Training for Code Generation Agents'. Notably, this leaf contains only the original paper itself, with no sibling papers identified, suggesting this specific formulation of bridging agentless and agentic paradigms represents a relatively sparse or emerging research direction within the broader software engineering agent training landscape.

The taxonomy reveals two main branches: 'Agentless Skill Prior Training' (focused on extracting reusable knowledge from non-agentic workflows) and 'Agent-Based Software Development Simulation' (emphasizing multi-agent environments for studying emergent behaviors). The original paper aligns with the first branch, specifically targeting transfer learning from structured workflows to agent policies. The neighboring 'Team Task Allocation Simulation' leaf addresses complementary concerns around modeling team dynamics rather than individual agent skill acquisition. This positioning indicates the work diverges from simulation-driven approaches by prioritizing efficient skill extraction from existing code artifacts over ecological validity of multi-agent interactions.

Among the three identified contributions, the literature search examined twenty candidates total. The 'Agentless training recipe' and 'Efficient SWE-Agent adaptation' contributions each had ten candidates examined, with zero refutable pairs found for either. The 'Skill priors framework' contribution had no candidates examined in the provided analysis. Given this limited search scope—twenty papers from semantic search and citation expansion—the absence of clearly overlapping prior work suggests these specific formulations may be novel within the examined sample, though the search scale does not permit definitive claims about the broader literature landscape.
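
The counts reported above can be cross-checked with a short tally. This is only a sketch: the dictionary layout and variable names are illustrative assumptions, while the numbers are the ones stated in this report.

```python
# Candidate counts per claimed contribution, as stated in this report.
# The data layout and names are illustrative, not the pipeline's own format.
candidates = {
    "Agentless training recipe": {"examined": 10, "refutable": 0},
    "Skill priors framework": {"examined": 0, "refutable": 0},
    "Efficient SWE-Agent adaptation": {"examined": 10, "refutable": 0},
}

# Totals across all three contributions.
total_examined = sum(c["examined"] for c in candidates.values())
total_refutable = sum(c["refutable"] for c in candidates.values())

# Contributions for which the search returned nothing to compare against.
unexamined = [name for name, c in candidates.items() if c["examined"] == 0]

print(f"examined={total_examined}, refutable={total_refutable}")
print("no candidates examined for:", unexamined)
```

Note that a contribution with zero examined candidates carries no evidence for or against novelty, which is a weaker position than ten candidates examined with none refuting.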

Based on the constrained search of twenty candidates and the sparse taxonomy leaf (no sibling papers), the work appears to occupy a relatively unexplored intersection of workflow-based and agentic training paradigms. However, the analysis explicitly acknowledges its limited scope, examining top-K semantic matches rather than exhaustive coverage. The absence of refutable candidates may reflect either genuine novelty in this specific formulation or gaps in the search strategy's coverage of related work in adjacent areas such as reinforcement learning from demonstrations or curriculum learning for code generation.

Taxonomy

[Figure: taxonomy tree. Legend: Core-task Taxonomy Papers; Claimed Contributions; Contribution Candidate Papers Compared; Refutable Paper.]

Research Landscape Overview

Core task: training software engineering agents through agentless skill priors.

The field structure suggested by this taxonomy divides into two main branches. The first, Agentless Skill Prior Training for Code Generation Agents, focuses on methods that extract reusable knowledge or behavioral patterns from non-agentic workflows (such as static code analysis, compiler feedback, or human demonstrations) and then transfer these priors to train autonomous coding agents. The second branch, Agent-Based Software Development Simulation, emphasizes creating realistic multi-agent environments where software engineering processes (collaboration, debugging, version control) are simulated to study emergent behaviors or to generate training data. These branches reflect complementary perspectives: one distills skill from existing artifacts, while the other builds synthetic ecosystems to observe and learn from agent interactions.

Within the first branch, a particularly active line of work explores workflow-to-agent transfer learning, where structured pipelines or tool-use traces are converted into agent policies. Kimi-Dev[0] exemplifies this approach by leveraging agentless skill priors to bootstrap agent training, sidestepping the need for expensive human-labeled trajectories. In contrast, the second branch includes efforts like Truck Factor Simulation[1], which models team dynamics and knowledge distribution to understand how agents might handle real-world software maintenance scenarios.

The key trade-off across these directions is between the scalability of distilling priors from static or workflow data versus the ecological validity of learning from simulated multi-agent interactions. Kimi-Dev[0] sits squarely in the transfer-learning cluster, emphasizing efficient skill extraction over simulation fidelity, and thus complements simulation-driven methods by offering a more direct path from existing code artifacts to agent capabilities.

Claimed Contributions

Agentless training recipe for SWE tasks

10 retrieved papers

The authors develop a comprehensive Agentless training recipe consisting of mid-training, cold-start supervised finetuning, reinforcement learning with outcome-based rewards, and test-time self-play. This recipe produces Kimi-Dev, which achieves state-of-the-art performance among workflow-based methods on SWE-bench Verified.
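
To make "outcome-based rewards" concrete, here is a minimal sketch assuming a binary reward that depends only on whether a candidate patch passes every test. The report does not give the paper's actual reward design, so this is an assumption for illustration, and `CaseResult` and `outcome_reward` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CaseResult:
    """Outcome of one test case run against a candidate patch (hypothetical type)."""
    name: str
    passed: bool


def outcome_reward(results: Sequence[CaseResult]) -> float:
    """Assumed binary outcome reward: 1.0 only if every test passes.

    An empty result list earns 0.0, so a patch that could not be
    evaluated is never rewarded.
    """
    if not results:
        return 0.0
    return 1.0 if all(r.passed for r in results) else 0.0


if __name__ == "__main__":
    run = [CaseResult("test_parse", True), CaseResult("test_edge", False)]
    print(outcome_reward(run))
```

The reward looks only at the final test outcome and gives no partial credit for intermediate steps, which is the sense in which it is outcome-based.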

Skill priors framework bridging Agentless and agentic paradigms

0 retrieved papers

The authors propose a novel perspective that Agentless training induces transferable skill priors (localization, code editing, self-reflection) which enable efficient adaptation to multi-turn SWE-Agent frameworks. This challenges the dichotomy between workflow-based and agentic approaches by treating them as complementary training stages.

Efficient SWE-Agent adaptation from Agentless priors

10 retrieved papers

The authors demonstrate that minimal supervised finetuning (5k trajectories) on top of Agentless-trained Kimi-Dev enables competitive SWE-Agent performance. This shows that structured skill priors from Agentless training transfer effectively to end-to-end agentic frameworks with significantly reduced data requirements.

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current top-K core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape it appears structurally isolated, which is one partial signal of novelty, though one still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Agentless training recipe for SWE tasks

[2] Kimi k1.5: Scaling reinforcement learning with LLMs

Cannot Refute

[3] A comparison of reinforcement learning frameworks for software testing tasks

Cannot Refute

[4] Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning

Cannot Refute

[5] SWE-RL: Advancing LLM reasoning via reinforcement learning on open software evolution

Cannot Refute

[6] Open-Reasoner-Zero: An open source approach to scaling up reinforcement learning on the base model

Cannot Refute

[7] Agent-RLVR: Training Software Engineering Agents via Guidance and Environment Rewards

Cannot Refute

[8] Kimi K2: Open agentic intelligence

Cannot Refute

[9] RLocator: Reinforcement learning for bug localization

Cannot Refute

[10] Reinforcement learning for test case prioritization

Cannot Refute

[11] The role of Reinforcement Learning in software testing

Cannot Refute

Contribution

Skill priors framework bridging Agentless and agentic paradigms

No candidate papers were retrieved for this contribution.

Contribution

Efficient SWE-Agent adaptation from Agentless priors

[12] AFlow: Automating agentic workflow generation

Cannot Refute

[13] AI agents vs. agentic AI: A conceptual taxonomy, applications and challenges

Cannot Refute

[14] AI Agentic workflows and Enterprise APIs: Adapting API architectures for the age of AI agents

Cannot Refute

[15] A survey on agent workflow - status and future

Cannot Refute

[16] AgentMesh: A cooperative multi-agent generative AI framework for software development automation

Cannot Refute

[17] Automating Agile Workflows: The Role of Multi-Agent LLMs in Modern Software Engineering

Cannot Refute

[18] Towards modeling human-agentic collaborative workflows: A BPMN extension

Cannot Refute

[19] EvoAgentX: An automated framework for evolving agentic workflows

Cannot Refute

[20] The first twenty years of agent-based software development with JADE

Cannot Refute

[21] MASFlow: Multi-Agent Based Service Workflow Generation

Cannot Refute
