Kimi-Dev: Agentless Training as Skill Prior for SWE-agents

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: coder LLM, Agentless, SWE-Agent, Reinforcement Learning
Abstract:

Large Language Models (LLMs) are increasingly applied to software engineering (SWE), with SWE-bench as a key benchmark. Solutions are split into SWE-Agent frameworks with multi-turn interactions and workflow-based Agentless methods with single-turn verifiable steps. We argue these paradigms are not mutually exclusive: reasoning-intensive Agentless training induces skill priors, including localization, code editing, and self-reflection, that enable efficient and effective SWE-Agent adaptation. In this work, we first curate the Agentless training recipe and present Kimi-Dev, an open-source SWE LLM achieving 60.4% on SWE-bench Verified, the best among workflow approaches. With additional SFT adaptation on 5k publicly available trajectories, Kimi-Dev powers SWE-Agents to 48.6% pass@1, on par with that of Claude 3.5 Sonnet (241022 version). These results show that structured skill priors from Agentless training can bridge workflow and agentic frameworks for transferable coding agents.
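
For readers unfamiliar with the pass@1 figure quoted in the abstract, the following is a minimal sketch of the widely used unbiased pass@k estimator, which reduces to the fraction of resolved attempts when k = 1. The abstract does not describe the paper's exact evaluation protocol, so the function names (`pass_at_k`, `benchmark_pass_at_k`) and the sample counts below are illustrative assumptions, not the authors' evaluation code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n sampled attempts, c of them correct.

    Uses the standard unbiased estimator 1 - C(n - c, k) / C(n, k).
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect attempts: every size-k subset has a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(per_task: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks, each given as (n attempts, c correct)."""
    return sum(pass_at_k(n, c, k) for n, c in per_task) / len(per_task)


# Hypothetical outcomes for four tasks with one attempt each (not real data).
example_tasks = [(1, 1), (1, 0), (1, 1), (1, 0)]
example_score = benchmark_pass_at_k(example_tasks, k=1)
```

With a single attempt per task, the score is simply the share of tasks resolved, which is how a figure such as 48.6% pass@1 is usually read.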

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a training paradigm that combines workflow-based 'Agentless' methods with multi-turn 'SWE-Agent' frameworks, introducing Kimi-Dev as an open-source model achieving strong performance on SWE-bench Verified. Within the taxonomy, this work occupies the 'Workflow-to-Agent Transfer Learning' leaf under 'Agentless Skill Prior Training for Code Generation Agents'. Notably, this leaf contains only the original paper itself, with no sibling papers identified, suggesting this specific formulation of bridging agentless and agentic paradigms represents a relatively sparse or emerging research direction within the broader software engineering agent training landscape.

The taxonomy reveals two main branches: 'Agentless Skill Prior Training' (focused on extracting reusable knowledge from non-agentic workflows) and 'Agent-Based Software Development Simulation' (emphasizing multi-agent environments for studying emergent behaviors). The original paper aligns with the first branch, specifically targeting transfer learning from structured workflows to agent policies. The neighboring 'Team Task Allocation Simulation' leaf addresses complementary concerns around modeling team dynamics rather than individual agent skill acquisition. This positioning indicates the work diverges from simulation-driven approaches by prioritizing efficient skill extraction from existing code artifacts over ecological validity of multi-agent interactions.

Among the three identified contributions, the literature search examined twenty candidates in total. The 'Agentless training recipe' and 'Efficient SWE-Agent adaptation' contributions each had ten candidates examined, with zero refutable pairs found for either. The 'Skill priors framework' contribution had no candidates examined in the provided analysis. Given this limited search scope (twenty papers from semantic search and citation expansion), the absence of clearly overlapping prior work suggests these specific formulations may be novel within the examined sample, though the search scale does not permit definitive claims about the broader literature landscape.
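
The counts in this paragraph can be reconciled with a short tally. This is only a sketch of the arithmetic: the dictionary layout and variable names are assumptions made for illustration, while the numbers are the ones stated in the report.

```python
# Candidates examined per claimed contribution, as stated in the report.
candidates_examined = {
    "Agentless training recipe": 10,
    "Skill priors framework": 0,
    "Efficient SWE-Agent adaptation": 10,
}

# Refutable pairs found per contribution (none were reported).
refutable_pairs = {name: 0 for name in candidates_examined}

total_examined = sum(candidates_examined.values())
total_refutable = sum(refutable_pairs.values())

# Contributions whose novelty was not checked against any candidate.
unexamined = [name for name, n in candidates_examined.items() if n == 0]
```

The tally makes the caveat concrete: the zero-refutation result rests on twenty candidates, and one of the three contributions was not compared against any retrieved paper at all.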

Based on the constrained search of twenty candidates and the sparse taxonomy leaf (no sibling papers), the work appears to occupy a relatively unexplored intersection of workflow-based and agentic training paradigms. However, the analysis explicitly acknowledges its limited scope, examining top-K semantic matches rather than exhaustive coverage. The absence of refutable candidates may reflect either genuine novelty in this specific formulation or gaps in the search strategy's coverage of related work in adjacent areas such as reinforcement learning from demonstrations or curriculum learning for code generation.

Taxonomy

Core-task Taxonomy Papers: 1
Claimed Contributions: 3
Contribution Candidate Papers Compared: 20
Refutable Papers: 0

Research Landscape Overview

Core task: training software engineering agents through agentless skill priors.

The field structure suggested by this taxonomy divides into two main branches. The first, Agentless Skill Prior Training for Code Generation Agents, focuses on methods that extract reusable knowledge or behavioral patterns from non-agentic workflows (such as static code analysis, compiler feedback, or human demonstrations) and then transfer these priors to train autonomous coding agents. The second branch, Agent-Based Software Development Simulation, emphasizes creating realistic multi-agent environments where software engineering processes (collaboration, debugging, version control) are simulated to study emergent behaviors or to generate training data. These branches reflect complementary perspectives: one distills skill from existing artifacts, while the other builds synthetic ecosystems to observe and learn from agent interactions.

Within the first branch, a particularly active line of work explores workflow-to-agent transfer learning, where structured pipelines or tool-use traces are converted into agent policies. Kimi-Dev[0] exemplifies this approach by leveraging agentless skill priors to bootstrap agent training, sidestepping the need for expensive human-labeled trajectories. In contrast, the second branch includes efforts like Truck Factor Simulation[1], which models team dynamics and knowledge distribution to understand how agents might handle real-world software maintenance scenarios.

The key trade-off across these directions is between the scalability of distilling priors from static or workflow data versus the ecological validity of learning from simulated multi-agent interactions. Kimi-Dev[0] sits squarely in the transfer-learning cluster, emphasizing efficient skill extraction over simulation fidelity, and thus complements simulation-driven methods by offering a more direct path from existing code artifacts to agent capabilities.

Claimed Contributions

Agentless training recipe for SWE tasks

The authors develop a comprehensive Agentless training recipe consisting of mid-training, cold-start supervised finetuning, reinforcement learning with outcome-based rewards, and test-time self-play. This recipe produces Kimi-Dev, which achieves state-of-the-art performance among workflow-based methods on SWE-bench Verified.
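
To illustrate what an outcome-based reward can look like in this setting, here is a minimal sketch that scores a candidate patch by whether its test run fully succeeds. The report does not specify the paper's reward rule, so the all-tests-pass criterion, the function name `outcome_reward`, and the test identifiers are assumptions for illustration only.

```python
from typing import Mapping


def outcome_reward(test_results: Mapping[str, bool]) -> float:
    """Return 1.0 only if every test passed, otherwise 0.0.

    An empty result set earns no reward, so a patch that fails to run
    any tests is not treated as a success. (Assumed rule, not the paper's.)
    """
    if not test_results:
        return 0.0
    return 1.0 if all(test_results.values()) else 0.0


# Hypothetical test outcomes for two candidate patches.
patch_a = {"test_parse": True, "test_roundtrip": True}
patch_b = {"test_parse": True, "test_roundtrip": False}

reward_a = outcome_reward(patch_a)
reward_b = outcome_reward(patch_b)
```

A binary signal of this kind depends only on the final result, not on the intermediate reasoning, which is what distinguishes outcome-based rewards from step-level or process-based ones.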

10 retrieved papers
Skill priors framework bridging Agentless and agentic paradigms

The authors propose a novel perspective that Agentless training induces transferable skill priors (localization, code editing, self-reflection) which enable efficient adaptation to multi-turn SWE-Agent frameworks. This challenges the dichotomy between workflow-based and agentic approaches by treating them as complementary training stages.

0 retrieved papers
Efficient SWE-Agent adaptation from Agentless priors

The authors demonstrate that minimal supervised finetuning (5k trajectories) on top of Agentless-trained Kimi-Dev enables competitive SWE-Agent performance. This shows that structured skill priors from Agentless training transfer effectively to end-to-end agentic frameworks with significantly reduced data requirements.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current top-K core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape, it appears structurally isolated, which is one partial signal of novelty, though that signal is still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Agentless training recipe for SWE tasks

Contribution

Skill priors framework bridging Agentless and agentic paradigms

Contribution

Efficient SWE-Agent adaptation from Agentless priors
