Kimi-Dev: Agentless Training as Skill Prior for SWE-agents
Overview
Overall Novelty Assessment
The paper proposes a training paradigm that combines workflow-based 'Agentless' methods with multi-turn 'SWE-Agent' frameworks, introducing Kimi-Dev, an open-source model that achieves strong performance on SWE-bench Verified. Within the taxonomy, the work occupies the 'Workflow-to-Agent Transfer Learning' leaf under 'Agentless Skill Prior Training for Code Generation Agents'. Notably, this leaf contains only the paper itself, with no sibling papers identified, suggesting that this formulation of bridging the agentless and agentic paradigms is a sparse or emerging direction within the broader landscape of software engineering agent training.
The taxonomy reveals two main branches: 'Agentless Skill Prior Training' (focused on extracting reusable knowledge from non-agentic workflows) and 'Agent-Based Software Development Simulation' (emphasizing multi-agent environments for studying emergent behaviors). The original paper aligns with the first branch, specifically targeting transfer learning from structured workflows to agent policies. The neighboring 'Team Task Allocation Simulation' leaf addresses the complementary concern of modeling team dynamics rather than individual agent skill acquisition. This positioning indicates that the work diverges from simulation-driven approaches by prioritizing efficient skill extraction from existing code artifacts over the ecological validity of multi-agent interactions.
Across the three identified contributions, the literature search examined twenty candidates in total. For the 'Agentless training recipe' and 'Efficient SWE-Agent adaptation' contributions, ten candidates each were examined, with zero refutable pairs found for either. No candidates were examined for the 'Skill priors framework' contribution in the provided analysis. Given this limited search scope (twenty papers drawn from semantic search and citation expansion), the absence of clearly overlapping prior work suggests that these specific formulations may be novel within the examined sample, though the search scale does not permit definitive claims about the broader literature.
Based on the constrained search of twenty candidates and the sparse taxonomy leaf (no sibling papers), the work appears to occupy a relatively unexplored intersection of workflow-based and agentic training paradigms. However, the analysis explicitly acknowledges its limited scope, examining top-K semantic matches rather than providing exhaustive coverage. The absence of refutable candidates may therefore reflect either genuine novelty in this specific formulation or gaps in the search strategy's coverage of adjacent areas such as reinforcement learning from demonstrations or curriculum learning for code generation.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors develop a comprehensive Agentless training recipe consisting of mid-training, cold-start supervised finetuning, reinforcement learning with outcome-based rewards, and test-time self-play. This recipe produces Kimi-Dev, which achieves state-of-the-art performance among workflow-based methods on SWE-bench Verified.
The authors propose a novel perspective that Agentless training induces transferable skill priors (localization, code editing, self-reflection) which enable efficient adaptation to multi-turn SWE-Agent frameworks. This challenges the dichotomy between workflow-based and agentic approaches by treating them as complementary training stages.
The authors demonstrate that minimal supervised finetuning (5k trajectories) on top of Agentless-trained Kimi-Dev enables competitive SWE-Agent performance. This shows that structured skill priors from Agentless training transfer effectively to end-to-end agentic frameworks with significantly reduced data requirements.
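The recipe's reinforcement learning stage uses outcome-based rewards: a candidate patch earns reward only according to whether the repository's tests pass after it is applied. A minimal sketch of such a binary reward is below; the function names and the toy test runner are illustrative assumptions, not the authors' actual harness (which would involve sandboxed repositories and issue-specific fail-to-pass test selection).

```python
from typing import Callable, Sequence


def outcome_reward(run_tests: Callable[[str], Sequence[bool]], patch: str) -> float:
    """Binary outcome-based reward: 1.0 only when every test passes
    after applying the candidate patch, otherwise 0.0.

    `run_tests` stands in for a real evaluation harness that applies
    the patch to a repository and reports per-test pass/fail results.
    """
    results = run_tests(patch)
    return 1.0 if results and all(results) else 0.0


# Toy stand-in for a real harness: each "test" checks for a substring
# of the patch text, just to exercise the reward logic.
def toy_runner(patch: str) -> list:
    required = ["fix", "test"]
    return [token in patch for token in required]
```

Because the reward depends only on the final outcome rather than intermediate steps, the same signal applies uniformly across the Agentless sub-tasks (localization, editing) during training.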
Core Task Comparisons
Comparisons with papers in the same taxonomy category
Contribution Analysis
Detailed comparisons for each claimed contribution
Agentless training recipe for SWE tasks
The authors develop a comprehensive Agentless training recipe consisting of mid-training, cold-start supervised finetuning, reinforcement learning with outcome-based rewards, and test-time self-play. This recipe produces Kimi-Dev, which achieves state-of-the-art performance among workflow-based methods on SWE-bench Verified.
[2] Kimi k1.5: Scaling reinforcement learning with LLMs
[3] A comparison of reinforcement learning frameworks for software testing tasks
[4] Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning
[5] SWE-RL: Advancing LLM reasoning via reinforcement learning on open software evolution
[6] Open-Reasoner-Zero: An open source approach to scaling up reinforcement learning on the base model
[7] Agent-RLVR: Training Software Engineering Agents via Guidance and Environment Rewards
[8] Kimi K2: Open agentic intelligence
[9] RLocator: Reinforcement learning for bug localization
[10] Reinforcement learning for test case prioritization
[11] The role of Reinforcement Learning in software testing
Skill priors framework bridging Agentless and agentic paradigms
The authors propose a novel perspective that Agentless training induces transferable skill priors (localization, code editing, self-reflection) which enable efficient adaptation to multi-turn SWE-Agent frameworks. This challenges the dichotomy between workflow-based and agentic approaches by treating them as complementary training stages.
Efficient SWE-Agent adaptation from Agentless priors
The authors demonstrate that minimal supervised finetuning (5k trajectories) on top of Agentless-trained Kimi-Dev enables competitive SWE-Agent performance. This shows that structured skill priors from Agentless training transfer effectively to end-to-end agentic frameworks with significantly reduced data requirements.
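The adaptation step this contribution describes amounts to supervised finetuning on a small set of multi-turn agent trajectories. One common way to prepare such data, sketched below under assumptions (the field names and chat format are illustrative, not the authors' actual pipeline), is to flatten each (observation, action) trajectory into per-turn examples so the loss is taken only on the agent's action tokens.

```python
from typing import Dict, List, Sequence, Tuple


def trajectory_to_sft(trajectory: Sequence[Tuple[str, str]]) -> List[Dict]:
    """Flatten one multi-turn (observation, action) trajectory into
    per-turn supervised examples.

    Each example pairs the dialogue history so far with the agent's
    next action as the completion, so a standard SFT loop trains only
    on action tokens while conditioning on all prior turns.
    """
    history: List[Dict[str, str]] = []
    examples: List[Dict] = []
    for observation, action in trajectory:
        history.append({"role": "user", "content": observation})
        examples.append({"prompt": list(history), "completion": action})
        history.append({"role": "assistant", "content": action})
    return examples
```

Under this scheme, 5k trajectories expand into many more per-turn examples, which is one plausible reason a small trajectory budget suffices when strong skill priors (localization, editing, self-reflection) are already in place.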