Premise Selection for a Lean Hammer
Overview
Overall Novelty Assessment
The paper introduces LeanPremise, a neural premise selection system, and LeanHammer, an end-to-end hammer for Lean. The work sits in the 'Lean-Based Systems' leaf under 'Retrieval-Augmented Proof Assistants', a leaf containing four papers in total. This leaf belongs to a moderately populated branch of integrated systems for proof assistants, suggesting an active but not overcrowded research area. The work targets a specific niche: combining premise selection with translation and proof reconstruction into a unified hammer tool for Lean's dependent type theory.
The taxonomy reveals neighboring leaves for Coq-Based Systems (three papers) and Other Proof Assistant Systems (four papers), indicating parallel efforts across different proof assistants. The parent category 'Retrieval-Augmented Proof Assistants' excludes automated theorem provers without interactive assistance, clarifying that this work emphasizes integration with user workflows rather than standalone automation. Sibling papers like LeanDojo and Lean Copilot focus on data extraction and tactic generation respectively, while this work emphasizes premise retrieval tailored for hammer use, suggesting complementary rather than overlapping goals.
Among the 26 candidates examined, the analysis found one potentially refuting pair for the 'end-to-end hammer' contribution (six candidates examined), while the neural premise selection system (ten candidates examined) and the hammer-aware data extraction techniques (ten candidates examined) showed no clear refutations. Because the search was limited to top-K semantic matches rather than exhaustive coverage, these statistics are only indicative: within this sample the premise selection component appears more novel, while the hammer integration claim faces at least one challenge from prior work, though the scale of examination remains modest.
Given the limited literature search (26 candidates from semantic retrieval), the work appears to occupy a recognizable but not densely populated research direction. The taxonomy structure suggests this is an active area with established sibling systems, yet the specific combination of hammer-aware training and dynamic context adaptation may differentiate it. A broader search would be needed to assess whether similar hammer implementations exist outside the top-K matches examined here.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors develop LEANPREMISE, a neural premise selection tool specifically designed for use with a hammer in dependent type theory. Unlike existing Lean premise selectors, it is trained for hammer integration and dynamically adapts to user-specific contexts, enabling effective recommendation of premises from libraries outside its training data as well as user-defined local lemmas.
The authors combine LEANPREMISE with Lean-auto (translation tool), Duper (proof-producing tactic), and Aesop (proof search tool) to create LEANHAMMER, which is the first domain-general hammer for the Lean proof assistant. This unified pipeline integrates premise selection, translation to external automatic theorem provers, and proof reconstruction.
The authors develop novel data extraction methods specifically designed for hammer integration, including normalized signature serialization, extraction from both term-style and tactic-style proofs, collection of implicit premises from automation, and training the model to select premises for closing goals rather than just modifying them.
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[4] LeanDojo: Theorem Proving with Retrieval-Augmented Language Models
[6] Lean copilot: Large language models as copilots for theorem proving in lean
[29] Machine-Learned Premise Selection for Lean
Contribution Analysis
Detailed comparisons for each claimed contribution
LEANPREMISE: Neural premise selection system for Lean hammer
The authors develop LEANPREMISE, a neural premise selection tool specifically designed for use with a hammer in dependent type theory. Unlike existing Lean premise selectors, it is trained for hammer integration and dynamically adapts to user-specific contexts, enabling effective recommendation of premises from libraries outside its training data as well as user-defined local lemmas.
[23] Learning proof search in proof assistants
[51] Hammer for Coq: Automation for dependent type theory
[52] Integrating Deep Neural Networks with Dependent Type Semantics
[53] Towards neural synthesis for smt-assisted proof-oriented programming
[54] Holist: An environment for machine learning of higher order logic theorem proving
[55] Neural Networks for Mathematical Reasoning – Evaluations, Capabilities, and Techniques
[56] Learning Structure-Aware Representations of Dependent Types
[57] Proof searching and prediction in HOL4 with evolutionary/heuristic and deep learning techniques
[58] Learning-Assisted Reasoning within Proof Assistants via Symbolic, Statistical, and Neural Guidance
[59] Dependent type networks: a probabilistic logic via the curry-howard correspondence in a system of probabilistic dependent types
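The retrieval step behind a neural premise selector like LeanPremise can be sketched as similarity search over premise embeddings. The sketch below is a toy illustration under stated assumptions: it substitutes a bag-of-words "encoder" for the paper's learned model, and the function names (`embed`, `select_premises`) and the example library are hypothetical, not LeanPremise's actual API.

```python
import math
from collections import Counter

# Toy stand-in for a neural encoder: bag-of-words token counts.
# (A real premise selector uses a learned encoder; this is only an
# illustration of the retrieval/ranking step.)
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_premises(goal: str, premises: dict[str, str], k: int = 2) -> list[str]:
    """Rank premise statements by similarity to the goal; return top-k names."""
    g = embed(goal)
    ranked = sorted(premises, key=lambda name: cosine(g, embed(premises[name])),
                    reverse=True)
    return ranked[:k]

# Hypothetical mini-library; in practice this would include Mathlib
# declarations plus user-defined local lemmas.
library = {
    "Nat.add_comm": "forall a b : Nat, a + b = b + a",
    "Nat.mul_comm": "forall a b : Nat, a * b = b * a",
    "List.length_append": "forall l1 l2, (l1 ++ l2).length = l1.length + l2.length",
}
print(select_premises("a + b = b + a", library))
# → ['Nat.add_comm', 'Nat.mul_comm']
```

Because ranking happens at query time over whatever library is indexed, this shape also accommodates premises outside the training data, which is the adaptation property the contribution emphasizes.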
LEANHAMMER: First end-to-end domain-general hammer for Lean
The authors combine LEANPREMISE with Lean-auto (translation tool), Duper (proof-producing tactic), and Aesop (proof search tool) to create LEANHAMMER, which is the first domain-general hammer for the Lean proof assistant. This unified pipeline integrates premise selection, translation to external automatic theorem provers, and proof reconstruction.
[51] Hammer for Coq: Automation for dependent type theory
[38] Automated Theorem Proving for Metamath
[43] Premise Selection and External Provers for HOL4
[46] The Isabelle ENIGMA
[64] Language Models for Verifiable Mathematical Automation Interaction, Integration, and Autoformalization
[65] Goal translation for a hammer for Coq
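The hammer pipeline described in this contribution (premise selection, translation via Lean-auto, external ATP search, reconstruction with Duper and Aesop) can be sketched as a staged control flow. All function and field names below are hypothetical stand-ins rather than LeanHammer's actual API; only the four-stage shape is taken from the text.

```python
from dataclasses import dataclass

@dataclass
class AtpResult:
    # Unsat-core premises the external prover actually needed.
    used_premises: list

def run_hammer(goal, library, select, translate, prove, reconstruct, k=32):
    """Hedged sketch of hammer control flow; stage names are illustrative."""
    premises = select(goal, library, k)   # 1. premise selection
    problem = translate(goal, premises)   # 2. encode goal + premises (Lean-auto's role)
    result = prove(problem)               # 3. call an external ATP
    if result is None:
        return None                       # ATP gave up within its time limit
    # 4. reconstruct a kernel-checked proof from the ATP's unsat core
    #    (a proof-producing tactic like Duper plays this role)
    return reconstruct(goal, result.used_premises)

# Toy stubs standing in for the real components.
toy_select = lambda goal, lib, k: lib[:k]
toy_translate = lambda goal, prems: (goal, prems)
toy_prove = lambda problem: AtpResult(used_premises=problem[1][:1])
toy_reconstruct = lambda goal, prems: f"proof of {goal} via {prems}"

print(run_hammer("a + 0 = a", ["Nat.add_zero", "Nat.zero_add"],
                 toy_select, toy_translate, toy_prove, toy_reconstruct))
# → proof of a + 0 = a via ['Nat.add_zero']
```

Reconstructing from the unsat core rather than from all selected premises is what keeps the final in-assistant proof small even when the selector passes dozens of candidates to the ATP.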
Hammer-aware data extraction techniques
The authors develop novel data extraction methods specifically designed for hammer integration, including normalized signature serialization, extraction from both term-style and tactic-style proofs, collection of implicit premises from automation, and training the model to select premises for closing goals rather than just modifying them.
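One of these extraction ideas, training on premises used at steps that close a goal rather than merely modify it, can be illustrated as a small filter over proof-trace records. The record fields (`goal`, `premises_used`, `goals_after`) and the example trace below are assumptions for illustration, not the paper's actual extraction format.

```python
# Hedged sketch: keep only (goal, premise) training pairs from proof
# steps that CLOSED their goal, so the model learns "which premises
# finish this goal", not merely "which premises appear nearby".

def extract_training_pairs(proof_steps):
    """proof_steps: dicts with 'goal', 'premises_used', 'goals_after'."""
    pairs = []
    for step in proof_steps:
        if step["goals_after"] == 0:  # the step left no remaining goals
            for premise in step["premises_used"]:
                pairs.append((step["goal"], premise))
    return pairs

trace = [
    {"goal": "a + b = b + a", "premises_used": ["Nat.add_comm"], "goals_after": 0},
    # And.intro splits the goal into two subgoals, so it is filtered out:
    {"goal": "P ∧ Q", "premises_used": ["And.intro"], "goals_after": 2},
]
print(extract_training_pairs(trace))
# → [('a + b = b + a', 'Nat.add_comm')]
```

The same filter would apply uniformly whether the underlying step came from a term-style proof, a tactic-style proof, or implicit premises surfaced by automation, which is why the extraction methods above feed a single training objective.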