Abstract:

Neural theorem proving has advanced rapidly in the past year, reaching IMO gold-medalist capabilities and producing formal proofs that span thousands of lines. Although such proofs are mechanically verified by formal systems like Lean, their excessive length renders them difficult for humans to comprehend and limits their usefulness for mathematical insight. Proof simplification is therefore a critical bottleneck. Yet, training data for this task is scarce, and existing methods—mainly agentic scaffolding with off-the-shelf LLMs—struggle with the extremely long proofs generated by RL-trained provers. We introduce ProofOptimizer, the first language model trained to simplify Lean proofs without requiring additional human supervision. ProofOptimizer is trained via expert iteration and reinforcement learning, using Lean to verify simplifications and provide training signal. At inference time, it operates within an iterative proof-shortening workflow, progressively reducing proof length. Experiments show that ProofOptimizer substantially compresses proofs generated by state-of-the-art RL-trained provers on standard benchmarks, reducing proof length by 87% on miniF2F, 57% on PutnamBench, and 50% on Seed-Prover's IMO 2025 proofs. Beyond conciseness, the simplified proofs check faster in Lean and further improve downstream prover performance when reused as training data for supervised finetuning.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes a paper's tasks and contributions against retrieved prior work. While the system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. The results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

ProofOptimizer introduces the first language model trained specifically to simplify Lean proofs without additional human supervision, using expert iteration and reinforcement learning with Lean-based verification as the training signal. The paper sits in the 'Proof Simplification and Compression' leaf under 'Proof Optimization and Repair', a leaf that contains only two papers out of the 50 in the entire taxonomy. Such sparsity marks proof optimization as a notably underexplored direction, suggesting it has received far less attention than proof generation or autoformalization.

The taxonomy reveals that most research effort concentrates in adjacent branches: 'Formal Proof Generation and Verification' contains multiple dense subtopics with 15 papers across six leaves, while 'Autoformalization and Translation' addresses informal-to-formal conversion with seven papers. The 'Proof Repair and Error Correction' sibling leaf focuses on fixing incorrect proofs rather than simplifying correct ones. ProofOptimizer's work diverges from these neighboring directions by assuming correct input proofs and targeting length reduction, rather than initial generation, translation, or error correction.

Among the 30 candidates examined across the three contributions, none were identified as clearly refuting the paper's claims. For the first contribution (ProofOptimizer as the first trained simplification model), 10 candidates were examined with zero refutable matches; the training-methodology and iterative-inference-workflow contributions were each compared against 10 candidates with similar results. This suggests that, within the limited search scope, no prior work directly addresses learning-based proof simplification in Lean, though the small candidate pool means the search cannot be considered exhaustive.

Based on the limited literature search covering 30 semantically similar papers, ProofOptimizer appears to occupy a genuinely sparse research area. The taxonomy structure confirms that proof optimization receives minimal attention compared to proof generation. However, the analysis cannot rule out relevant work outside the top-30 semantic matches or in adjacent communities not captured by this search methodology.

Taxonomy

Core-task taxonomy papers: 50
Claimed contributions: 3
Contribution candidate papers compared: 30
Refutable papers: 0

Research Landscape Overview

Core task: simplifying formal mathematical proofs using language models.

The field has evolved into a rich ecosystem of interconnected branches. Formal Proof Generation and Verification focuses on end-to-end theorem-proving systems that synthesize and check proofs in systems such as Lean or Coq, with works like LeanDojo[10] and Deep Learning Theorem Proving[12] establishing foundational infrastructure. Autoformalization and Translation addresses the challenge of converting informal mathematical statements into formal syntax, exemplified by Autoformalize[18] and Multilingual Autoformalization[25]. Proof Optimization and Repair targets the refinement of existing proofs (making them shorter or more readable, or correcting errors), while Specialized Mathematical Domains tackles specific problem classes such as inequalities or trigonometry. Foundation Models and Pretraining, including Llemma[3], provides the base capabilities that other branches build upon, and Benchmarks and Datasets supplies the evaluation infrastructure that drives progress across all areas.

Within Proof Optimization and Repair, a small but growing cluster of works explores proof simplification and compression, seeking to reduce proof complexity while preserving correctness. ProofOptimizer[0] sits squarely in this niche, emphasizing automated techniques for streamlining formal proofs. Nearby, FVEL[7] addresses related verification and efficiency concerns, though with a somewhat different emphasis on validation workflows. This contrasts with broader proof-generation efforts such as Proof Automation[8] and MLFMF[9], which prioritize discovering new proofs over refining existing ones. The tension between generating correct proofs and making them human-readable or computationally efficient remains a central open question.

ProofOptimizer[0] contributes to this dialogue by focusing specifically on simplification, complementing a landscape in which most attention has centered on proof discovery and autoformalization rather than post-hoc optimization.

Claimed Contributions

ProofOptimizer: first language model trained to simplify Lean proofs without human supervision

The authors present ProofOptimizer, a language model specifically trained for proof simplification in Lean using expert iteration and reinforcement learning, without needing human-annotated simplification data. The model uses Lean's verification to provide training signals and operates within an iterative proof-shortening workflow at inference time.

10 retrieved papers
Training methodology combining expert iteration and reinforcement learning for proof simplification

The authors develop a training approach that combines expert iteration (where the model proposes simplifications verified by Lean and incorporated into training data) and online reinforcement learning (using proof length and correctness as reward signals) to enable continual improvement in proof simplification.

10 retrieved papers
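The two training signals described above (verified shortenings feeding expert iteration, and a length-plus-correctness reward for online RL) can be sketched in a few lines of Python. This is a self-contained toy, not the paper's implementation: `toy_verifies` and `toy_propose` stand in for the Lean checker and the trained model, a "proof" is just a list of tactic lines, and the exact reward shaping is an illustrative guess.

```python
import random

# Toy stand-ins (assumptions, not the paper's actual interfaces): a "proof"
# is a list of tactic lines, and a proof "verifies" iff it still ends with
# its closing tactic -- a deliberately trivial surrogate for the Lean checker.
def toy_verifies(proof):
    return bool(proof) and proof[-1] == "exact h"

def toy_propose(proof, rng):
    """Stand-in for sampling a rewrite from the model: drop one random
    non-final line."""
    if len(proof) <= 1:
        return proof
    i = rng.randrange(len(proof) - 1)
    return proof[:i] + proof[i + 1:]

def collect_expert_data(proofs, k=8, seed=0):
    """Expert-iteration data collection: sample k candidate rewrites per
    proof and keep verified strict shortenings as (original, simplified)
    training pairs for the next round of finetuning."""
    rng = random.Random(seed)
    pairs = []
    for proof in proofs:
        candidates = [toy_propose(proof, rng) for _ in range(k)]
        valid = [c for c in candidates
                 if len(c) < len(proof) and toy_verifies(c)]
        if valid:
            pairs.append((proof, min(valid, key=len)))
    return pairs

def rl_reward(candidate, original):
    """Online-RL reward: zero for unverified candidates, otherwise the
    fraction of the original length removed. The paper's actual reward
    shaping may differ; this is purely illustrative."""
    if not toy_verifies(candidate):
        return 0.0
    return max(0.0, 1.0 - len(candidate) / max(len(original), 1))
```

In the real system, `toy_propose` would be replaced by sampling from the trained LM and `toy_verifies` by invoking the Lean checker; the structure of the loop (propose, verify, keep verified shortenings as training data) is what carries over.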
Iterative proof-shortening inference workflow

The authors introduce an inference-time algorithm that iteratively applies the model to progressively shorten proofs by sampling multiple candidate simplifications and repeatedly applying the model to the currently shortest proof, achieving substantial compression on benchmark datasets.

10 retrieved papers
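The inference-time workflow described above can be sketched as a best-first loop. The following self-contained Python toy replaces the trained model and the Lean checker with trivial stand-ins (dropping a random line, and checking that the proof still ends with its closing tactic), so only the control flow mirrors the described algorithm, not the paper's actual interfaces.

```python
import random

def shorten_iteratively(proof, propose, verifies, rounds=5, samples=4):
    """Iterative proof shortening: each round, sample several candidate
    rewrites of the current best proof, keep only verified strict
    shortenings, and continue from the shortest. `propose` and `verifies`
    stand in for the trained model and the Lean checker."""
    best = proof
    for _ in range(rounds):
        candidates = [propose(best) for _ in range(samples)]
        valid = [c for c in candidates
                 if len(c) < len(best) and verifies(c)]
        if not valid:
            break  # no verified improvement this round
        best = min(valid, key=len)
    return best

# Toy demo: a "proof" is a list of tactic lines; a proof "verifies" iff it
# still ends with its closing tactic (a trivial surrogate for Lean).
rng = random.Random(0)

def drop_one_line(proof):
    if len(proof) <= 1:
        return proof
    i = rng.randrange(len(proof) - 1)  # never drop the closing line
    return proof[:i] + proof[i + 1:]

long_proof = ["have h%d : True := trivial" % n for n in range(6)] + ["exact h"]
short_proof = shorten_iteratively(
    long_proof, drop_one_line, lambda p: bool(p) and p[-1] == "exact h")
```

Because candidates are only accepted when they are strictly shorter and still verify, the loop can only move from one valid proof to a shorter valid proof, and it terminates as soon as a round yields no verified improvement.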

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

ProofOptimizer: first language model trained to simplify Lean proofs without human supervision

The authors present ProofOptimizer, a language model specifically trained for proof simplification in Lean using expert iteration and reinforcement learning, without needing human-annotated simplification data. The model uses Lean's verification to provide training signals and operates within an iterative proof-shortening workflow at inference time.

Contribution

Training methodology combining expert iteration and reinforcement learning for proof simplification

The authors develop a training approach that combines expert iteration (where the model proposes simplifications verified by Lean and incorporated into training data) and online reinforcement learning (using proof length and correctness as reward signals) to enable continual improvement in proof simplification.

Contribution

Iterative proof-shortening inference workflow

The authors introduce an inference-time algorithm that iteratively applies the model to progressively shorten proofs by sampling multiple candidate simplifications and repeatedly applying the model to the currently shortest proof, achieving substantial compression on benchmark datasets.