Learning Admissible Heuristics for A*: Theory and Practice

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Admissible Heuristics, A* Search Algorithm, Optimal Search, Generalization Guarantees, Rubik's Cube
Abstract:

Heuristic functions are central to the performance of search algorithms such as A*, where admissibility (the property of never overestimating the true shortest-path cost) guarantees solution optimality. Recent deep learning approaches often disregard full admissibility and provide limited guarantees on generalization beyond the training data. We address both of these limitations. First, we pose heuristic learning as a constrained optimization problem and introduce Cross-Entropy Admissibility (CEA), a loss function that enforces admissibility during training. When evaluated on the Rubik's Cube domain, our method yields heuristics with near-perfect admissibility and significantly stronger guidance than compressed pattern database (PDB) heuristics. On the theoretical side, we derive a new upper bound on the expected suboptimality of A*. By leveraging PDB abstractions and the structural properties of graphs such as the Rubik's Cube, we tighten the bound on the number of training samples needed for A* to generalize to unseen states. Replacing a general hypothesis class with a ReLU neural network gives bounds that depend primarily on the network's width and depth, rather than on graph size. Using the same network, we also provide the first generalization guarantees for goal-dependent heuristics.
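The admissibility property defined in the abstract can be made concrete with a small check: a heuristic h is admissible when h(s) never exceeds the exact cost-to-goal d(s, goal). The toy graph and heuristic tables below are illustrative assumptions, not taken from the paper:

```python
import heapq

# Toy weighted digraph (illustrative only, not from the paper).
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 1, "G": 5},
    "C": {"G": 1},
    "G": {},
}

def dijkstra_cost_to_goal(graph, goal):
    """Exact cost-to-goal d(s, goal) for every state, via Dijkstra on the reversed graph."""
    rev = {s: {} for s in graph}
    for s, nbrs in graph.items():
        for t, w in nbrs.items():
            rev[t][s] = w
    dist = {goal: 0}
    pq = [(0, goal)]
    while pq:
        d, s = heapq.heappop(pq)
        if d > dist.get(s, float("inf")):
            continue  # stale queue entry
        for t, w in rev[s].items():
            if d + w < dist.get(t, float("inf")):
                dist[t] = d + w
                heapq.heappush(pq, (d + w, t))
    return dist

def is_admissible(h, graph, goal):
    """h is admissible iff h(s) <= d(s, goal) for every state s."""
    dist = dijkstra_cost_to_goal(graph, goal)
    return all(h[s] <= dist[s] for s in graph)

h_ok = {"A": 2, "B": 2, "C": 1, "G": 0}   # never overestimates
h_bad = {"A": 4, "B": 2, "C": 1, "G": 0}  # h(A) = 4 > d(A, G) = 3

assert is_admissible(h_ok, graph, "G")
assert not is_admissible(h_bad, graph, "G")
```

With an admissible h, A* is guaranteed to return a shortest path; h_bad could steer A* away from the optimal route through B.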

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholar search engine). It analyzes a paper's tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. The results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces a Cross-Entropy Admissibility (CEA) loss function to enforce strict admissibility during neural heuristic training and derives tightened sample complexity bounds for A* generalization. It resides in the 'Admissibility Enforcement and Verification' leaf, which contains only four papers total, indicating a relatively sparse research direction within the broader taxonomy of fifty papers. This leaf sits under 'Theoretical Foundations and Guarantees,' distinguishing it from purely empirical methods that often sacrifice admissibility for expressiveness.

The taxonomy reveals neighboring leaves addressing approximate admissibility ('Approximate and Relaxed Heuristics') and generalization theory ('Generalization Theory and Sample Complexity'), both closely related but distinct. The paper bridges these areas by combining strict admissibility enforcement with theoretical generalization guarantees, whereas sibling papers like Neural Admissible Heuristics focus primarily on architectural constraints or post-hoc verification. Empirical learning branches (e.g., 'Supervised Learning Approaches') contain more papers but typically lack the formal guarantees emphasized here, highlighting a methodological divide between theory-driven and data-driven approaches.

Among twenty-five candidates examined, none clearly refute the three main contributions. The CEA loss function (ten candidates examined, zero refutable) and goal-dependent generalization guarantees (ten candidates, zero refutable) appear particularly novel within the limited search scope. The tightened sample complexity bounds (five candidates, zero refutable) also show no direct overlap, though the small candidate pool means this analysis captures top semantic matches rather than an exhaustive literature review. The absence of refutations suggests these contributions occupy underexplored niches, though broader searches might reveal additional related work.

Based on the top-twenty-five semantic matches and the sparse taxonomy leaf, the work appears to make substantive theoretical advances in a less-crowded research direction. The combination of strict admissibility enforcement via CEA and formal generalization bounds distinguishes it from both purely empirical methods and classical abstraction-based heuristics. However, the limited search scope means this assessment reflects proximity to the examined candidates rather than a comprehensive field survey.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 25
Refutable Papers: 0

Research Landscape Overview

Core task: learning admissible heuristics for heuristic search algorithms. The field is organized around six main branches that reflect different facets of this challenge. Theoretical Foundations and Guarantees focuses on ensuring that learned heuristics never overestimate true costs, a property essential for optimality in algorithms like A*; works such as Neural Admissible Heuristics[3] and Admissible Bounds Learning[2] exemplify efforts to enforce or verify admissibility during training. Empirical Learning Methods encompasses data-driven approaches—ranging from deep reinforcement learning (Deep RL Heuristics[5]) to imitation-based techniques (Imitation Heuristic Learning[13])—that extract heuristic functions from experience. Classical Heuristic Design revisits traditional relaxation and abstraction methods (e.g., Delete-Relaxation Lifted[16], Domain-Independent Heuristics[14]), while Search Algorithm Integration examines how learned heuristics interact with search procedures like bidirectional A* (Bidirectional A-Star[30]) or policy-guided variants (Policy-Guided Search[29]). Specialized Problem Classes targets settings with unique structure, such as POMDPs (HSVI POMDPs[6]) or kinodynamic planning (Kinodynamic Heuristic Verification[24]), and Application Domains explores deployment in robotics (Mobile Robot Heuristics[15]), routing (Routing Heuristic Function[21]), and games (StarCraft Heuristic Analysis[50]).

A particularly active line of work centers on reconciling the expressiveness of neural networks with the strict admissibility requirement. Neural Admissible Heuristics[3] and Admissible Bounds Learning[2] both tackle the challenge of constraining learned models to produce valid lower bounds, yet they differ in how tightly they integrate verification into the training loop versus post-hoc certification.
Learning Admissible Heuristics[0] sits squarely within this Theoretical Foundations cluster, emphasizing admissibility enforcement and verification mechanisms that guarantee optimality. Compared to Neural Admissible Heuristics[3], which may prioritize architectural design for admissibility, and Kinodynamic Heuristic Verification[24], which addresses motion-planning constraints, Learning Admissible Heuristics[0] appears to offer a more general framework for ensuring that diverse learning methods yield provably safe heuristics. This positioning highlights an ongoing tension: balancing the flexibility of modern learning techniques with the rigorous guarantees that classical search algorithms demand.

Claimed Contributions

Cross-Entropy Admissibility (CEA) loss function

The authors propose a novel loss function called Cross-Entropy Admissibility (CEA) that enforces admissibility constraints during training of heuristic functions. This loss function is designed to learn near-admissible heuristics that provide stronger guidance than traditional compressed pattern database heuristics.

10 retrieved papers
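The report does not reproduce the actual CEA formula. As a purely illustrative sketch of the idea it describes (penalizing overestimation, which breaks admissibility, far more heavily than underestimation, which merely weakens guidance), one might write an asymmetric regression loss like the hypothetical cea_style_loss below:

```python
import numpy as np

def cea_style_loss(h_pred, h_star, overestimate_weight=10.0):
    """Hypothetical admissibility-enforcing loss sketch (NOT the paper's CEA).

    Overestimation (h_pred > h_star) violates admissibility and is weighted
    heavily; underestimation is safe but loose and is penalized lightly.
    """
    h_pred = np.asarray(h_pred, dtype=float)
    h_star = np.asarray(h_star, dtype=float)
    under = np.maximum(h_star - h_pred, 0.0)  # safe but weak guidance
    over = np.maximum(h_pred - h_star, 0.0)   # breaks admissibility
    return float(np.mean(under**2 + overestimate_weight * over**2))

h_star = [3.0, 2.0, 1.0]
# Underestimating by 1 is penalized far less than overestimating by 1.
assert cea_style_loss([2.0, 2.0, 1.0], h_star) < cea_style_loss([4.0, 2.0, 1.0], h_star)
```

An asymmetric penalty of this shape pushes the learned heuristic toward the largest value that stays below h*, i.e., a tight lower bound.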
Tightened sample complexity bounds for learning heuristics

The authors derive improved theoretical bounds on the number of training samples required for A* to generalize effectively. They achieve tighter bounds by exploiting pattern database abstractions and structural graph properties, with bounds that depend on network architecture rather than graph size when using neural networks.

5 retrieved papers
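The report does not state the bound itself. For orientation, sample-complexity results of this kind usually instantiate a standard uniform-convergence template; the display below is that generic template, not the paper's theorem:

```latex
% Generic uniform-convergence template (not the paper's exact bound):
% with probability at least 1 - \delta over m i.i.d. training states,
% every heuristic h in the class \mathcal{H} satisfies
\[
  \mathbb{E}\big[\mathrm{subopt}(h)\big]
  \;\le\;
  \widehat{\mathrm{subopt}}_m(h)
  \;+\;
  O\!\left(\sqrt{\frac{\mathrm{Pdim}(\mathcal{H}) + \log(1/\delta)}{m}}\right),
\]
% where \mathrm{Pdim}(\mathcal{H}) is the pseudo-dimension of the class.
% For a ReLU network, \mathrm{Pdim} scales with width and depth, which is
% consistent with the claim that the bounds depend on architecture rather
% than graph size.
```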
First generalization guarantees for goal-dependent heuristics

The authors establish the first theoretical generalization guarantees for heuristics that can adapt to different goal states within the same graph, extending beyond the traditional fixed-goal setting.

10 retrieved papers
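To make the fixed-goal versus goal-dependent distinction concrete: a goal-dependent heuristic takes an encoding of the goal as an extra input, so a single model covers every goal in the graph, whereas a fixed-goal heuristic h(s) bakes one goal into its parameters. The toy linear "network" below is a hypothetical illustration, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model over concat(state, goal); purely illustrative.
W = rng.normal(size=(8,))

def h_goal_dependent(state_vec, goal_vec):
    """Goal-dependent heuristic h(s, g): the goal is an input, not a constant."""
    x = np.concatenate([state_vec, goal_vec])
    return float(np.maximum(W @ x, 0.0))  # ReLU keeps the estimate nonnegative

s = rng.normal(size=4)
g1, g2 = rng.normal(size=4), rng.normal(size=4)
# The same state can receive different estimates for different goals.
print(h_goal_dependent(s, g1), h_goal_dependent(s, g2))
```

A generalization guarantee for this setting must hold uniformly over (state, goal) pairs, which is why it is a strictly harder statement than the fixed-goal case.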
