Learning Admissible Heuristics for A*: Theory and Practice

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Admissible Heuristics, A* Search Algorithm, Optimal Search, Generalization Guarantees, Rubik's Cube
Abstract:

Heuristic functions are central to the performance of search algorithms such as A*, where admissibility (the property of never overestimating the true shortest-path cost) guarantees solution optimality. Recent deep learning approaches often disregard full admissibility and provide limited guarantees on generalization beyond the training data. We address both of these limitations. First, we pose heuristic learning as a constrained optimization problem and introduce Cross-Entropy Admissibility (CEA), a loss function that enforces admissibility during training. When evaluated on the Rubik's Cube domain, our method yields heuristics with near-perfect admissibility and significantly stronger guidance than compressed pattern database (PDB) heuristics. On the theoretical side, we derive a new upper bound on the expected suboptimality of A*. By leveraging PDB abstractions and the structural properties of graphs such as the Rubik's Cube, we tighten the bound on the number of training samples needed for A* to generalize to unseen states. Replacing a general hypothesis class with a ReLU neural network gives bounds that depend primarily on the network's width and depth, rather than on graph size. Using the same network, we also provide the first generalization guarantees for goal-dependent heuristics.
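The admissibility property defined in the abstract can be made concrete with a small check: a heuristic h is admissible when h(s) never exceeds the exact cost-to-goal d(s, goal). The toy graph and heuristic tables below are illustrative assumptions, not taken from the paper:

```python
import heapq

# Toy weighted digraph (illustrative only, not from the paper).
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 1, "G": 5},
    "C": {"G": 1},
    "G": {},
}

def dijkstra_cost_to_goal(graph, goal):
    """Exact cost-to-goal d(s, goal) for every state, via Dijkstra on the reversed graph."""
    rev = {s: {} for s in graph}
    for s, nbrs in graph.items():
        for t, w in nbrs.items():
            rev[t][s] = w
    dist = {goal: 0}
    pq = [(0, goal)]
    while pq:
        d, s = heapq.heappop(pq)
        if d > dist.get(s, float("inf")):
            continue  # stale queue entry
        for t, w in rev[s].items():
            if d + w < dist.get(t, float("inf")):
                dist[t] = d + w
                heapq.heappush(pq, (d + w, t))
    return dist

def is_admissible(h, graph, goal):
    """h is admissible iff h(s) <= d(s, goal) for every state s."""
    dist = dijkstra_cost_to_goal(graph, goal)
    return all(h[s] <= dist[s] for s in graph)

h_ok = {"A": 2, "B": 2, "C": 1, "G": 0}   # never overestimates
h_bad = {"A": 4, "B": 2, "C": 1, "G": 0}  # h(A) = 4 > d(A, G) = 3

assert is_admissible(h_ok, graph, "G")
assert not is_admissible(h_bad, graph, "G")
```

With an admissible h, A* is guaranteed to return a shortest path; h_bad could steer A* away from the optimal route through B.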

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholar search engine). It analyzes a paper's tasks and contributions against retrieved prior work. While the system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. The results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
Note that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces a Cross-Entropy Admissibility (CEA) loss function to enforce strict admissibility during neural heuristic training and derives tightened sample complexity bounds for A* generalization. It resides in the 'Admissibility Enforcement and Verification' leaf, which contains only four papers total, indicating a relatively sparse research direction within the broader taxonomy of fifty papers. This leaf sits under 'Theoretical Foundations and Guarantees,' distinguishing it from purely empirical methods that often sacrifice admissibility for expressiveness.

The taxonomy reveals neighboring leaves addressing approximate admissibility ('Approximate and Relaxed Heuristics') and generalization theory ('Generalization Theory and Sample Complexity'), both closely related but distinct. The paper bridges these areas by combining strict admissibility enforcement with theoretical generalization guarantees, whereas sibling papers like Neural Admissible Heuristics focus primarily on architectural constraints or post-hoc verification. Empirical learning branches (e.g., 'Supervised Learning Approaches') contain more papers but typically lack the formal guarantees emphasized here, highlighting a methodological divide between theory-driven and data-driven approaches.

Among twenty-five candidates examined, none clearly refute the three main contributions. The CEA loss function (ten candidates examined, zero refutable) and goal-dependent generalization guarantees (ten candidates, zero refutable) appear particularly novel within the limited search scope. The tightened sample complexity bounds (five candidates, zero refutable) also show no direct overlap, though the small candidate pool means this analysis captures top semantic matches rather than an exhaustive literature review. The absence of refutations suggests these contributions occupy underexplored niches, though broader searches might reveal additional related work.

Based on the top-twenty-five semantic matches and the sparse taxonomy leaf, the work appears to make substantive theoretical advances in a less-crowded research direction. The combination of strict admissibility enforcement via CEA and formal generalization bounds distinguishes it from both purely empirical methods and classical abstraction-based heuristics. However, the limited search scope means this assessment reflects proximity to the examined candidates rather than a comprehensive field survey.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 25
Refutable Papers: 0

Research Landscape Overview

Core task: learning admissible heuristics for heuristic search algorithms. The field is organized around six main branches that reflect different facets of this challenge. Theoretical Foundations and Guarantees focuses on ensuring that learned heuristics never overestimate true costs, a property essential for optimality in algorithms like A*; works such as Neural Admissible Heuristics[3] and Admissible Bounds Learning[2] exemplify efforts to enforce or verify admissibility during training. Empirical Learning Methods encompasses data-driven approaches—ranging from deep reinforcement learning (Deep RL Heuristics[5]) to imitation-based techniques (Imitation Heuristic Learning[13])—that extract heuristic functions from experience. Classical Heuristic Design revisits traditional relaxation and abstraction methods (e.g., Delete-Relaxation Lifted[16], Domain-Independent Heuristics[14]), while Search Algorithm Integration examines how learned heuristics interact with search procedures like bidirectional A* (Bidirectional A-Star[30]) or policy-guided variants (Policy-Guided Search[29]). Specialized Problem Classes targets settings with unique structure, such as POMDPs (HSVI POMDPs[6]) or kinodynamic planning (Kinodynamic Heuristic Verification[24]), and Application Domains explores deployment in robotics (Mobile Robot Heuristics[15]), routing (Routing Heuristic Function[21]), and games (StarCraft Heuristic Analysis[50]).

A particularly active line of work centers on reconciling the expressiveness of neural networks with the strict admissibility requirement. Neural Admissible Heuristics[3] and Admissible Bounds Learning[2] both tackle the challenge of constraining learned models to produce valid lower bounds, yet they differ in how tightly they integrate verification into the training loop versus post-hoc certification.
Learning Admissible Heuristics[0] sits squarely within this Theoretical Foundations cluster, emphasizing admissibility enforcement and verification mechanisms that guarantee optimality. Compared to Neural Admissible Heuristics[3], which may prioritize architectural design for admissibility, and Kinodynamic Heuristic Verification[24], which addresses motion-planning constraints, Learning Admissible Heuristics[0] appears to offer a more general framework for ensuring that diverse learning methods yield provably safe heuristics. This positioning highlights an ongoing tension: balancing the flexibility of modern learning techniques with the rigorous guarantees that classical search algorithms demand.

Claimed Contributions

Cross-Entropy Admissibility (CEA) loss function

The authors propose a novel loss function called Cross-Entropy Admissibility (CEA) that enforces admissibility constraints during training of heuristic functions. This loss function is designed to learn near-admissible heuristics that provide stronger guidance than traditional compressed pattern database heuristics.

10 retrieved papers
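The report does not reproduce the actual CEA formula. As a purely illustrative sketch of the idea it describes (penalizing overestimation, which breaks admissibility, far more heavily than underestimation, which merely weakens guidance), one might write an asymmetric regression loss like the hypothetical cea_style_loss below:

```python
import numpy as np

def cea_style_loss(h_pred, h_star, overestimate_weight=10.0):
    """Hypothetical admissibility-enforcing loss sketch (NOT the paper's CEA).

    Overestimation (h_pred > h_star) violates admissibility and is weighted
    heavily; underestimation is safe but loose and is penalized lightly.
    """
    h_pred = np.asarray(h_pred, dtype=float)
    h_star = np.asarray(h_star, dtype=float)
    under = np.maximum(h_star - h_pred, 0.0)  # safe but weak guidance
    over = np.maximum(h_pred - h_star, 0.0)   # breaks admissibility
    return float(np.mean(under**2 + overestimate_weight * over**2))

h_star = [3.0, 2.0, 1.0]
# Underestimating by 1 is penalized far less than overestimating by 1.
assert cea_style_loss([2.0, 2.0, 1.0], h_star) < cea_style_loss([4.0, 2.0, 1.0], h_star)
```

An asymmetric penalty of this shape pushes the learned heuristic toward the largest value that stays below h*, i.e., a tight lower bound.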
Tightened sample complexity bounds for learning heuristics

The authors derive improved theoretical bounds on the number of training samples required for A* to generalize effectively. They achieve tighter bounds by exploiting pattern database abstractions and structural graph properties, with bounds that depend on network architecture rather than graph size when using neural networks.

5 retrieved papers
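The report does not state the bound itself. For orientation, sample-complexity results of this kind usually instantiate a standard uniform-convergence template; the display below is that generic template, not the paper's theorem:

```latex
% Generic uniform-convergence template (not the paper's exact bound):
% with probability at least 1 - \delta over m i.i.d. training states,
% every heuristic h in the class \mathcal{H} satisfies
\[
  \mathbb{E}\big[\mathrm{subopt}(h)\big]
  \;\le\;
  \widehat{\mathrm{subopt}}_m(h)
  \;+\;
  O\!\left(\sqrt{\frac{\mathrm{Pdim}(\mathcal{H}) + \log(1/\delta)}{m}}\right),
\]
% where \mathrm{Pdim}(\mathcal{H}) is the pseudo-dimension of the class.
% For a ReLU network, \mathrm{Pdim} scales with width and depth, which is
% consistent with the claim that the bounds depend on architecture rather
% than graph size.
```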
First generalization guarantees for goal-dependent heuristics

The authors establish the first theoretical generalization guarantees for heuristics that can adapt to different goal states within the same graph, extending beyond the traditional fixed-goal setting.

10 retrieved papers
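To make the fixed-goal versus goal-dependent distinction concrete: a goal-dependent heuristic takes an encoding of the goal as an extra input, so a single model covers every goal in the graph, whereas a fixed-goal heuristic h(s) bakes one goal into its parameters. The toy linear "network" below is a hypothetical illustration, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model over concat(state, goal); purely illustrative.
W = rng.normal(size=(8,))

def h_goal_dependent(state_vec, goal_vec):
    """Goal-dependent heuristic h(s, g): the goal is an input, not a constant."""
    x = np.concatenate([state_vec, goal_vec])
    return float(np.maximum(W @ x, 0.0))  # ReLU keeps the estimate nonnegative

s = rng.normal(size=4)
g1, g2 = rng.normal(size=4), rng.normal(size=4)
# The same state can receive different estimates for different goals.
print(h_goal_dependent(s, g1), h_goal_dependent(s, g2))
```

A generalization guarantee for this setting must hold uniformly over (state, goal) pairs, which is why it is a strictly harder statement than the fixed-goal case.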
