Abstract:

A central challenge in large-scale decision-making under incomplete information is the estimation of reliable probabilities. Prior work has employed Large Language Models (LLMs) to generate relevant factors and provide initial, coarse-grained probability estimates. These methods typically utilize an LLM for forward abduction to generate factors, with each factor constrained to two mutually exclusive attributes. A Naïve Bayes model is then trained on combinations of these factors to provide more accurate probabilities. However, this approach often yields a sparse factor space, resulting in "unknown" predictions where the model fails to produce an output. Naively increasing the number of factors to densify the space not only introduces statistical noise but also violates the Naïve Bayes independence assumption, ultimately compromising the stability and reliability of the estimates. To address these limitations, we propose Harbor, a novel inference framework that orchestrates aggregated Bayesian inference over a hierarchically structured factor space. Harbor first constructs a dense, structured factor space through iterative generation and hierarchical clustering. It then performs context-aware mapping using retrieval and refinement operations on this hierarchy to reduce "unknown" predictions. Finally, Harbor extends Naïve Bayes by incorporating a Causal Bayesian Network to model latent dependencies, thereby relaxing the strict independence assumption. Experiments show that Harbor substantially reduces "unknown" predictions and yields more reliable probabilities than direct LLM baselines, achieving state-of-the-art performance with significantly reduced time and token overhead.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes Harbor, a framework for reliable probability estimation under incomplete information using hierarchical Bayesian inference over structured factor spaces. It resides in the 'Hierarchical Bayesian and Factor-Based Inference' leaf, which contains only three papers total, including Harbor itself. This makes it a relatively sparse research direction within the broader taxonomy of 37 papers. The sibling works—Fine-grained Probability Estimation and BIRD—similarly decompose complex queries into structured sub-problems, suggesting Harbor operates in a niche but emerging area focused on principled probabilistic reasoning rather than post-hoc calibration.

The taxonomy reveals that Harbor's approach diverges from the more populated 'Calibration Methods' branch (14 papers across three leaves) and 'Uncertainty Quantification' branch (10 papers across four leaves). While calibration methods like Batch Calibration and Prototypical Calibration adjust model outputs without explicit probabilistic structure, Harbor constructs hierarchical factor graphs to propagate uncertainty. The 'Domain-Specific Applications' branch (12 papers) addresses grounding in imperfect knowledge but typically lacks the formal Bayesian machinery Harbor employs. This positioning suggests Harbor bridges structured inference with practical incomplete-information scenarios, occupying a distinct methodological niche.

Among 30 candidates examined, none clearly refute any of Harbor's three core contributions: hierarchical factor-space construction (10 candidates, 0 refutable), causal Bayesian networks for latent dependencies (10 candidates, 0 refutable), and the overall framework with context-aware mapping (10 candidates, 0 refutable). The hierarchical factor construction and Bayesian orchestration appear particularly novel within this limited search scope. However, the sibling papers in the same taxonomy leaf share conceptual overlap in structured decomposition, indicating that while Harbor's specific mechanisms may be new, the general philosophy of factor-based probabilistic reasoning has precedent in this small cluster.

Based on the top-30 semantic matches examined, Harbor appears to introduce novel technical mechanisms—iterative bottom-up abduction, hierarchical clustering of factors, and aggregated Bayesian inference—that distinguish it from both lightweight calibration methods and existing structured inference approaches. The sparse population of its taxonomy leaf and absence of clear refutations suggest meaningful contribution, though the limited search scope leaves open the possibility of overlooked prior work in broader probabilistic AI or decision-making literature not captured by this semantic search.

Taxonomy

Core-task Taxonomy Papers: 37
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: Reliable probability estimation from large language models under incomplete information.

The field organizes around several complementary branches. Calibration Methods for Language Model Outputs focus on post-hoc adjustments and training-time interventions to align model confidence with accuracy, including techniques like Batch Calibration[1] and Calibration In-Context Learning[3]. Uncertainty Quantification and Awareness explores how models represent and communicate their epistemic limits, with works such as Know the Unknown[5] and Rethinking Uncertainty[18] examining awareness mechanisms. Structured Probabilistic Inference and Bayesian Approaches leverage formal probabilistic frameworks to handle incomplete information through hierarchical models and factor-based reasoning. Domain-Specific Applications and Grounding address calibration challenges in specialized contexts like medical diagnosis or scientific reasoning, while Visualization, Interface Design, and Foundational Surveys provide user-facing tools and comprehensive overviews of the landscape.

Within the structured inference branch, a handful of works pursue rigorous probabilistic modeling to manage missing or uncertain inputs. HARBOR[0] sits squarely in this space, employing hierarchical Bayesian reasoning to estimate probabilities when context is incomplete, closely aligning with Fine-grained Probability Estimation[13] and BIRD[20], which similarly decompose complex queries into structured sub-problems. These approaches contrast with calibration-focused methods like Calibration Pretrained Models[2] or Prototypical Calibration[7], which adjust outputs without explicit probabilistic structure. A central tension across branches involves the trade-off between the computational overhead of full Bayesian inference and the simplicity of post-hoc recalibration. HARBOR[0] emphasizes principled uncertainty propagation through factor graphs, distinguishing it from lighter-weight calibration techniques while sharing the structured decomposition philosophy of neighboring works in its taxonomy cluster.

Claimed Contributions

Hierarchical factor-space construction via iterative bottom-up abduction

The authors propose a bottom-up abduction strategy that iteratively generates comprehensive factors through contextual sentence generation and factor harvesting, then organizes them into a two-level hierarchy using clustering and LLM-guided theming. This approach addresses the sparsity problem in existing forward abduction methods.

10 retrieved papers
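As a rough illustration of how such a two-level hierarchy might be built (every name and number below is invented for illustration, not taken from the paper): generated factors are embedded, clustered greedily by similarity, and each cluster becomes a theme. The bag-of-words embedding and the "theme_i" labels stand in for a real sentence encoder and LLM-guided theming.

```python
# Toy sketch of two-level factor-space construction: cluster factors,
# then label each cluster as a theme. Embedding, threshold, and theme
# names are all illustrative stand-ins, not the paper's mechanisms.
import math

VOCAB = ["rain", "cloud", "wind", "economy", "market", "policy"]

def embed(text):
    # Toy bag-of-words vector over a tiny fixed vocabulary (assumption).
    return [text.lower().split().count(w) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def cluster_factors(factors, threshold=0.5):
    # Greedy single-pass clustering: join the first cluster whose seed
    # vector is similar enough, otherwise open a new cluster.
    clusters = []  # list of (seed_vector, member_factors)
    for f in factors:
        v = embed(f)
        for seed, members in clusters:
            if cosine(seed, v) >= threshold:
                members.append(f)
                break
        else:
            clusters.append((v, [f]))
    # Second level: one theme label per cluster (stand-in for LLM theming).
    return {f"theme_{i}": members for i, (_, members) in enumerate(clusters)}

factors = ["heavy rain expected", "cloud cover increasing",
           "market volatility rising", "new policy on market entry"]
hierarchy = cluster_factors(factors)
```

In this sketch the two market-related factors land under one theme while the weather factors stay separate, giving the dense-but-organized factor space the contribution describes.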
Causal Bayesian Network for modeling latent factor dependencies

The framework introduces a Latent-Augmented Causal Bayesian Network that uses an LLM as a causal discovery engine to identify latent variables and partition the factors among them. This relaxes the strict conditional independence assumption of Naïve Bayes: factors are assumed independent only when conditioned on their latent parent, rather than on the outcome alone.

10 retrieved papers
HARBOR framework with context-aware hierarchical mapping and Bayesian orchestration

HARBOR is a three-stage framework combining hierarchical factor-space construction; context-aware mapping through coarse-to-fine retrieval with self-consistent filtering and reflective refinement; and probabilistic inference that aggregates the predictions of the Naïve Bayes and Causal Bayesian Network models with a Linear Opinion Pool.

10 retrieved papers
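The final aggregation step is a standard Linear Opinion Pool, i.e. a convex combination of the two models' probability estimates. The example weights and probabilities below are illustrative assumptions, not values reported by the paper.

```python
# Linear Opinion Pool: weighted average of the Naive Bayes and Causal
# Bayesian Network estimates. Weights must form a convex combination.
def linear_opinion_pool(estimates, weights):
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(w * p for w, p in zip(weights, estimates))

# Hypothetical component estimates: p_nb = 0.72, p_cbn = 0.64, equal weights.
p_final = linear_opinion_pool([0.72, 0.64], [0.5, 0.5])  # ~0.68
```

Because the pool is a convex combination, the aggregate always lies between the two component estimates, which gives the framework a simple hedge when the models disagree.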

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Hierarchical factor-space construction via iterative bottom-up abduction

The authors propose a bottom-up abduction strategy that iteratively generates comprehensive factors through contextual sentence generation and factor harvesting, then organizes them into a two-level hierarchy using clustering and LLM-guided theming. This approach addresses the sparsity problem in existing forward abduction methods.

Contribution

Causal Bayesian Network for modeling latent factor dependencies

The framework introduces a Latent-Augmented Causal Bayesian Network that uses an LLM as a causal discovery engine to identify latent variables and partition the factors among them. This relaxes the strict conditional independence assumption of Naïve Bayes: factors are assumed independent only when conditioned on their latent parent, rather than on the outcome alone.

Contribution

HARBOR framework with context-aware hierarchical mapping and Bayesian orchestration

HARBOR is a three-stage framework combining hierarchical factor-space construction; context-aware mapping through coarse-to-fine retrieval with self-consistent filtering and reflective refinement; and probabilistic inference that aggregates the predictions of the Naïve Bayes and Causal Bayesian Network models with a Linear Opinion Pool.