Abstract:

A central challenge in large-scale decision-making under incomplete information is the estimation of reliable probabilities. Prior work has employed Large Language Models (LLMs) to generate relevant factors and provide initial, coarse-grained probability estimates. These methods typically utilize an LLM for forward abduction to generate factors, with each factor constrained to two mutually exclusive attributes. A Naïve Bayes model is then trained on combinations of these factors to provide more accurate probabilities. However, this approach often yields a sparse factor space, resulting in "unknown" predictions where the model fails to produce an output. Naively increasing the number of factors to densify the space not only introduces statistical noise but also violates the Naïve Bayes independence assumption, ultimately compromising the stability and reliability of the estimates. To address these limitations, we propose Harbor, a novel inference framework that orchestrates aggregated Bayesian inference over a hierarchically structured factor space. Harbor first constructs a dense, structured factor space through iterative generation and hierarchical clustering. It then performs context-aware mapping using retrieval and refinement operations on this hierarchy to reduce "unknown" predictions. Finally, Harbor extends Naïve Bayes by incorporating a Causal Bayesian Network to model latent dependencies, thereby relaxing the strict independence assumption. Experiments show that Harbor substantially reduces "unknown" predictions and yields more reliable probabilities than direct LLM baselines, achieving state-of-the-art performance with significantly reduced time and token overhead.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes Harbor, a framework for reliable probability estimation under incomplete information using hierarchical Bayesian inference over structured factor spaces. It resides in the 'Hierarchical Bayesian and Factor-Based Inference' leaf, which contains only three papers total, including Harbor itself. This makes it a relatively sparse research direction within the broader taxonomy of 37 papers. The sibling works—Fine-grained Probability Estimation and BIRD—similarly decompose complex queries into structured sub-problems, suggesting Harbor operates in a niche but emerging area focused on principled probabilistic reasoning rather than post-hoc calibration.

The taxonomy reveals that Harbor's approach diverges from the more populated 'Calibration Methods' branch (14 papers across three leaves) and 'Uncertainty Quantification' branch (10 papers across four leaves). While calibration methods like Batch Calibration and Prototypical Calibration adjust model outputs without explicit probabilistic structure, Harbor constructs hierarchical factor graphs to propagate uncertainty. The 'Domain-Specific Applications' branch (12 papers) addresses grounding in imperfect knowledge but typically lacks the formal Bayesian machinery Harbor employs. This positioning suggests Harbor bridges structured inference with practical incomplete-information scenarios, occupying a distinct methodological niche.

Among 30 candidates examined, none clearly refute any of Harbor's three core contributions: hierarchical factor-space construction (10 candidates, 0 refutable), causal Bayesian networks for latent dependencies (10 candidates, 0 refutable), and the overall framework with context-aware mapping (10 candidates, 0 refutable). The hierarchical factor construction and Bayesian orchestration appear particularly novel within this limited search scope. However, the sibling papers in the same taxonomy leaf share conceptual overlap in structured decomposition, indicating that while Harbor's specific mechanisms may be new, the general philosophy of factor-based probabilistic reasoning has precedent in this small cluster.

Based on the top-30 semantic matches examined, Harbor appears to introduce novel technical mechanisms—iterative bottom-up abduction, hierarchical clustering of factors, and aggregated Bayesian inference—that distinguish it from both lightweight calibration methods and existing structured inference approaches. The sparse population of its taxonomy leaf and absence of clear refutations suggest meaningful contribution, though the limited search scope leaves open the possibility of overlooked prior work in broader probabilistic AI or decision-making literature not captured by this semantic search.

Taxonomy

Core-task Taxonomy Papers: 37
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: Reliable probability estimation from large language models under incomplete information.

The field organizes around several complementary branches. Calibration Methods for Language Model Outputs focus on post-hoc adjustments and training-time interventions to align model confidence with accuracy, including techniques like Batch Calibration[1] and Calibration In-Context Learning[3]. Uncertainty Quantification and Awareness explores how models represent and communicate their epistemic limits, with works such as Know the Unknown[5] and Rethinking Uncertainty[18] examining awareness mechanisms. Structured Probabilistic Inference and Bayesian Approaches leverage formal probabilistic frameworks to handle incomplete information through hierarchical models and factor-based reasoning. Domain-Specific Applications and Grounding address calibration challenges in specialized contexts like medical diagnosis or scientific reasoning, while Visualization, Interface Design, and Foundational Surveys provide user-facing tools and comprehensive overviews of the landscape.

Within the structured inference branch, a handful of works pursue rigorous probabilistic modeling to manage missing or uncertain inputs. HARBOR[0] sits squarely in this space, employing hierarchical Bayesian reasoning to estimate probabilities when context is incomplete, closely aligning with Fine-grained Probability Estimation[13] and BIRD[20], which similarly decompose complex queries into structured sub-problems. These approaches contrast with calibration-focused methods like Calibration Pretrained Models[2] or Prototypical Calibration[7], which adjust outputs without explicit probabilistic structure. A central tension across branches involves the trade-off between the computational overhead of full Bayesian inference and the simplicity of post-hoc recalibration. HARBOR[0] emphasizes principled uncertainty propagation through factor graphs, distinguishing it from lighter-weight calibration techniques while sharing the structured decomposition philosophy of neighboring works in its taxonomy cluster.

Claimed Contributions

Hierarchical factor-space construction via iterative bottom-up abduction

The authors propose a bottom-up abduction strategy that iteratively generates comprehensive factors through contextual sentence generation and factor harvesting, then organizes them into a two-level hierarchy using clustering and LLM-guided theming. This approach addresses the sparsity problem in existing forward abduction methods.

10 retrieved papers
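As a rough illustration of how such a two-level hierarchy might be built (every name and number below is invented for illustration, not taken from the paper): generated factors are embedded, clustered greedily by similarity, and each cluster becomes a theme. The bag-of-words embedding and the "theme_i" labels stand in for a real sentence encoder and LLM-guided theming.

```python
# Toy sketch of two-level factor-space construction: cluster factors,
# then label each cluster as a theme. Embedding, threshold, and theme
# names are all illustrative stand-ins, not the paper's mechanisms.
import math

VOCAB = ["rain", "cloud", "wind", "economy", "market", "policy"]

def embed(text):
    # Toy bag-of-words vector over a tiny fixed vocabulary (assumption).
    return [text.lower().split().count(w) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def cluster_factors(factors, threshold=0.5):
    # Greedy single-pass clustering: join the first cluster whose seed
    # vector is similar enough, otherwise open a new cluster.
    clusters = []  # list of (seed_vector, member_factors)
    for f in factors:
        v = embed(f)
        for seed, members in clusters:
            if cosine(seed, v) >= threshold:
                members.append(f)
                break
        else:
            clusters.append((v, [f]))
    # Second level: one theme label per cluster (stand-in for LLM theming).
    return {f"theme_{i}": members for i, (_, members) in enumerate(clusters)}

factors = ["heavy rain expected", "cloud cover increasing",
           "market volatility rising", "new policy on market entry"]
hierarchy = cluster_factors(factors)
```

In this sketch the two market-related factors land under one theme while the weather factors stay separate, giving the dense-but-organized factor space the contribution describes.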
Causal Bayesian Network for modeling latent factor dependencies

The framework introduces a Latent-Augmented Causal Bayesian Network that uses an LLM as a causal discovery engine to identify latent variables and partition the factors among them. This relaxes the strict conditional independence assumption of Naïve Bayes: factors are assumed independent only when conditioned on their latent parent, rather than on the outcome alone.

10 retrieved papers
HARBOR framework with context-aware hierarchical mapping and Bayesian orchestration

HARBOR is a three-stage framework combining hierarchical factor-space construction; context-aware mapping through coarse-to-fine retrieval with self-consistent filtering and reflective refinement; and probabilistic inference that aggregates the predictions of the Naïve Bayes and Causal Bayesian Network models with a Linear Opinion Pool.

10 retrieved papers
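The final aggregation step is a standard Linear Opinion Pool, i.e. a convex combination of the two models' probability estimates. The example weights and probabilities below are illustrative assumptions, not values reported by the paper.

```python
# Linear Opinion Pool: weighted average of the Naive Bayes and Causal
# Bayesian Network estimates. Weights must form a convex combination.
def linear_opinion_pool(estimates, weights):
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(w * p for w, p in zip(weights, estimates))

# Hypothetical component estimates: p_nb = 0.72, p_cbn = 0.64, equal weights.
p_final = linear_opinion_pool([0.72, 0.64], [0.5, 0.5])  # ~0.68
```

Because the pool is a convex combination, the aggregate always lies between the two component estimates, which gives the framework a simple hedge when the models disagree.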

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Hierarchical factor-space construction via iterative bottom-up abduction

The authors propose a bottom-up abduction strategy that iteratively generates comprehensive factors through contextual sentence generation and factor harvesting, then organizes them into a two-level hierarchy using clustering and LLM-guided theming. This approach addresses the sparsity problem in existing forward abduction methods.

Contribution

Causal Bayesian Network for modeling latent factor dependencies

The framework introduces a Latent-Augmented Causal Bayesian Network that uses an LLM as a causal discovery engine to identify latent variables and partition the factors among them. This relaxes the strict conditional independence assumption of Naïve Bayes: factors are assumed independent only when conditioned on their latent parent, rather than on the outcome alone.

Contribution

HARBOR framework with context-aware hierarchical mapping and Bayesian orchestration

HARBOR is a three-stage framework combining hierarchical factor-space construction; context-aware mapping through coarse-to-fine retrieval with self-consistent filtering and reflective refinement; and probabilistic inference that aggregates the predictions of the Naïve Bayes and Causal Bayesian Network models with a Linear Opinion Pool.