Bayesian Primitive Distributing for Compositional Zero-shot Learning

ICLR 2026 Conference Submission. Anonymous Authors.
Keywords: Compositional Zero-shot Learning, Probability Distribution, Bayesian Inference
Abstract:

Compositional zero-shot learning (CZSL) aims to recognize unseen attribute-object combinations by learning primitive concepts (i.e., attributes and objects) from seen compositions. Existing CZSL solutions typically harness the power of vision-language models like CLIP via textual prompt tuning and visual adapters. However, they independently learn one deterministic textual prompt for each primitive or compositional label, ignoring both the inherent semantic diversity within each primitive and the semantic relationships between primitive concepts and their compositions. In this paper, we propose BAYECZSL, a novel Bayesian-induced framework that learns probability distributions over each primitive textual prompt. Specifically, BAYECZSL models image-specific primitive textual prompts as learnable probability distributions to capture intra-primitive diversity. Building on these primitive distributions, we aggregate the learned distributions from the attribute and object branches into a compositional prompt space via a Compositional Distribution Synthesis strategy, thus capturing the semantic relationships between primitive concepts and their compositions. Moreover, a Three-path Distribution Enhancement module transforms the initial distributions into more expressive ones via invertible mappings. Finally, these enhanced distributions are sampled to generate diverse textual prompts, achieving more comprehensive coverage of the prompt space and better generalization to unseen compositions. Extensive experiments on multiple CZSL benchmarks demonstrate the superiority of BAYECZSL. Code will be released.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes BAYECZSL, a Bayesian framework that models primitive textual prompts (attributes and objects) as probability distributions rather than deterministic embeddings. According to the taxonomy, this work resides in the 'Bayesian Distribution Learning for Primitives' leaf under 'Probabilistic Primitive Prompt Modeling'. Notably, this leaf contains only the original paper itself—no sibling papers are listed—suggesting this specific Bayesian approach to primitive prompt distributions represents a relatively sparse research direction within the broader compositional zero-shot learning landscape.

The taxonomy reveals three main branches: Probabilistic Primitive Prompt Modeling (where this paper sits), Adaptive Prompt Generation and Disentanglement, and Cross-Domain Zero-Shot Anomaly Detection. The neighboring 'Primitive Relation Probabilistic Modeling' leaf contains one paper exploring dependencies between primitives, while 'Synergetic Disentanglement Query Prompting' and 'Language-Informed Distribution Prompting' each contain one paper focusing on dynamic prompt construction and linguistic priors respectively. The original paper's Bayesian stance on primitive distributions appears distinct from these alternative approaches to compositional reasoning, though all share the goal of improving generalization to unseen attribute-object pairs.

Among 27 candidates examined, the Bayesian framework contribution shows one refutable candidate out of 10 examined, while the Compositional Distribution Synthesis mechanism also has one refutable candidate among 7 examined. The Three-path Distribution Enhancement module appears more novel, with zero refutable candidates among 10 examined. These statistics suggest that while the core Bayesian modeling and compositional synthesis ideas have some precedent in the limited search scope, the specific enhancement mechanism may represent a more distinctive contribution. The relatively small candidate pool (27 total) means these findings reflect top semantic matches rather than exhaustive coverage.

Given the limited search scope of 27 candidates and the sparse taxonomy leaf (no siblings), the work appears to occupy a relatively unexplored niche within compositional zero-shot learning. The Bayesian approach to primitive prompt distributions shows some overlap with prior work, but the specific combination of contributions—particularly the enhancement module—may offer incremental advances. A broader literature search would be needed to definitively assess novelty beyond these top semantic matches.

Taxonomy

Core-task taxonomy papers: 4
Claimed contributions: 3
Contribution candidate papers compared: 27
Refutable papers: 2

Research Landscape Overview

Core task: Compositional zero-shot learning with probabilistic primitive prompt distributions. This field addresses the challenge of recognizing novel attribute-object compositions by modeling primitives (attributes and objects) as probabilistic distributions rather than fixed embeddings. The taxonomy reveals three main branches: Probabilistic Primitive Prompt Modeling focuses on learning distributional representations of primitives to capture inherent uncertainty and variability; Adaptive Prompt Generation and Disentanglement emphasizes dynamic prompt construction and separating attribute-object information; and Cross-Domain Zero-Shot Anomaly Detection extends compositional reasoning to out-of-distribution scenarios. Works like Language-Informed Distribution[1] and Learning Primitive Relations[2] exemplify how probabilistic modeling can leverage linguistic priors and relational structures to improve compositional generalization.

Within Probabilistic Primitive Prompt Modeling, a particularly active line explores Bayesian approaches to distribution learning. Bayesian Primitive Distributing[0] sits squarely in this cluster, employing Bayesian frameworks to model primitive distributions and capture uncertainty in compositional embeddings. This contrasts with methods like Bayesian Prompt Flow[3], which also adopts Bayesian principles but may emphasize flow-based generative modeling for prompt construction. Meanwhile, SPDQ[4] represents an alternative direction within the same branch, potentially focusing on quantization or discrete representations of probabilistic primitives.

The central tension across these works involves balancing expressive distributional modeling with computational efficiency and interpretability, while the original paper's Bayesian stance offers a principled way to quantify uncertainty in unseen compositions.

Claimed Contributions

Bayesian-induced framework for learning probability distributions over primitive textual prompts

The authors introduce a Bayesian framework that models attribute and object textual prompts as probability distributions rather than single deterministic prompts. This approach captures intra-primitive diversity and semantic uncertainty, reducing overfitting to seen compositions and improving generalization to unseen attribute-object combinations.

10 retrieved papers (verdict: can refute)
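To make the first contribution concrete, the following is a minimal sketch of what "modeling a primitive prompt as a probability distribution" could look like. All names, token counts, and dimensions here are illustrative assumptions, not the paper's actual code; the paper presumably operates on CLIP-sized embeddings with learned parameters.

```python
import numpy as np

class GaussianPromptPrior:
    """Hypothetical sketch (not BAYECZSL's implementation): represent one
    primitive's textual prompt as a diagonal Gaussian over prompt-token
    embeddings rather than a single deterministic vector."""

    def __init__(self, num_tokens=4, dim=16, seed=0):
        rng = np.random.default_rng(seed)
        # In a real system mu and log_var would be learnable parameters.
        self.mu = rng.normal(scale=0.02, size=(num_tokens, dim))
        self.log_var = np.zeros((num_tokens, dim))
        self.rng = rng

    def sample(self, num_samples=8):
        # Reparameterization: sample = mu + sigma * eps, so gradients
        # could flow through mu and log_var during training.
        std = np.exp(0.5 * self.log_var)
        eps = self.rng.standard_normal((num_samples,) + self.mu.shape)
        return self.mu + std * eps  # shape: (num_samples, num_tokens, dim)

prompts = GaussianPromptPrior().sample(num_samples=8)
print(prompts.shape)  # (8, 4, 16)
```

Sampling several prompts per primitive, rather than tuning one fixed prompt, is what gives the claimed coverage of intra-primitive diversity.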
Compositional Distribution Synthesis mechanism

The authors propose a mechanism that combines learned probability distributions from attribute and object branches into a unified compositional prompt space. This captures semantic relationships between primitive concepts and their compositions, addressing the limitation of treating prompts independently.

7 retrieved papers (verdict: can refute)
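The synthesis rule the paper uses is not specified in this report. As one plausible instantiation under a Gaussian assumption, attribute and object distributions could be fused with a product-of-experts rule (precision-weighted averaging); the function below is a hypothetical sketch, not the paper's actual mechanism.

```python
import numpy as np

def compose_gaussians(mu_a, var_a, mu_o, var_o):
    """Hypothetical sketch: fuse an attribute Gaussian and an object
    Gaussian into one compositional Gaussian via a product of experts.
    High-precision (low-variance) branches dominate the fused mean."""
    prec_a, prec_o = 1.0 / var_a, 1.0 / var_o
    var_c = 1.0 / (prec_a + prec_o)                  # fused variance
    mu_c = var_c * (prec_a * mu_a + prec_o * mu_o)   # precision-weighted mean
    return mu_c, var_c

# Toy example: equally confident branches average their means.
mu_attr, var_attr = np.zeros(4), np.ones(4)
mu_obj, var_obj = np.ones(4), np.ones(4)
mu_c, var_c = compose_gaussians(mu_attr, var_attr, mu_obj, var_obj)
print(mu_c, var_c)  # means 0.5, variances 0.5
```

Any rule of this shape ties the compositional prompt space directly to the primitive distributions, which is the stated point of the contribution.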
Three-path Distribution Enhancement module

The authors develop a module that transforms simple initial probability distributions into more flexible and expressive distributions through invertible mappings. This enables better approximation of complex prompt distributions and facilitates diverse prompt sampling for comprehensive intra-primitive modeling.

10 retrieved papers (verdict: no refutable candidates found)
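"Invertible mappings" that enrich a simple base distribution are characteristic of normalizing flows. The single affine flow step below is a minimal sketch of one such mapping, showing the invertibility and tractable log-determinant such transforms provide; the paper's actual three-path architecture is not reproduced here.

```python
import numpy as np

class AffineFlowStep:
    """Hypothetical sketch: one invertible affine transform, in the spirit
    of normalizing flows, that reshapes samples from a simple base
    distribution into a more expressive one."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.s = rng.normal(scale=0.1, size=dim)  # log-scale parameters
        self.t = rng.normal(scale=0.1, size=dim)  # shift parameters

    def forward(self, x):
        # y = exp(s) * x + t; log|det J| = sum(s) keeps exact densities
        # computable after the transform.
        return np.exp(self.s) * x + self.t, np.sum(self.s)

    def inverse(self, y):
        # Exact inverse: x = (y - t) * exp(-s).
        return (y - self.t) * np.exp(-self.s)

flow = AffineFlowStep(dim=8)
x = np.random.default_rng(1).standard_normal(8)
y, logdet = flow.forward(x)
assert np.allclose(flow.inverse(y), x)  # invertibility check
```

Stacking such steps (the "three paths" in the paper's terminology, whose exact design is unknown here) would let a simple Gaussian prior approximate a more complex prompt distribution while remaining sampleable.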

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current TopK core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape, it appears structurally isolated, which is one partial signal of novelty, but still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Bayesian-induced framework for learning probability distributions over primitive textual prompts

The authors introduce a Bayesian framework that models attribute and object textual prompts as probability distributions rather than single deterministic prompts. This approach captures intra-primitive diversity and semantic uncertainty, reducing overfitting to seen compositions and improving generalization to unseen attribute-object combinations.

Contribution

Compositional Distribution Synthesis mechanism

The authors propose a mechanism that combines learned probability distributions from attribute and object branches into a unified compositional prompt space. This captures semantic relationships between primitive concepts and their compositions, addressing the limitation of treating prompts independently.

Contribution

Three-path Distribution Enhancement module

The authors develop a module that transforms simple initial probability distributions into more flexible and expressive distributions through invertible mappings. This enables better approximation of complex prompt distributions and facilitates diverse prompt sampling for comprehensive intra-primitive modeling.