Flock: A Knowledge Graph Foundation Model via Learning on Random Walks

ICLR 2026 Conference Submission (Anonymous Authors)
Keywords: knowledge graphs, link prediction, knowledge graph foundation models, invariance, equivariance, random walks
Abstract:

We study the problem of zero-shot link prediction on knowledge graphs (KGs), which requires models to generalize to novel entities and novel relations. Knowledge graph foundation models (KGFMs) address this task by enforcing equivariance over both nodes and relations, learning from structural properties of nodes and relations that are then transferable to novel graphs with similar structural properties. However, the conventional notion of deterministic equivariance imposes inherent limits on the expressive power of KGFMs, preventing them from distinguishing structurally similar but semantically distinct relations. To overcome this limitation, we introduce probabilistic node-relation equivariance, which preserves equivariance in distribution while incorporating principled randomization to break symmetries during inference. Building on this principle, we present Flock, a KGFM that iteratively samples random walks, encodes them into sequences via a recording protocol, embeds them with a sequence model, and aggregates representations of nodes and relations via learned pooling. Crucially, Flock respects probabilistic node-relation equivariance and is a universal approximator for isomorphism-invariant link-level functions over KGs. Empirically, Flock perfectly solves our new diagnostic dataset Petals, where current KGFMs fail, and achieves state-of-the-art performance on entity and relation prediction tasks on 54 KGs from diverse domains.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces Flock, a knowledge graph foundation model addressing zero-shot link prediction through probabilistic node-relation equivariance. It resides in the 'Foundation Models and Universal Representations' leaf, which contains only four papers total, including this work. This represents a relatively sparse research direction within the broader taxonomy of fifty papers across thirty-six topics, suggesting the foundation model approach to knowledge graph reasoning remains an emerging area compared to more established branches like meta-learning or description-based methods.

The taxonomy reveals that neighboring research directions pursue alternative strategies for zero-shot generalization. The 'Zero-Shot Relational Learning Methods' branch explores GAN-based frameworks, textual descriptions, and LLM integration across twelve papers, while 'Few-Shot Link Prediction Methods' emphasizes meta-learning and subgraph reasoning with eleven papers. The 'Inductive Link Prediction Methods' branch focuses on GNN-based generalization to unseen entities. Flock's foundation model approach diverges by learning universal representations transferable across arbitrary graphs, rather than relation-specific semantic features or episodic adaptation protocols.

Among twelve candidates examined through a limited semantic search, no contributions were clearly refuted by prior work. For the probabilistic equivariance principle, four candidates were examined with zero refutations; for the Flock architecture, one candidate with zero refutations; and for the Petals diagnostic dataset, seven candidates with zero refutations. This suggests that, within the examined scope, the core technical innovations, particularly the probabilistic relaxation of deterministic equivariance and the random-walk-based encoding protocol, appear distinct from existing approaches in the foundation model space.

The analysis reflects a constrained literature search rather than exhaustive coverage. The sparse population of the foundation model leaf and absence of refutations among twelve candidates indicate potential novelty, though the limited search scope prevents definitive claims about uniqueness. The taxonomy structure suggests Flock occupies a less crowded niche compared to meta-learning or description-based methods, but comprehensive assessment would require broader examination of recent foundation model developments beyond the top-K semantic matches analyzed here.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 12
Refutable Papers: 0

Research Landscape Overview

Core task: zero-shot link prediction on knowledge graphs. The field addresses the challenge of predicting missing links for relations or entities unseen during training, a setting that requires models to generalize beyond transductive scenarios. The taxonomy reveals several complementary research directions. 'Zero-Shot Relational Learning Methods' and 'Few-Shot Link Prediction Methods' focus on learning from limited or no examples of target relations, often employing meta-learning or metric-based approaches such as Meta-graph[19] and Few-shot Completion[15]. 'Foundation Models and Universal Representations' explores leveraging large-scale pretrained models and language-based encodings to achieve broad generalization, as seen in Foundation Models Reasoning[12] and TRIX[2]. 'Inductive Link Prediction Methods' tackles reasoning over entirely new entities, while 'Specialized Link Prediction Tasks' and 'Logical Query Answering' address domain-specific constraints and complex multi-hop reasoning. Surveys and benchmarks, including Zero-shot Few-shot Survey[7] and Beyond Transduction Survey[11], provide structured overviews of progress and open challenges across these branches.

Recent work has increasingly turned to foundation models and universal representations to bridge the gap between symbolic knowledge graphs and neural language understanding. A central tension lies in balancing the expressiveness of large language models with the structured reasoning capabilities of graph-based methods. Flock[0] situates itself within this foundation-models branch, emphasizing universal representations that generalize across diverse knowledge graph schemas without task-specific fine-tuning. This contrasts with approaches like TRIX[2], which also leverages foundation models but may focus more on retrieval-augmented strategies, and SEMMA[18], which explores semantic matching mechanisms.

Meanwhile, Zero-shot Link Prediction LLMs[1] and zrllm[22] investigate how to directly prompt or adapt language models for zero-shot scenarios, highlighting ongoing debates about whether symbolic graph structure or textual context provides stronger inductive biases for unseen relations.

Claimed Contributions

Probabilistic node-relation equivariance principle

The authors introduce a relaxed notion of equivariance for knowledge graph foundation models that preserves equivariance in distribution rather than deterministically. This allows models to distinguish structurally similar but semantically distinct relations while maintaining the inductive bias needed for generalization across different knowledge graphs.

4 retrieved papers
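The distinction between equivariance in distribution and pointwise equivariance can be made concrete with a minimal toy sketch: if node features are drawn i.i.d. without reference to node identity, any relabeling of the nodes leaves the output distribution unchanged, even though each individual sample breaks the symmetry. This is an illustrative analogue only, not the paper's construction.

```python
import random

def randomized_node_features(nodes, dim=4, rng=None):
    # Draw fresh i.i.d. Gaussian features for each node on every call.
    # The sampling ignores node identity, so for any permutation of the
    # nodes the *distribution* of outputs is unchanged (equivariance in
    # distribution), while any two individual samples almost surely
    # differ (symmetry breaking at inference time).
    rng = rng or random.Random()
    return {v: [rng.gauss(0.0, 1.0) for _ in range(dim)] for v in nodes}

nodes = ["a", "b", "c"]
sample1 = randomized_node_features(nodes, rng=random.Random(0))
sample2 = randomized_node_features(nodes, rng=random.Random(1))
# sample1 and sample2 disagree pointwise, yet follow the same law.
```

A deterministic model cannot do both at once: it either respects the symmetry exactly (and cannot separate symmetric relations) or violates equivariance outright.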
Flock architecture and framework

The authors present Flock, a knowledge graph foundation model that iteratively samples random walks, encodes them into sequences via a recording protocol, embeds them with a sequence model, and aggregates node and relation representations via learned pooling. The architecture avoids message passing entirely and is proven to be a universal approximator for isomorphism-invariant link-level functions over knowledge graphs.

1 retrieved paper
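The walk-sample-and-record step of such a pipeline can be sketched as follows. The key property of a recording protocol is that nodes and relations are recorded by first-occurrence integer IDs, making the recorded sequence independent of the original labels; function names and the toy triples here are hypothetical, not the paper's implementation.

```python
import random

def record_walk(triples, start, length, rng=None):
    # Sample a random walk over (head, relation, tail) triples and record
    # it with first-occurrence integer IDs. The recording depends only on
    # the *pattern* of repeats, not on the original node/relation labels.
    rng = rng or random.Random()
    adj = {}
    for h, r, t in triples:
        adj.setdefault(h, []).append((r, t))
        adj.setdefault(t, []).append((r, h))  # walk edges in both directions
    node_ids, rel_ids, seq = {}, {}, []
    cur = start
    seq.append(node_ids.setdefault(cur, len(node_ids)))
    for _ in range(length):
        if not adj.get(cur):
            break
        r, nxt = rng.choice(adj[cur])
        # Node and relation IDs live in separate counters; positions in
        # the sequence alternate node, relation, node, ...
        seq.append(rel_ids.setdefault(r, len(rel_ids)))
        seq.append(node_ids.setdefault(nxt, len(node_ids)))
        cur = nxt
    return seq

triples = [("alice", "knows", "bob"), ("bob", "likes", "carol")]
walk = record_walk(triples, "alice", 3, rng=random.Random(0))
```

Because the recording is label-independent, renaming every entity and relation yields the identical recorded sequence under the same random seed, which is exactly the property a sequence model can then consume.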
Petals diagnostic dataset

The authors construct a synthetic benchmark dataset called Petals, specifically designed to test whether knowledge graph foundation models can distinguish structurally similar but semantically distinct relations. The dataset validates that Flock solves cases where existing deterministically equivariant models fail.

7 retrieved papers
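The failure mode such a diagnostic targets can be reproduced with a toy construction: two relations arranged symmetrically around a hub, so that swapping them yields an isomorphic graph. Any deterministically relation-equivariant model must then embed the two relations identically, while a symmetry-breaking model can tell them apart. This is an illustrative analogue, not the actual Petals data.

```python
def make_petal_graph(num_petals=3):
    # A hub with num_petals "petals" per relation: r1-edges to u-nodes
    # and r2-edges to v-nodes. The two relations are structurally
    # interchangeable even though they are semantically distinct.
    triples = []
    for i in range(num_petals):
        triples.append(("hub", "r1", f"u{i}"))
        triples.append(("hub", "r2", f"v{i}"))
    return triples

def swap_and_rename(triples):
    # Swap r1 <-> r2 and rename u_i <-> v_i: the combined map sends the
    # triple set onto itself, i.e. the swap extends to an isomorphism.
    swap = {"r1": "r2", "r2": "r1"}
    def ren(n):
        if n[0] == "u":
            return "v" + n[1:]
        if n[0] == "v":
            return "u" + n[1:]
        return n
    return {(ren(h), swap.get(r, r), ren(t)) for h, r, t in triples}

triples = make_petal_graph()
assert swap_and_rename(triples) == set(triples)  # the swap is an isomorphism
```

Since the isomorphism check passes, a model whose outputs commute exactly with relation permutations assigns r1 and r2 the same representation, which is precisely the limitation the diagnostic exposes.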

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: Probabilistic node-relation equivariance principle

Contribution: Flock architecture and framework

Contribution: Petals diagnostic dataset