Answering Counterfactual Queries on Graph Databases

ICLR 2026 Conference Submission

Anonymous Authors

OpenReview Score: 6.0

Keywords: Counterfactual Analysis, Graph Database

Abstract:

Counterfactual analysis on graph data is central to causal reasoning and interpretability, yet existing graph-based methods rely on ad hoc perturbations and remain tied to model behavior rather than underlying data. To address this challenge, we introduce Counterfactual Graph Database (CF-GDB) queries, the first query-based framework for counterfactual reasoning on graphs that grounds counterfactuals in verifiable database instances. Our approach abstracts graphs into semantically meaningful concepts and compares them using a hypergraph-based distance that integrates local structure with global semantics. To ensure efficiency and scalability, we propose two complementary indices: the Concept Distribution Index (CDI), a histogram that provides certified lower bounds, and the Concept Semantic Index (CSI), a continuous embedding that provides upper bounds. These indices yield provably tight sandwich guarantees and enable efficient candidate pruning while preserving the fidelity of counterfactual retrieval. Using 8 real datasets across 4 domains, CF-GDB improves accuracy by over 20% and achieves up to 20× faster performance, demonstrating both fidelity and scalability.

Disclaimer

This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.

NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.

If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces CF-GDB, a query-based framework for counterfactual reasoning on graph databases that grounds counterfactuals in verifiable database instances rather than model perturbations. It resides in the 'Counterfactual Query Systems and Database Integration' leaf, which contains only five papers total (including this one). This is a relatively sparse research direction within the broader taxonomy of 50 papers, suggesting that database-centric counterfactual query systems remain an emerging area compared to the more crowded GNN explainability branches.

The taxonomy reveals that most counterfactual graph research concentrates on GNN explanation methods (12 papers across instance-level and global explainers) and fairness applications (5 papers). The paper's leaf sits alongside work on what-if databases and counterfactual visualization frameworks, but diverges from neighboring branches focused on causal discovery, LLM-based causal reasoning, and application-specific methods in recommendation or knowledge graphs. The leaf's scope note emphasizes integration with query languages and database systems, distinguishing this work from the theoretical causal frameworks and GNN-centric explanation techniques that dominate sibling categories.

Among 17 candidates examined across three contributions, no refutable prior work was identified: 10 candidates were examined for the CF-GDB framework, 2 for the C2GQ method, and 5 for the dual indexing scheme, each with zero refutations. This suggests that, within the limited search scope (top-K semantic matches and citation expansion), the specific combination of concept-based abstraction, hypergraph distance, and dual indexing for counterfactual graph queries appears distinct from the examined prior work.

Based on the limited literature search of 17 candidates, the work appears to occupy a relatively novel position at the intersection of counterfactual reasoning and database query systems. However, the sparse population of the target leaf (5 papers) and the modest search scope mean this assessment reflects only the examined neighborhood, not an exhaustive survey of all potentially relevant database, graph query, or counterfactual reasoning literature.

Taxonomy

Core-task Taxonomy Papers: 50

Claimed Contributions: 3

Contribution Candidate Papers Compared: 17

Refutable Papers: 0

Research Landscape Overview

Core task: counterfactual reasoning on graph databases. The field encompasses a diverse set of approaches that apply counterfactual and causal thinking to graph-structured data. At the highest level, the taxonomy organizes work into several major branches: methods for explaining graph neural network predictions via counterfactual examples (e.g., Gcfexplainer[4], CF-GNNExplainer[8]), techniques addressing fairness and bias mitigation through counterfactual notions (e.g., Authentic Counterfactuals Fairness[10], Counterfactual Fairness Representation[18]), frameworks for counterfactual learning and data augmentation on graphs (e.g., Counterfactual Learning Graphs Survey[1], GraphCA[27]), general causal inference and reasoning systems (e.g., Causal Inference[3], Deep Causal Graphs[26]), integration of counterfactual queries with database systems (e.g., What If Databases[11], Counterfactual Graph Queries[0]), and application-specific methods spanning domains such as recommendation, anomaly detection, and knowledge graphs (e.g., Causal Knowledge Graph Recommendation[34], Counterfactual Anomaly Detection[14]).

A particularly active line of work focuses on explainability for GNNs, where researchers seek minimal graph edits that flip model predictions, balancing interpretability with robustness (Robust Counterfactual GNN[5], Global Counterfactual GNN[6]). In contrast, the database integration branch explores how counterfactual queries can be formulated and executed over structured graph repositories, enabling users to ask "what if" questions directly within query languages. Counterfactual Graph Queries[0] sits squarely in this latter branch, alongside What If Databases[11] and Counterfactual Visualization Framework[13], emphasizing the operational and system-level challenges of embedding counterfactual reasoning into database workflows.

Compared to neighbors like SIERRA[23] or Causal Hyper[20], which lean toward causal discovery or hypergraph reasoning, the original paper prioritizes query expressiveness and integration with existing database infrastructure, reflecting a systems-oriented perspective on counterfactual reasoning rather than purely algorithmic or fairness-driven concerns.

Claimed Contributions

Counterfactual Graph Database (CF-GDB) framework

10 retrieved papers

The authors propose CF-GDB, a novel framework that reframes counterfactual reasoning as a query problem over graph databases. Unlike prior approaches that generate perturbed graphs to flip model predictions, CF-GDB retrieves dataset-grounded, domain-valid counterfactuals anchored in verifiable instances.
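The query view described above can be illustrated with a minimal sketch: a counterfactual is not synthesized by perturbing the input but looked up as the closest stored instance with a different label. Everything named here (`GraphRecord`, `l1_distance`, `counterfactual_query`, the toy feature vectors and labels) is hypothetical and chosen for illustration; the L1 distance is only a stand-in for the paper's hypergraph-based concept distance.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class GraphRecord:
    """Hypothetical database entry: an id, a class label, a feature vector."""
    graph_id: str
    label: str
    features: Tuple[float, ...]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Placeholder distance (assumption); not the paper's concept distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def counterfactual_query(
    query: GraphRecord,
    database: Sequence[GraphRecord],
    k: int = 1,
    distance: Callable[[Sequence[float], Sequence[float]], float] = l1_distance,
) -> List[GraphRecord]:
    """Return the k stored graphs nearest to `query` whose label differs."""
    # Only real database instances with a different label are eligible.
    candidates = [r for r in database if r.label != query.label]
    candidates.sort(key=lambda r: distance(query.features, r.features))
    return candidates[:k]


# Toy data (invented for this example).
db = [
    GraphRecord("g1", "toxic", (3.0, 1.0, 0.0)),
    GraphRecord("g2", "safe", (3.0, 0.0, 0.0)),
    GraphRecord("g3", "safe", (0.0, 0.0, 5.0)),
]
query = GraphRecord("q", "toxic", (3.0, 1.0, 1.0))
nearest = counterfactual_query(query, db, k=1)
```

Because every returned record is an existing database entry, the result can be inspected and verified directly, which is the property the contribution emphasizes.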

Concept-Based Counterfactual Graph Query (C2GQ) method

2 retrieved papers

The authors introduce C2GQ, which abstracts graphs into semantically meaningful concepts serving as prototypes that cluster structurally similar subgraphs. Differences are measured using a hypergraph-based concept distance grounded in unbalanced optimal transport, jointly capturing fine-grained local changes and global distributional shifts.
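For orientation, the standard KL-relaxed form of unbalanced optimal transport between two non-negative concept histograms can be written as below. This is the textbook formulation, given here as an assumption about the general shape of the objective; the symbols $a$, $b$, $C$, and $\tau$ are introduced for illustration, and the paper's actual hypergraph construction, ground cost, and relaxation terms may differ.

```latex
\mathrm{UOT}_{\tau}(a, b)
  = \min_{P \in \mathbb{R}_{+}^{m \times n}}
    \sum_{i=1}^{m} \sum_{j=1}^{n} C_{ij} P_{ij}
    + \tau \, \mathrm{KL}\!\left(P \mathbf{1}_{n} \,\middle\|\, a\right)
    + \tau \, \mathrm{KL}\!\left(P^{\top} \mathbf{1}_{m} \,\middle\|\, b\right),
\qquad
\mathrm{KL}(p \,\|\, q) = \sum_{i} \left( p_{i} \log \frac{p_{i}}{q_{i}} - p_{i} + q_{i} \right)
```

Here $a \in \mathbb{R}_{+}^{m}$ and $b \in \mathbb{R}_{+}^{n}$ are the concept masses of the two graphs, $C_{ij}$ is the cost of matching concept $i$ to concept $j$, and $\tau > 0$ controls how strongly mass creation or destruction is penalized. The transport term reflects fine-grained local changes, while the two marginal penalties reflect global distributional shifts, because the histograms are not required to have equal total mass.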

Dual indexing scheme with certified bounds

5 retrieved papers

The authors propose two complementary indices for scalable counterfactual queries: CDI provides certified lower bounds via histogram-based concept counts, while CSI provides upper bounds through continuous embeddings. These indices yield provably tight sandwich guarantees and enable efficient candidate pruning while preserving retrieval fidelity.
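The pruning role of such bounds can be sketched with a generic filter-and-refine search. The sketch assumes only that each candidate has a cheap lower bound and a cheap upper bound on its exact distance to the query; the function name `sandwich_top_k`, the dictionaries, and the toy numbers are hypothetical and are not taken from the paper, whose CDI and CSI constructions are not reproduced here.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def sandwich_top_k(
    candidates: Iterable[str],
    lower: Dict[str, float],
    upper: Dict[str, float],
    exact: Callable[[str], float],
    k: int,
) -> Tuple[List[str], int]:
    """Top-k search assuming lower[i] <= exact(i) <= upper[i] for every i.

    Returns the k nearest ids and the number of exact distance evaluations.
    """
    ids = list(candidates)
    if k <= 0 or not ids:
        return [], 0
    # tau is the k-th smallest upper bound, so at least k candidates are
    # guaranteed to have an exact distance of at most tau.
    tau = sorted(upper[i] for i in ids)[min(k, len(ids)) - 1]
    # A candidate whose lower bound exceeds tau is strictly worse than those
    # k candidates, so it can be discarded without computing its distance.
    survivors = [i for i in ids if lower[i] <= tau]
    # Refine: compute the expensive exact distance only for the survivors.
    scored = sorted((exact(i), i) for i in survivors)
    return [i for _, i in scored[:k]], len(survivors)


# Toy numbers (invented); each lower/upper pair brackets the exact value.
exact_dist = {"a": 2.0, "b": 5.0, "c": 9.0, "d": 3.0}
lower_bound = {"a": 1.0, "b": 4.0, "c": 7.0, "d": 2.5}
upper_bound = {"a": 3.0, "b": 6.0, "c": 10.0, "d": 3.5}

top, refined = sandwich_top_k(
    exact_dist, lower_bound, upper_bound, exact_dist.__getitem__, k=2
)
```

In this toy run the threshold is 3.5, so candidates "b" and "c" are pruned by their lower bounds and only two exact distances are computed. The tighter the bounds, the fewer candidates survive to the expensive refinement step, while the returned top-k set stays identical to a brute-force scan.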

Core Task Comparisons

Comparisons with papers in the same taxonomy category

[11] What If: Causal Analysis with Graph Databases

Amedeo Pachera, Mattia Palmiotto, Angela Bonifati, Andrea Mauri (2024)

[13] A framework to improve causal inferences from visualizations using counterfactual operators

Arran Zeyu Wang, David Borland, David Gotz (2025)

[20] Causal what-if and how-to analysis using hyper

Fangzhu Shen, Kayvon Heravi, Oscar Gomez, Sainyam Galhotra, Amir Gilad, Sudeepa Roy, Babak Salimi (2023)

[23] SIERRA: A Counterfactual Thinking-based Visual Interface for Property Graph Query Construction

Jiebing Ma, Sourav S. Bhowmick, Lester Tay, Byron Choi (2024)

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution: Counterfactual Graph Database (CF-GDB) framework

[11] What If: Causal Analysis with Graph Databases

Cannot Refute

[35] GRETEL: Graph Counterfactual Explanation Evaluation Framework

Cannot Refute

[53] AGENTICTS: Robust Text-to-SPARQL via Agentic Collaborative Reasoning over Heterogeneous Knowledge Graphs for the Circular Economy

Cannot Refute

[54] Counterfactual fairness with partially known causal graph

Cannot Refute

[55] User-friendly, interactive, and configurable explanations for graph neural networks with graph views

Cannot Refute

[56] Leveraging structured biological knowledge for counterfactual inference: a case study of viral pathogenesis

Cannot Refute

[57] PerfCE: Performance Debugging on Databases with Chaos Engineering-Enhanced Causality Analysis

Cannot Refute

[58] Counterfactual-Based Root Cause Analysis for Misconfigurations in Autonomous Driving Systems

Cannot Refute

[59] Actual causality canvas: a general framework for explanation-based socio-technical constructs

Cannot Refute

[60] Design of an Automated Construction Platform for Advanced Mathematics Content Integrating Knowledge Graph and Generative Artificial Intelligence

Cannot Refute

Contribution: Concept-Based Counterfactual Graph Query (C2GQ) method

[51] One Concept at a Time: Subspace-Constrained Causal Inference for High-Dimensional Treatments

Cannot Refute

[52] Conceptual Graph Counterfactuals

Cannot Refute

Contribution: Dual indexing scheme with certified bounds

[61] Efficient exact subgraph matching via gnn-based path dominance embedding

Cannot Refute

[62] Efficient Exact Subgraph Matching via GNN-based Path Dominance Embedding (Technical Report)

Cannot Refute

[63] Graph homomorphism revisited for graph matching

Cannot Refute

[64] Efficient frequent subtree mining beyond forests

Cannot Refute

[65] HFGNN: Efficient Graph Neural Networks Using Hub-Fringe Structures

Cannot Refute
