GNN Explanations That Do Not Explain and How to Find Them
Overview
Overall Novelty Assessment
The paper identifies a critical failure mode of self-explainable GNNs (SE-GNNs): the explanations they produce can be entirely unrelated to the model's actual inference process. It proposes a novel faithfulness metric, the Extension Sufficiency Test (EST), to detect such degenerate cases. The paper resides in the Faithfulness Metric Development leaf alongside two sibling papers: one evaluating explainability for graph neural networks and another assessing attribution methods. This leaf contains only three papers in total, suggesting a relatively sparse but focused research direction within the broader faithfulness-evaluation landscape.
The taxonomy reveals that faithfulness evaluation comprises three distinct leaves: metric development, comparative evaluation studies, and ground-truth benchmark design. The paper's focus on developing a new metric positions it within the first category, while its empirical analysis of existing metrics' failures connects to comparative evaluation work. Neighboring branches include self-explainable GNN architectures and post-hoc explanation methods, with the paper's critical stance toward self-explainable models bridging these areas. The taxonomy's scope notes clarify that this work differs from empirical benchmarking studies by proposing a novel metric rather than merely comparing existing approaches.
Among eighteen candidates examined across three contributions, none were found to clearly refute the paper's claims. The first contribution (identifying the failure case) examined five candidates with zero refutations; the second (EST metric) examined three with zero refutations; the third (benchmark design) examined ten with zero refutations. This suggests that within the limited search scope—focused on top semantic matches and citation expansion—the specific combination of detecting degenerate explanations and proposing EST appears relatively unexplored. The benchmark contribution examined the largest candidate pool, yet still found no overlapping prior work.
Based on the limited literature search of eighteen candidates, the work appears to occupy a distinct position within faithfulness evaluation. The taxonomy structure indicates this is an active area with critical examination of self-explainable models, yet the specific focus on degenerate explanations and the EST metric shows no clear precedent among examined papers. However, the search scope does not cover the entire field, and the sparse leaf population suggests this direction may benefit from broader contextualization as the area develops.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors identify and characterize a fundamental failure mode where self-explainable GNNs can produce explanations that are completely unrelated to the model's actual decision-making process, despite achieving optimal predictive performance. They provide theoretical conditions under which this occurs and demonstrate it empirically.
The authors propose the Extension Sufficiency Test (EST), a new metric for evaluating explanation faithfulness that holistically considers all supergraphs of an explanation. EST is shown to be more robust than existing metrics at detecting unfaithful explanations in both malicious and natural settings.
The authors introduce a controlled benchmark that evaluates faithfulness metrics based on their ability to reject known-unfaithful explanations, using manipulated SE-GNNs that are trained to output degenerate explanations while maintaining high accuracy.
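To make the EST idea above concrete, here is a minimal sketch of an extension-sufficiency-style check on a toy model and graph. The triangle-motif "model", the exhaustive enumeration, and the exact scoring rule are illustrative assumptions, not the paper's definition of EST:

```python
from itertools import combinations

# Toy stand-in for a trained GNN: predicts 1 iff the edge set
# contains the triangle motif {(0,1),(1,2),(0,2)}.
TRIANGLE = {(0, 1), (1, 2), (0, 2)}

def model(edges):
    return int(TRIANGLE <= set(edges))

def est_score(graph_edges, explanation_edges, model):
    """Fraction of supergraphs of the explanation (within the full
    graph) on which the model's prediction agrees with its prediction
    on the full graph. Exhaustive, so only viable on tiny graphs."""
    graph_edges, explanation_edges = set(graph_edges), set(explanation_edges)
    rest = sorted(graph_edges - explanation_edges)
    target = model(graph_edges)
    total = agree = 0
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            total += 1
            agree += model(explanation_edges | set(extra)) == target
    return agree / total

full = {(0, 1), (1, 2), (0, 2), (2, 3)}
faithful = {(0, 1), (1, 2), (0, 2)}   # the motif the model actually uses
degenerate = {(2, 3)}                 # a prediction-irrelevant edge

print(est_score(full, faithful, model))    # 1.0: every supergraph keeps the prediction
print(est_score(full, degenerate, model))  # 0.125: most supergraphs lose the motif
```

The point of ranging over all supergraphs rather than a single subgraph is visible here: the degenerate explanation looks sufficient only when the rest of the graph happens to be present, and the holistic score exposes that.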
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[1] Evaluating explainability for graph neural networks
[10] Reconsidering Faithfulness in Regular, Self-Explainable and Domain Invariant GNNs
Contribution Analysis
Detailed comparisons for each claimed contribution
Identification of critical failure case in SE-GNN explanations
The authors identify and characterize a fundamental failure mode where self-explainable GNNs can produce explanations that are completely unrelated to the model's actual decision-making process, despite achieving optimal predictive performance. They provide theoretical conditions under which this occurs and demonstrate it empirically.
[25] Towards multi-grained explainability for graph neural networks
[30] Graph-guided textual explanation generation framework
[31] Conversational Graph-LLM Reasoning for Interactive Preference Modeling and Explainable Recommendation
[32] TopInG: Topologically Interpretable Graph Learning via Persistent Rationale Filtration
[33] Adversarial cooperative rationalization: The risk of spurious correlations in even clean datasets
Novel faithfulness metric EST
The authors propose the Extension Sufficiency Test (EST), a new metric for evaluating explanation faithfulness that holistically considers all supergraphs of an explanation. EST is shown to be more robust than existing metrics at detecting unfaithful explanations in both malicious and natural settings.
[10] Reconsidering Faithfulness in Regular, Self-Explainable and Domain Invariant GNNs
[34] On glocal explainability of graph neural networks
[35] Predicting polyester Performance of powder coating material using 3D graph network
Benchmark for evaluating faithfulness metrics
The authors introduce a controlled benchmark that evaluates faithfulness metrics based on their ability to reject known-unfaithful explanations, using manipulated SE-GNNs that are trained to output degenerate explanations while maintaining high accuracy.
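The benchmark's scoring logic can be sketched in a few lines: a faithfulness metric is judged by how often it rejects explanations that are unfaithful by construction (e.g. emitted by a manipulated SE-GNN). The threshold, the metric interface, and the toy scores below are illustrative assumptions, not the paper's protocol:

```python
def rejection_rate(metric, known_unfaithful, threshold=0.5):
    """Fraction of known-unfaithful explanations the metric rejects."""
    rejected = sum(metric(expl) < threshold for expl in known_unfaithful)
    return rejected / len(known_unfaithful)

# Toy explanations carrying only the score a metric would assign them.
known_unfaithful = [{"metric_score": s} for s in (0.9, 0.2, 0.1, 0.8)]

fooled_metric = lambda e: e["metric_score"]  # accepts half of the degenerate cases
robust_metric = lambda e: 0.0                # rejects every degenerate case

print(rejection_rate(fooled_metric, known_unfaithful))  # 0.5
print(rejection_rate(robust_metric, known_unfaithful))  # 1.0
```

Because every benchmark explanation is unfaithful by construction, a higher rejection rate directly measures a metric's robustness, which is how the comparison between EST and existing metrics can be read.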