Unifying Complexity-Theoretic Perspectives on Provable Explanations

ICLR 2026 Conference Submission. Anonymous Authors.
Keywords: explainability, interpretability
Abstract:

Previous work has explored the computational complexity of deriving two fundamental types of explanations for ML model predictions: (1) sufficient reasons, which are subsets of input features that, when fixed, determine a prediction, and (2) contrastive reasons, which are subsets of input features that, when modified, alter a prediction. Prior studies have examined these explanations in different contexts, such as non-probabilistic versus probabilistic frameworks and local versus global settings. In this study, we introduce a unified framework for analyzing these explanations, demonstrating that they can all be characterized through the minimization of a unified probabilistic value function. We then prove that the complexity of these computations is influenced by three key properties of the value function: (1) monotonicity, (2) submodularity, and (3) supermodularity. Our findings uncover some counterintuitive results regarding the nature of these properties within the explanation settings examined. For instance, although the local value functions exhibit neither monotonicity nor submodularity/supermodularity, we demonstrate that the global value functions do possess these properties. This distinction enables us to prove a series of novel polynomial-time results for computing various explanations with provable guarantees in the global explainability setting, across a range of ML models that span the interpretability spectrum, such as neural networks, decision trees, and tree ensembles. In contrast, we show that even highly simplified versions of these explanations become NP-hard to compute in the corresponding local explainability setting.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a unified probabilistic framework for analyzing both sufficient and contrastive explanations across local and global settings, characterizing them through value-function minimization. It resides in the 'Unified Frameworks for Sufficient and Contrastive Explanations' leaf, which contains only two papers in total. This represents a relatively sparse research direction within the broader taxonomy of twenty papers, suggesting the work addresses a gap in theoretical unification that few prior studies have tackled systematically.

The taxonomy reveals that most related work splits into separate branches: type-specific complexity analysis for contrastive explanations, hardness results for in-distribution interpretation, and efficient algorithmic approaches. The paper's leaf sits under 'Theoretical Complexity Analysis of Explanation Computation,' adjacent to leaves examining contrastive-specific complexity and interpretation hardness. By bridging sufficient and contrastive paradigms rather than treating them separately, this work diverges from the field's tendency toward specialized analysis, positioning itself at the intersection of multiple established directions.

Among the twenty-one candidates examined, no contribution was clearly refuted by prior work. For the unified-framework contribution, one candidate was examined, with no refutation found. For the structural properties distinguishing global from local value functions, ten candidates were examined, none refuting the claims. The polynomial-time algorithms and approximation guarantees were likewise compared against ten candidates without refutation. This suggests that, within the limited search scope, the specific combination of unified analysis, structural property discovery, and algorithmic results appears relatively unexplored, though the small candidate pool limits definitive conclusions about field-wide novelty.

Based on top-twenty-one semantic matches, the work appears to occupy a distinct position combining theoretical unification with algorithmic contributions. The sparse population of its taxonomy leaf and absence of refuting candidates suggest novelty, though the limited search scope means potentially relevant work outside these candidates remains unexamined. The analysis covers semantic neighbors and citation-expanded papers but does not constitute an exhaustive literature review.

Taxonomy

Core-task Taxonomy Papers: 20
Claimed Contributions: 3
Contribution Candidate Papers Compared: 21
Refutable Papers: 0

Research Landscape Overview

Core task: understanding the computational complexity of deriving sufficient and contrastive explanations for machine learning models. The field organizes around four main branches. Theoretical Complexity Analysis examines the fundamental hardness of computing explanations, establishing lower bounds and complexity classes for various explanation types. Efficient Algorithms for Explanation Generation develops practical methods that balance computational cost with explanation quality, often leveraging heuristics or approximations. Application-Driven Explanation Methods tailor explanation techniques to specific domains such as healthcare, recommendations, or reinforcement learning, where domain constraints shape both the form and feasibility of explanations. Foundational Concepts and Human-Centered Explanation addresses the conceptual underpinnings of what makes an explanation sufficient or contrastive, and how these formal notions align with human interpretability needs.

Works like Contrastive Explanation[1] and Necessary Sufficient Explanations[6] illustrate how different branches intersect, combining formal definitions with algorithmic considerations. A particularly active line of work explores the tension between expressiveness and tractability in explanation generation. Some studies focus on contrastive methods that highlight minimal differences between predictions, as seen in Efficient Contrastive Explanations[4] and Minimally Contrastive Counterfactuals[7], while others emphasize sufficient explanations that capture all relevant features.

Unifying Provable Explanations[0] sits within the Theoretical Complexity Analysis branch, specifically addressing unified frameworks that bridge sufficient and contrastive notions. Its emphasis on provable guarantees and complexity characterization contrasts with more heuristic approaches in the algorithmic branch, yet complements neighboring work like Necessary Sufficient Explanations[6], which also examines the interplay between these two explanation paradigms. Recent efforts such as Hard to Explain[9] and Feature Relevancy Complexity[18] further probe the inherent difficulty of explanation tasks, underscoring ongoing questions about when explanations can be computed efficiently and when fundamental barriers arise.

Claimed Contributions

Unified framework for analyzing explanations via value function minimization

The authors propose a unified framework that characterizes various explanation types (sufficient, contrastive, probabilistic, non-probabilistic, local, and global) as minimization tasks over a single probabilistic value function. They identify three key properties of this function—monotonicity, submodularity, and supermodularity—that influence computational complexity.

1 retrieved paper
Discovery of structural properties distinguishing global from local value functions

The authors prove that global value functions exhibit monotonicity, submodularity (for contrastive reasons), and supermodularity (for sufficient reasons), whereas local value functions lack these properties. This structural difference enables new tractability results for global explanations.

10 retrieved papers
Novel polynomial-time algorithms and approximation guarantees for global explanations

The authors establish that subset-minimal global explanations can be computed in polynomial time for models including neural networks, decision trees, and tree ensembles under empirical distributions. They also provide constant-factor approximation algorithms for cardinally minimal global explanations, contrasting sharply with NP-hardness and inapproximability results in the local setting.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Unified framework for analyzing explanations via value function minimization

The authors propose a unified framework that characterizes various explanation types (sufficient, contrastive, probabilistic, non-probabilistic, local, and global) as minimization tasks over a single probabilistic value function. They identify three key properties of this function—monotonicity, submodularity, and supermodularity—that influence computational complexity.
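To make the framework concrete, the following is a minimal Python sketch, under our own simplifying assumptions, of how an explanation can be framed as minimization over a value function: a value function maps a feature subset to a score, and an explanation is a smallest subset whose value meets a threshold. The `toy_value` function (a local sufficient-reason score for an AND model) and the brute-force search are illustrative inventions, not the paper's construction.

```python
from itertools import combinations

def minimal_explanation(value_fn, n_features, threshold=1.0):
    """Brute-force search for a cardinally minimal explanation:
    the smallest feature subset S with value_fn(S) >= threshold.
    Exponential in n_features; for illustration only."""
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            if value_fn(frozenset(subset)) >= threshold:
                return frozenset(subset)
    return None

# Toy local value function for f(x) = AND(x1, x2) at x = (1, 1, 0):
# v(S) = 1 iff fixing the features in S forces the prediction to 1,
# which requires S to contain both feature 0 and feature 1.
def toy_value(S):
    return 1.0 if {0, 1} <= S else 0.0

print(sorted(minimal_explanation(toy_value, 3)))  # [0, 1]
```

A probabilistic variant would replace the 0/1 score with a probability over an input distribution, which is the unified view the paper describes.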

Contribution

Discovery of structural properties distinguishing global from local value functions

The authors prove that global value functions exhibit monotonicity, submodularity (for contrastive reasons), and supermodularity (for sufficient reasons), whereas local value functions lack these properties. This structural difference enables new tractability results for global explanations.
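Submodularity is the diminishing-returns property that such tractability results typically exploit. As a hedged illustration (the example value functions below are our own toys, not the paper's global value functions), one can check the property exhaustively on a small ground set:

```python
from itertools import combinations

def is_submodular(value_fn, ground_set):
    """Numerically check submodularity (diminishing returns):
    v(S + i) - v(S) >= v(T + i) - v(T) for all S subset of T, i not in T.
    Exponential check; only feasible for small ground sets."""
    subsets = [frozenset(c) for r in range(len(ground_set) + 1)
               for c in combinations(sorted(ground_set), r)]
    for S in subsets:
        for T in subsets:
            if not S <= T:
                continue
            for i in ground_set - T:
                gain_S = value_fn(S | {i}) - value_fn(S)
                gain_T = value_fn(T | {i}) - value_fn(T)
                if gain_S < gain_T - 1e-12:
                    return False
    return True

ground = frozenset({0, 1, 2})
# A concave function of |S| (coverage-like) is submodular;
# a convex function of |S| such as |S|^2 is supermodular instead.
print(is_submodular(lambda S: min(len(S), 2), ground))  # True
print(is_submodular(lambda S: len(S) ** 2, ground))     # False
```

The paper's claimed dichotomy is that global value functions pass checks of this kind by construction, while local ones can fail all of them.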

Contribution

Novel polynomial-time algorithms and approximation guarantees for global explanations

The authors establish that subset-minimal global explanations can be computed in polynomial time for models including neural networks, decision trees, and tree ensembles under empirical distributions. They also provide constant-factor approximation algorithms for cardinally minimal global explanations, contrasting sharply with NP-hardness and inapproximability results in the local setting.
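For monotone submodular value functions, the classic greedy algorithm for submodular cover (Wolsey-style) gives a logarithmic-factor approximation of the smallest covering set, which is the standard route to constant- or log-factor guarantees of the kind described above. The sketch below is a generic greedy cover, not the paper's algorithm; `flip_sets` and `coverage` are a hypothetical toy standing in for a global contrastive-style value (the fraction of a small dataset whose prediction flips when the features in S are modified).

```python
def greedy_cover(value_fn, ground_set, target):
    """Greedy submodular cover: repeatedly add the feature with the
    largest marginal gain until value_fn reaches the target."""
    S = frozenset()
    while value_fn(S) < target:
        remaining = ground_set - S
        if not remaining:
            return None  # target unreachable
        best = max(remaining, key=lambda i: value_fn(S | {i}) - value_fn(S))
        if value_fn(S | {best}) <= value_fn(S):
            return None  # no further progress possible
        S |= {best}
    return S

# Hypothetical toy: each frozenset lists the features whose modification
# flips one dataset instance; coverage(S) is the fraction flipped by S.
flip_sets = [frozenset({0}), frozenset({0, 1}), frozenset({2})]
def coverage(S):
    return sum(1 for F in flip_sets if F & S) / len(flip_sets)

print(sorted(greedy_cover(coverage, frozenset({0, 1, 2, 3}), 1.0)))  # [0, 2]
```

Coverage functions like this are monotone and submodular, so the greedy set is provably close to the cardinally minimal one; the contrast claimed in the paper is that no such structure is available locally, where the problem is NP-hard and inapproximable.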