Unifying Complexity-Theoretic Perspectives on Provable Explanations

ICLR 2026 Conference Submission. Anonymous Authors.
Keywords: explainability, interpretability
Abstract:

Previous work has explored the computational complexity of deriving two fundamental types of explanations for ML model predictions: (1) sufficient reasons, which are subsets of input features that, when fixed, determine a prediction, and (2) contrastive reasons, which are subsets of input features that, when modified, alter a prediction. Prior studies have examined these explanations in different contexts, such as non-probabilistic versus probabilistic frameworks and local versus global settings. In this study, we introduce a unified framework for analyzing these explanations, demonstrating that they can all be characterized through the minimization of a unified probabilistic value function. We then prove that the complexity of these computations is influenced by three key properties of the value function: (1) monotonicity, (2) submodularity, and (3) supermodularity. Our findings uncover some counterintuitive results regarding the nature of these properties within the explanation settings examined. For instance, although the local value functions exhibit neither monotonicity nor submodularity/supermodularity, we demonstrate that the global value functions do possess these properties. This distinction enables us to prove a series of novel polynomial-time results for computing various explanations with provable guarantees in the global explainability setting, across a range of ML models that span the interpretability spectrum, such as neural networks, decision trees, and tree ensembles. In contrast, we show that even highly simplified versions of these explanations become NP-hard to compute in the corresponding local explainability setting.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper proposes a unified probabilistic framework for analyzing both sufficient and contrastive explanations across local and global settings, characterizing them through value-function minimization. It resides in the 'Unified Frameworks for Sufficient and Contrastive Explanations' leaf, which contains only two papers in total. This represents a relatively sparse research direction within the broader taxonomy of twenty papers, suggesting the work addresses a gap in theoretical unification that few prior studies have tackled systematically.

The taxonomy reveals that most related work splits into separate branches: type-specific complexity analysis for contrastive explanations, hardness results for in-distribution interpretation, and efficient algorithmic approaches. The paper's leaf sits under 'Theoretical Complexity Analysis of Explanation Computation,' adjacent to leaves examining contrastive-specific complexity and interpretation hardness. By bridging sufficient and contrastive paradigms rather than treating them separately, this work diverges from the field's tendency toward specialized analysis, positioning itself at the intersection of multiple established directions.

Among the twenty-one candidates examined, no contribution was clearly refuted by prior work. For the unified-framework contribution, one candidate was examined, with no refutation found. For the structural properties distinguishing global from local value functions, ten candidates were examined, none refuting the claims. The polynomial-time algorithms and approximation guarantees were likewise compared against ten candidates without refutation. This suggests that, within the limited search scope, the specific combination of unified analysis, structural property discovery, and algorithmic results appears relatively unexplored, though the small candidate pool limits definitive conclusions about field-wide novelty.

Based on top-twenty-one semantic matches, the work appears to occupy a distinct position combining theoretical unification with algorithmic contributions. The sparse population of its taxonomy leaf and absence of refuting candidates suggest novelty, though the limited search scope means potentially relevant work outside these candidates remains unexamined. The analysis covers semantic neighbors and citation-expanded papers but does not constitute an exhaustive literature review.

Taxonomy

Core-task Taxonomy Papers: 20
Claimed Contributions: 3
Contribution Candidate Papers Compared: 21
Refutable Papers: 0

Research Landscape Overview

Core task: understanding the computational complexity of deriving sufficient and contrastive explanations for machine learning models. The field organizes around four main branches. Theoretical Complexity Analysis examines the fundamental hardness of computing explanations, establishing lower bounds and complexity classes for various explanation types. Efficient Algorithms for Explanation Generation develops practical methods that balance computational cost with explanation quality, often leveraging heuristics or approximations. Application-Driven Explanation Methods tailor explanation techniques to specific domains such as healthcare, recommendations, or reinforcement learning, where domain constraints shape both the form and feasibility of explanations. Foundational Concepts and Human-Centered Explanation addresses the conceptual underpinnings of what makes an explanation sufficient or contrastive, and how these formal notions align with human interpretability needs.

Works like Contrastive Explanation[1] and Necessary Sufficient Explanations[6] illustrate how different branches intersect, combining formal definitions with algorithmic considerations. A particularly active line of work explores the tension between expressiveness and tractability in explanation generation. Some studies focus on contrastive methods that highlight minimal differences between predictions, as seen in Efficient Contrastive Explanations[4] and Minimally Contrastive Counterfactuals[7], while others emphasize sufficient explanations that capture all relevant features.

Unifying Provable Explanations[0] sits within the Theoretical Complexity Analysis branch, specifically addressing unified frameworks that bridge sufficient and contrastive notions. Its emphasis on provable guarantees and complexity characterization contrasts with more heuristic approaches in the algorithmic branch, yet complements neighboring work like Necessary Sufficient Explanations[6], which also examines the interplay between these two explanation paradigms. Recent efforts such as Hard to Explain[9] and Feature Relevancy Complexity[18] further probe the inherent difficulty of explanation tasks, underscoring ongoing questions about when explanations can be computed efficiently and when fundamental barriers arise.

Claimed Contributions

Unified framework for analyzing explanations via value function minimization

The authors propose a unified framework that characterizes various explanation types (sufficient, contrastive, probabilistic, non-probabilistic, local, and global) as minimization tasks over a single probabilistic value function. They identify three key properties of this function—monotonicity, submodularity, and supermodularity—that influence computational complexity.

1 retrieved paper
Discovery of structural properties distinguishing global from local value functions

The authors prove that global value functions exhibit monotonicity, submodularity (for contrastive reasons), and supermodularity (for sufficient reasons), whereas local value functions lack these properties. This structural difference enables new tractability results for global explanations.

10 retrieved papers
Novel polynomial-time algorithms and approximation guarantees for global explanations

The authors establish that subset-minimal global explanations can be computed in polynomial time for models including neural networks, decision trees, and tree ensembles under empirical distributions. They also provide constant-factor approximation algorithms for cardinally minimal global explanations, contrasting sharply with NP-hardness and inapproximability results in the local setting.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution

Unified framework for analyzing explanations via value function minimization

The authors propose a unified framework that characterizes various explanation types (sufficient, contrastive, probabilistic, non-probabilistic, local, and global) as minimization tasks over a single probabilistic value function. They identify three key properties of this function—monotonicity, submodularity, and supermodularity—that influence computational complexity.
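To make the framework concrete, the following is a minimal Python sketch, under our own simplifying assumptions, of how an explanation can be framed as minimization over a value function: a value function maps a feature subset to a score, and an explanation is a smallest subset whose value meets a threshold. The `toy_value` function (a local sufficient-reason score for an AND model) and the brute-force search are illustrative inventions, not the paper's construction.

```python
from itertools import combinations

def minimal_explanation(value_fn, n_features, threshold=1.0):
    """Brute-force search for a cardinally minimal explanation:
    the smallest feature subset S with value_fn(S) >= threshold.
    Exponential in n_features; for illustration only."""
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            if value_fn(frozenset(subset)) >= threshold:
                return frozenset(subset)
    return None

# Toy local value function for f(x) = AND(x1, x2) at x = (1, 1, 0):
# v(S) = 1 iff fixing the features in S forces the prediction to 1,
# which requires S to contain both feature 0 and feature 1.
def toy_value(S):
    return 1.0 if {0, 1} <= S else 0.0

print(sorted(minimal_explanation(toy_value, 3)))  # [0, 1]
```

A probabilistic variant would replace the 0/1 score with a probability over an input distribution, which is the unified view the paper describes.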

Contribution

Discovery of structural properties distinguishing global from local value functions

The authors prove that global value functions exhibit monotonicity, submodularity (for contrastive reasons), and supermodularity (for sufficient reasons), whereas local value functions lack these properties. This structural difference enables new tractability results for global explanations.
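Submodularity is the diminishing-returns property that such tractability results typically exploit. As a hedged illustration (the example value functions below are our own toys, not the paper's global value functions), one can check the property exhaustively on a small ground set:

```python
from itertools import combinations

def is_submodular(value_fn, ground_set):
    """Numerically check submodularity (diminishing returns):
    v(S + i) - v(S) >= v(T + i) - v(T) for all S subset of T, i not in T.
    Exponential check; only feasible for small ground sets."""
    subsets = [frozenset(c) for r in range(len(ground_set) + 1)
               for c in combinations(sorted(ground_set), r)]
    for S in subsets:
        for T in subsets:
            if not S <= T:
                continue
            for i in ground_set - T:
                gain_S = value_fn(S | {i}) - value_fn(S)
                gain_T = value_fn(T | {i}) - value_fn(T)
                if gain_S < gain_T - 1e-12:
                    return False
    return True

ground = frozenset({0, 1, 2})
# A concave function of |S| (coverage-like) is submodular;
# a convex function of |S| such as |S|^2 is supermodular instead.
print(is_submodular(lambda S: min(len(S), 2), ground))  # True
print(is_submodular(lambda S: len(S) ** 2, ground))     # False
```

The paper's claimed dichotomy is that global value functions pass checks of this kind by construction, while local ones can fail all of them.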

Contribution

Novel polynomial-time algorithms and approximation guarantees for global explanations

The authors establish that subset-minimal global explanations can be computed in polynomial time for models including neural networks, decision trees, and tree ensembles under empirical distributions. They also provide constant-factor approximation algorithms for cardinally minimal global explanations, contrasting sharply with NP-hardness and inapproximability results in the local setting.
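For monotone submodular value functions, the classic greedy algorithm for submodular cover (Wolsey-style) gives a logarithmic-factor approximation of the smallest covering set, which is the standard route to constant- or log-factor guarantees of the kind described above. The sketch below is a generic greedy cover, not the paper's algorithm; `flip_sets` and `coverage` are a hypothetical toy standing in for a global contrastive-style value (the fraction of a small dataset whose prediction flips when the features in S are modified).

```python
def greedy_cover(value_fn, ground_set, target):
    """Greedy submodular cover: repeatedly add the feature with the
    largest marginal gain until value_fn reaches the target."""
    S = frozenset()
    while value_fn(S) < target:
        remaining = ground_set - S
        if not remaining:
            return None  # target unreachable
        best = max(remaining, key=lambda i: value_fn(S | {i}) - value_fn(S))
        if value_fn(S | {best}) <= value_fn(S):
            return None  # no further progress possible
        S |= {best}
    return S

# Hypothetical toy: each frozenset lists the features whose modification
# flips one dataset instance; coverage(S) is the fraction flipped by S.
flip_sets = [frozenset({0}), frozenset({0, 1}), frozenset({2})]
def coverage(S):
    return sum(1 for F in flip_sets if F & S) / len(flip_sets)

print(sorted(greedy_cover(coverage, frozenset({0, 1, 2, 3}), 1.0)))  # [0, 2]
```

Coverage functions like this are monotone and submodular, so the greedy set is provably close to the cardinally minimal one; the contrast claimed in the paper is that no such structure is available locally, where the problem is NP-hard and inapproximable.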