Flow-Disentangled Feature Importance

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Interpretability, Feature Importance, Statistical Inference, Correlation Distortion, Uncertainty Quantification
Abstract:

Quantifying feature importance with valid statistical uncertainty is central to interpretable machine learning, yet classical model-agnostic methods often fail under feature correlation, producing unreliable attributions and compromising inference. Statistical approaches that address correlation through feature decorrelation have shown promise but remain restricted to the $\ell_2$ loss, limiting their applicability across diverse machine learning tasks. We introduce Flow-Disentangled Feature Importance (FDFI), a model-agnostic framework that resolves these limitations by combining principled statistical inference with computational flexibility. FDFI leverages flow matching to learn flexible disentanglement maps that not only handle arbitrary feature distributions but also provide an interpretable pathway for understanding how importance is attributed through the data's correlation structure. The framework generalizes decorrelation-based attribution to general differentiable loss functions, enabling statistically valid importance assessment for black-box predictors across regression and classification. We establish the statistical inference theory, deriving the semiparametric efficiency of FDFI estimators, which enables valid confidence intervals and hypothesis tests with Type I error control. Experiments demonstrate that FDFI achieves substantially higher statistical power than removal-based and conditional permutation approaches, while maintaining robust and interpretable attributions even under severe feature interdependence. These findings hold across synthetic benchmarks and a broad collection of real datasets spanning diverse domains.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces Flow-Disentangled Feature Importance (FDFI), a framework combining flow matching with statistical inference to quantify feature importance under correlation. It resides in the 'Optimal Transport and Flow-Based Disentanglement' leaf, which contains only two papers total—the original work and one sibling (Disentangled Feature Importance). This represents a sparse, emerging research direction within the broader decorrelation-based importance literature, suggesting the approach occupies relatively unexplored methodological territory compared to more crowded branches like permutation-based methods or random forest-specific techniques.

The taxonomy reveals that FDFI's parent branch ('Decorrelation and Disentanglement-Based Importance') sits alongside permutation-based methods, model-agnostic inference frameworks, and random forest-specific approaches. The sibling leaf ('Local Weighting and Decorrelation Schemes') contains methods applying simpler decorrelation transformations, while neighboring branches like 'LOCO and Shapley-Based Importance' and 'Hypothesis Testing and Significance Tests' pursue statistical inference through different mechanisms. FDFI bridges optimal transport theory with model-agnostic inference, diverging from permutation schemes that modify sampling strategies and from local weighting methods that avoid explicit transport-based disentanglement.

Among twenty-five candidates examined, the FDFI framework contribution shows no clear refutation across nine candidates, while the semiparametric efficiency theory encountered one refutable candidate among ten examined, and the generalization to general loss functions found no refutations across six candidates. The limited search scope—top-K semantic matches plus citation expansion—means these statistics reflect a focused sample rather than exhaustive coverage. The framework and loss generalization contributions appear more novel within this sample, while the efficiency theory overlaps with at least one prior work among the candidates reviewed.

Based on the limited search scope of twenty-five candidates, FDFI appears to occupy a methodologically distinct position combining flow-based disentanglement with statistical inference. The sparse taxonomy leaf and low refutation rates suggest novelty, though the analysis does not cover the full literature landscape. The semiparametric efficiency component shows some prior overlap, warranting closer examination of theoretical claims against existing inference frameworks.

Taxonomy

Core-task Taxonomy Papers: 50
Claimed Contributions: 3
Contribution Candidate Papers Compared: 25
Refutable Paper: 1

Research Landscape Overview

Core task: quantifying feature importance under correlated predictors with statistical inference. The field addresses a fundamental challenge in machine learning and statistics: how to reliably measure which features matter when predictors are interdependent. The taxonomy reveals several complementary strategies: permutation-based methods that shuffle feature values to assess impact; decorrelation and disentanglement approaches that isolate individual feature contributions by breaking or modeling dependencies; random forest-specific techniques that exploit ensemble properties; model-agnostic frameworks offering broader applicability; generative and adversarial methods that learn conditional distributions; variance-based sensitivity analyses rooted in global uncertainty quantification; general explainability tools; variable selection procedures; and domain-specific case studies.

Early work like Conditional Variable Importance[8] and methods such as Correlation Random Forests[1] illustrate how researchers have long grappled with correlation-induced biases, while newer efforts like Conditional Adversarial Forests[2] and Disentangled Feature Importance[9] push toward more sophisticated conditional modeling. A particularly active line of inquiry centers on decorrelation and disentanglement strategies, where the goal is to separate the unique contribution of each feature from confounding correlations. Flow Disentangled[0] sits squarely in this branch, employing optimal transport and flow-based techniques to disentangle feature effects, an approach closely related to Disentangled Feature Importance[9], which also targets the isolation of individual feature signals. These methods contrast with permutation-based alternatives like Cross Validated Permutation[3] and Corrected Permutation Importance[18], which address correlation by modifying the permutation scheme rather than explicitly modeling dependencies.

Meanwhile, works such as Decorrelated Local Weighting[7] and Decorrelated Variable Importance[49] explore local or global decorrelation transformations. The central tension across these branches involves trade-offs among computational complexity, model assumptions, and the interpretability of the resulting importance scores, with Flow Disentangled[0] emphasizing rigorous disentanglement via transport-based flows to achieve more faithful attributions in high-correlation regimes.

Claimed Contributions

Flow-Disentangled Feature Importance (FDFI) framework

The authors propose FDFI, a model-agnostic framework that extends disentangled feature importance to general differentiable loss functions by replacing Gaussian optimal transport with flexible flow matching. This enables statistically valid importance assessment for black-box predictors across regression and classification tasks while handling arbitrary feature distributions.

9 retrieved papers
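The decorrelate-then-attribute idea behind this contribution can be illustrated with a minimal sketch. Below, a linear whitening transform stands in for the learned flow-matching disentanglement map (whitening is exact only for Gaussian features, which is precisely the restriction FDFI removes), and a simple permutation importance is computed in the decorrelated latent space. The toy data and helper names are illustrative, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: x2 is strongly correlated with (largely redundant to) x1,
# but only x1 drives the response.
n = 5000
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + np.sqrt(1 - 0.9**2) * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = X @ np.array([1.0, 0.0]) + 0.1 * rng.normal(size=n)

def whiten(X):
    """Stand-in for the learned disentanglement map: a linear whitening
    transform that removes correlation. FDFI instead learns a flexible
    nonlinear map via flow matching to handle arbitrary distributions."""
    mu = X.mean(axis=0)
    cov = np.cov(X - mu, rowvar=False)
    W = np.linalg.inv(np.linalg.cholesky(cov))
    return (X - mu) @ W.T, W

Z, W = whiten(X)

def latent_importance(Z, y, j, n_perm=20):
    """Permutation importance in the decorrelated latent space:
    increase in squared-error risk when latent coordinate j is shuffled.
    Because the latents are (near-)independent, shuffling one no longer
    leaks information about the others."""
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    base = np.mean((y - Z @ beta) ** 2)
    risks = []
    for _ in range(n_perm):
        Zp = Z.copy()
        Zp[:, j] = rng.permutation(Zp[:, j])
        risks.append(np.mean((y - Zp @ beta) ** 2))
    return np.mean(risks) - base

imp = [latent_importance(Z, y, j) for j in range(2)]
```

In this sketch the first latent (carrying the signal in x1) receives essentially all the importance, while the second latent (the part of x2 orthogonal to x1) receives none, which is the attribution pattern correlation would otherwise distort.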
Semiparametric efficiency theory for FDFI estimators

The authors derive the efficient influence functions and prove asymptotic normality of FDFI estimators for both latent and original feature importance scores. This theoretical foundation provides a principled basis for constructing confidence intervals and performing hypothesis testing with valid statistical inference.

10 retrieved papers
Can Refute
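The route from an efficient influence function to a confidence interval follows the standard semiparametric template, sketched below in generic notation ($\psi_j$ an importance score, $\varphi_j$ its influence function, $O_i$ the observations). These symbols are illustrative: this is the textbook recipe such results enable, not the paper's specific derivation.

```latex
% Asymptotic linearity of the estimator with influence function \varphi_j:
\[
\hat{\psi}_j - \psi_j = \frac{1}{n}\sum_{i=1}^{n} \varphi_j(O_i) + o_P(n^{-1/2}),
\qquad
\sqrt{n}\,\bigl(\hat{\psi}_j - \psi_j\bigr) \rightsquigarrow
\mathcal{N}\bigl(0,\, \operatorname{Var}[\varphi_j(O)]\bigr),
\]
% which yields a Wald confidence interval and test with Type I error control:
\[
\hat{\psi}_j \pm z_{1-\alpha/2}\,\frac{\hat{\sigma}_j}{\sqrt{n}},
\qquad
\hat{\sigma}_j^2 = \frac{1}{n}\sum_{i=1}^{n} \varphi_j(O_i)^2
\quad (\text{using } \mathbb{E}[\varphi_j(O)] = 0).
\]
```

Efficiency means $\operatorname{Var}[\varphi_j(O)]$ attains the semiparametric lower bound, so no regular estimator of $\psi_j$ has asymptotically narrower intervals.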
Generalization of feature importance measures to general loss functions

The authors analyze LOCO, CPI, and SCPI under general differentiable loss functions, establishing their formal equivalence under squared-error loss and identifying their shared vulnerability to correlation distortion. They then extend the DFI framework beyond squared-error loss to arbitrary differentiable losses, broadening applicability to classification and other tasks.

6 retrieved papers
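The relationship between LOCO and CPI analyzed here can be sketched numerically under squared-error loss. The snippet uses a linear model with independent features, so conditional resampling for CPI reduces to marginal permutation; with correlated features one would need a model of $X_j \mid X_{-j}$, which is exactly where correlation distortion enters. Helper names are illustrative, and the factor-of-two gap shown is the known population behavior for this toy setting, not a claim from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Independent features; only the first two carry signal.
n, p = 4000, 3
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=n)

def fit(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def risk(X, y, beta):
    return np.mean((y - X @ beta) ** 2)

def loco(X, y, j):
    """LOCO: refit without feature j and compare squared-error risk."""
    full = risk(X, y, fit(X, y))
    Xr = np.delete(X, j, axis=1)
    return risk(Xr, y, fit(Xr, y)) - full

def cpi(X, y, j, n_draw=20):
    """CPI: replace X_j by an independent draw from its distribution
    (marginal = conditional here), keeping the fitted model fixed."""
    beta = fit(X, y)
    base = risk(X, y, beta)
    risks = []
    for _ in range(n_draw):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])
        risks.append(risk(Xp, y, beta))
    return np.mean(risks) - base

# For a linear model with independent features, CPI targets roughly
# twice the LOCO quantity under squared loss (both rank features the
# same way); the methods diverge once features are correlated.
loco_0, cpi_0 = loco(X, y, 0), cpi(X, y, 0)
```

Running this, `loco_0` is close to the population value $\beta_0^2 \operatorname{Var}(X_0) = 4$ and `cpi_0` is close to twice that, while both measures vanish for the noise feature.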

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Contribution Analysis

Detailed comparisons for each claimed contribution
