Flow-Disentangled Feature Importance
FDFI - Flow-Disentangled Feature Importance
A Python library for computing feature importance using disentangled methods, inspired by SHAP.
Current release: 0.0.7
Overview
FDFI (Flow-Disentangled Feature Importance) is a Python module that provides interpretable machine learning explanations through disentangled feature importance methods. This package implements both DFI (Disentangled Feature Importance) and FDFI (Flow-DFI) methods. Similar to SHAP, FDFI helps you understand which features are driving your model's predictions.
Features
- 🎯 Multiple Explainer Types: Tree, Linear, and Kernel explainers for different model types
- 🧭 OT-Based DFI: Gaussian OT (OTExplainer) and Entropic OT (EOTExplainer)
- 🌊 Flow-DFI: FlowExplainer with CPI and SCPI methods for non-Gaussian data
- 📊 Rich Visualizations: Summary, waterfall, force, and dependence plots
- 🔧 Easy to Use: Simple API similar to SHAP
- 🧪 Statistical Inference: Confidence intervals and multiple testing correction (FDR/FWER)
- 🚀 Extensible: Built with modularity in mind for future enhancements
Installation
From Source
git clone https://github.com/jaydu1/FDFI.git
cd FDFI
pip install -e .
Dependencies
Optional dependencies are installed through the extras defined in pyproject.toml:
pip install -e ".[dev]"
pip install -e ".[flow]"
Quick Start
import numpy as np
from fdfi.explainers import OTExplainer
# Define your model
def model(X):
    return X.sum(axis=1)
# Create background data
X_background = np.random.randn(100, 10)
# Create an explainer
explainer = OTExplainer(model, data=X_background, nsamples=50)
# Explain test instances
X_test = np.random.randn(10, 10)
results = explainer(X_test)
# Confidence intervals (post-hoc)
ci = explainer.conf_int(alpha=0.05, target="X", alternative="two-sided")
# With multiple testing correction (e.g., FDR control)
ci_fdr = explainer.conf_int(multitest_method="fdr_bh")
explainer.summary(multitest_method="fdr_bh")
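For readers unfamiliar with the "fdr_bh" option, the Benjamini-Hochberg adjustment it names can be sketched in plain NumPy. This is a standalone illustration of the standard procedure, not FDFI's internal code; the helper name bh_adjust and the example p-values are hypothetical.

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (hypothetical helper, not part of fdfi)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted

# Made-up p-values for four features
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.005])
significant = adjusted <= 0.05
```

A feature is flagged when its adjusted p-value falls at or below the chosen FDR level.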
Visualization
FDFI includes static Matplotlib plotting helpers for global scores, per-sample UEIFs, confidence intervals, diagnostics, and feature correlation.
from fdfi.plots import (
    confidence_interval_plot,
    correlation_heatmap,
    diagnostics_plot,
    summary_bar,
    summary_plot,
)
feature_names = [f"X{i}" for i in range(X_background.shape[1])]
# Background correlation structure
correlation_heatmap(X_background, feature_names, show=False)
# Global scores and standard errors from explainer output
summary_bar(results["phi_X"], results["se_X"], feature_names, show=False)
# Per-sample UEIF distribution after running the explainer
summary_plot(explainer.ueifs_X, features=X_test, feature_names=feature_names, show=False)
# Inference and quality checks
confidence_interval_plot(ci, feature_names=feature_names, show=False)
diagnostics_plot(explainer.diagnostics, feature_names=feature_names, show=False)
CI Defaults in v0.0.2
By default, conf_int() now uses:
- var_floor_method="mixture"
- margin_method="mixture"
This improves stability for weak effects and avoids ad hoc thresholding in many use cases. You can still override both methods explicitly if needed.
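As a point of reference, a plain normal-approximation interval built from an importance estimate and its standard error looks like the sketch below. It assumes asymptotic normality and deliberately omits the variance floor and margin adjustments that conf_int() applies, so its numbers may differ from the library's output. The helper name wald_interval and the example values are hypothetical.

```python
from statistics import NormalDist

import numpy as np

def wald_interval(phi, se, alpha=0.05):
    """Two-sided normal-approximation interval: phi +/- z * se (illustrative only)."""
    phi = np.asarray(phi, dtype=float)
    se = np.asarray(se, dtype=float)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return phi - z * se, phi + z * se

# Made-up importance scores and standard errors for three features
phi = np.array([0.80, 0.05, 0.00])
se = np.array([0.10, 0.04, 0.02])
lower, upper = wald_interval(phi, se)
```

When the true importance is near zero, intervals of this unadjusted kind can be unstable, which is the situation the mixture defaults are meant to address.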
EOT Options (Entropic OT)
EOTExplainer supports adaptive epsilon, stochastic transport sampling, and Gaussian or empirical targets:
from fdfi.explainers import EOTExplainer
# Assumes `model` is a fitted estimator exposing a .predict method
explainer = EOTExplainer(
    model.predict,
    X_background,
    auto_epsilon=True,
    stochastic_transport=True,
    n_transport_samples=10,
    target="gaussian",  # or "empirical"
)
results = explainer(X_test)
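To give a feel for what entropic regularization does, here is a minimal Sinkhorn sketch that computes an entropic OT plan between two small point clouds. It is a generic textbook iteration written for illustration, not EOTExplainer's implementation; the function name sinkhorn_plan, the fixed epsilon, and the toy data are all assumptions.

```python
import numpy as np

def sinkhorn_plan(X, Y, epsilon=0.5, n_iters=500):
    """Entropic OT plan between uniform empirical measures on X and Y (sketch)."""
    n, m = X.shape[0], Y.shape[0]
    a = np.full(n, 1.0 / n)  # source weights
    b = np.full(m, 1.0 / m)  # target weights
    # Squared Euclidean cost matrix
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-C / epsilon)  # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)    # match row marginals
        v = b / (K.T @ u)  # match column marginals
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
X_src = rng.random((20, 2))
Y_tgt = rng.random((25, 2))
P = sinkhorn_plan(X_src, Y_tgt)
```

A larger epsilon yields a more diffuse plan, while a smaller one approaches unregularized OT at the cost of slower, less stable iterations. Each row of P, once normalized, is a distribution over target points from which a transported sample could be drawn.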
Flow-DFI with FlowExplainer
FlowExplainer uses normalizing flows for non-Gaussian data, supporting both CPI (Conditional Permutation Importance) and SCPI (Sobol-CPI):
- CPI: Average predictions first, then squared difference: $(Y - E[f(\tilde{X})])^2$
- SCPI: Squared differences first, then average: $E[(Y - f(\tilde{X}_b))^2]$
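The difference between the two formulas can be seen in a small NumPy sketch. It is illustrative only: the model f, the data, and the independent Gaussian resampling of one feature are stand-ins chosen for this example, not the counterfactual sampler FlowExplainer uses.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(X):
    # Hypothetical stand-in model
    return X.sum(axis=1)

n, d, B = 200, 3, 50
X = rng.normal(size=(n, d))
y = f(X) + 0.1 * rng.normal(size=n)

j = 0  # feature under study
preds = np.empty((B, n))
for b in range(B):
    X_tilde = X.copy()
    X_tilde[:, j] = rng.normal(size=n)  # placeholder counterfactual draw
    preds[b] = f(X_tilde)

# CPI: average the B predictions first, then square the residual
cpi = (y - preds.mean(axis=0)) ** 2
# SCPI: square each residual first, then average over the B draws
scpi = ((y[None, :] - preds) ** 2).mean(axis=0)
```

Per sample, the SCPI value equals the CPI value plus the variance of the B counterfactual predictions, so SCPI is never smaller than CPI.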
from fdfi.explainers import FlowExplainer
# Create explainer with CPI (default)
explainer = FlowExplainer(
    model.predict,
    X_background,
    fit_flow=True,
    method='cpi',                # 'cpi', 'scpi', or 'both'
    num_steps=200,               # flow training steps
    nsamples=50,                 # counterfactual samples
    sampling_method='resample',  # 'resample', 'permutation', 'normal', 'condperm'
)
results = explainer(X_test)
# results['phi_Z']: Z-space importance
# results['phi_X']: same as phi_Z (Z-space methods)
# Confidence intervals
ci = explainer.conf_int(alpha=0.05, target="Z", alternative="two-sided")
Explainer diagnostics (new in v0.0.2)
Disentangled explainers (OTExplainer, EOTExplainer, and FlowExplainer) report two diagnostics with qualitative labels (GOOD / MODERATE / POOR) using consistent [FDFI][DIAG] logging:
- Latent independence (median dCor) — lower is better (thresholds: <0.10 good, <0.25 moderate).
- Distribution fidelity (MMD) — lower is better (thresholds: <0.05 good, <0.15 moderate).
Example log:
[FDFI][DIAG] Flow Model Diagnostics
[FDFI][DIAG] Latent independence (median dCor): 0.0421 [GOOD] → lower is better
[FDFI][DIAG] Distribution fidelity (MMD): 0.0187 [GOOD] → lower is better
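The thresholds listed above map to labels as in the following sketch. The helper label_diagnostic is hypothetical and written only to make the cut-offs concrete; it is not the function FDFI uses internally.

```python
def label_diagnostic(value, good_below, moderate_below):
    """Map a lower-is-better diagnostic to GOOD / MODERATE / POOR (hypothetical helper)."""
    if value < good_below:
        return "GOOD"
    if value < moderate_below:
        return "MODERATE"
    return "POOR"

# Values taken from the example log above
dcor_label = label_diagnostic(0.0421, good_below=0.10, moderate_below=0.25)
mmd_label = label_diagnostic(0.0187, good_below=0.05, moderate_below=0.15)
```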
Access diagnostics directly:
diag = explainer.diagnostics
print(diag["latent_independence_median"], diag["latent_independence_label"])
print(diag["distribution_fidelity_mmd"], diag["distribution_fidelity_label"])
For advanced users, flow models can be trained separately:
from fdfi.models import FlowMatchingModel
# Train flow model externally
flow_model = FlowMatchingModel(X_background, dim=X_background.shape[1])
flow_model.fit(num_steps=500, verbose='final')
# Set pre-trained flow
explainer = FlowExplainer(model.predict, X_background, fit_flow=False)
explainer.set_flow(flow_model)
Project Structure
FDFI/
├── fdfi/ # Main package directory
│ ├── __init__.py # Package initialization
│ ├── explainers.py # Explainer classes
│ ├── plots.py # Visualization functions
│ └── utils.py # Utility functions
├── tests/ # Test suite
│ ├── test_explainers.py
│ ├── test_plots.py
│ └── test_utils.py
├── docs/ # Documentation & tutorials
│ └── tutorials/ # Jupyter notebook tutorials
├── pyproject.toml # Package configuration
└── README.md # This file
Development Status
FDFI is under active research development. The package includes implemented OT/EOT/Flow explainers, statistical inference helpers, diagnostics, plotting utilities, tests, and documentation. Some advanced modeling components continue to evolve as the methodology develops.
Testing
Run the test suite:
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run tests with coverage
pytest --cov=fdfi --cov-report=html
Documentation
Full documentation and tutorials are available in the docs/ directory:
- Quickstart Tutorial
- OT Explainer Tutorial
- EOT Explainer Tutorial
- Flow Explainer Tutorial
- Confidence Intervals
- Visualization Tutorial
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
References
FDFI is based on:
- Du, J.-H., Roeder, K., & Wasserman, L. (2025). Disentangled Feature Importance. arXiv preprint arXiv:2507.00260.
- Chen, X., Guo, Y., & Du, J.-H. (2026). Flow-Disentangled Feature Importance. In The Thirteenth International Conference on Learning Representations (ICLR).
Related work:
- SHAP: A game theoretic approach to explain machine learning models
Citation
If you use DFI in your research, please cite:
@software{dfi2026,
  title={DFI: Python Library for Disentangled Feature Importance},
  author={DFI Team},
  year={2026},
  url={https://github.com/jaydu1/FDFI}
}
@article{du2025disentangled,
  title={Disentangled Feature Importance},
  author={Du, Jin-Hong and Roeder, Kathryn and Wasserman, Larry},
  journal={arXiv preprint arXiv:2507.00260},
  year={2025}
}
@inproceedings{chen2026flow,
  title={Flow-Disentangled Feature Importance},
  author={Chen, Xin and Guo, Yifan and Du, Jin-Hong},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2026}
}
Contact
For questions and issues, please use the GitHub issue tracker.