Flow-Disentangled Feature Importance

FDFI - Flow-Disentangled Feature Importance

License: MIT | Python 3.8+

A Python library for computing feature importance using disentangled methods, inspired by SHAP.

Overview

FDFI (Flow-Disentangled Feature Importance) is a Python module that provides interpretable machine learning explanations through disentangled feature importance methods. This package implements both DFI (Disentangled Feature Importance) and FDFI (Flow-DFI) methods. Similar to SHAP, FDFI helps you understand which features are driving your model's predictions.

Features

  • 🎯 Multiple Explainer Types: Tree, Linear, and Kernel explainers for different model types
  • 🧭 OT-Based DFI: Gaussian OT (OTExplainer) and Entropic OT (EOTExplainer)
  • 🌊 Flow-DFI: FlowExplainer with CPI and SCPI methods for non-Gaussian data
  • 📊 Rich Visualizations: Summary, waterfall, force, and dependence plots
  • 🔧 Easy to Use: Simple API similar to SHAP
  • 🚀 Extensible: Built with modularity in mind for future enhancements
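
To make the Gaussian OT item above concrete: for Gaussian data, the optimal transport map onto a standard normal is the whitening transform $z = \Sigma^{-1/2}(x - \mu)$, which yields decorrelated coordinates. The sketch below illustrates that map with NumPy only. It is not FDFI's OTExplainer code, and the helper name `gaussian_ot_whiten` is hypothetical.

```python
import numpy as np

def gaussian_ot_whiten(X):
    """Map rows of X to decorrelated coordinates Z = (X - mu) @ Sigma^{-1/2}.

    Hypothetical helper for illustration; not part of the fdfi package.
    """
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    # Symmetric inverse square root of Sigma via its eigendecomposition.
    evals, evecs = np.linalg.eigh(Sigma)
    inv_sqrt = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T
    return (X - mu) @ inv_sqrt, mu, inv_sqrt

rng = np.random.default_rng(0)
# Mixing matrix that makes the three features correlated.
A = np.array([[1.0, 0.0, 0.0],
              [0.8, 0.6, 0.0],
              [0.3, 0.2, 0.9]])
X = rng.normal(size=(2000, 3)) @ A.T

Z, mu, inv_sqrt = gaussian_ot_whiten(X)
cov_Z = np.cov(Z, rowvar=False)  # close to the identity matrix
```

Importance computed on the columns of `Z` is no longer confounded by linear correlation between the original features, which is the motivation for working in a disentangled space.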

Installation

From Source

git clone https://github.com/jaydu1/FDFI.git
cd FDFI
pip install -e .

Dependencies

Optional dependencies are declared as extras in pyproject.toml. Install the ones you need:

pip install -e ".[dev]"
pip install -e ".[plots]"
pip install -e ".[flow]"

Quick Start

import numpy as np
from fdfi.explainers import OTExplainer

# Define your model
def model(X):
    return X.sum(axis=1)

# Create background data
X_background = np.random.randn(100, 10)

# Create an explainer
explainer = OTExplainer(model, data=X_background, nsamples=50)

# Explain test instances
X_test = np.random.randn(10, 10)
results = explainer(X_test)

# Confidence intervals (post-hoc)
ci = explainer.conf_int(alpha=0.05, target="X", alternative="two-sided")
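
As a rough picture of what a post-hoc interval involves, the sketch below builds a two-sided normal-approximation interval from per-sample importance scores. The array `scores` and the helper `normal_conf_int` are hypothetical, and whether `conf_int` uses this exact construction is not stated here, so treat it as an illustration only.

```python
from statistics import NormalDist

import numpy as np

def normal_conf_int(scores, alpha=0.05):
    """Two-sided (1 - alpha) normal-approximation interval for column means.

    Hypothetical helper; `scores` has shape (n_samples, n_features).
    """
    n = scores.shape[0]
    mean = scores.mean(axis=0)
    se = scores.std(axis=0, ddof=1) / np.sqrt(n)   # standard error of the mean
    z = NormalDist().inv_cdf(1 - alpha / 2)        # e.g. about 1.96 for alpha=0.05
    return mean - z * se, mean + z * se

rng = np.random.default_rng(0)
# Synthetic per-sample scores for three features with true means 0, 1, 2.
scores = rng.normal(loc=[0.0, 1.0, 2.0], scale=1.0, size=(500, 3))
lower, upper = normal_conf_int(scores, alpha=0.05)
```

A feature whose interval excludes zero would be flagged as important at level `alpha` under this approximation.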

EOT Options (Entropic OT)

EOTExplainer supports adaptive epsilon, stochastic transport sampling, and Gaussian/empirical targets:

from fdfi.explainers import EOTExplainer

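# Here `model` is assumed to be a fitted estimator exposing .predict
# (unlike the plain function in Quick Start); X_background and X_test
# are the arrays defined there.
explainer = EOTExplainer(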
    model.predict,
    X_background,
    auto_epsilon=True,
    stochastic_transport=True,
    n_transport_samples=10,
    target="gaussian",  # or "empirical"
)
results = explainer(X_test)
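
For readers unfamiliar with entropic OT, the following NumPy sketch shows the underlying idea: Sinkhorn iterations produce a soft transport plan, and a target point can then be sampled from a row of that plan. It is a generic illustration, not EOTExplainer's implementation. The function `sinkhorn_plan` and the median-cost choice of epsilon are assumptions made for this example.

```python
import numpy as np

def sinkhorn_plan(X, Y, epsilon=None, n_iter=2000):
    """Entropic OT plan between uniform empirical measures on X and Y.

    Hypothetical helper for illustration; not part of the fdfi package.
    """
    n, m = X.shape[0], Y.shape[0]
    # Squared Euclidean cost matrix, shape (n, m).
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    if epsilon is None:
        # Assumed heuristic: tie the regularization strength to the median cost.
        epsilon = np.median(C)
    K = np.exp(-C / epsilon)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)      # rescale to match the source marginal
        v = b / (K.T @ u)    # rescale to match the target marginal
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2)) * [1.0, 2.0]   # source sample
Y = rng.normal(size=(40, 2))                # Gaussian-style target sample
P = sinkhorn_plan(X, Y)

# Stochastic transport: draw a target point for X[0] from its row of the plan.
row = P[0] / P[0].sum()
j = rng.choice(len(Y), p=row)
x0_transported = Y[j]
```

Smaller values of epsilon make the plan closer to a hard assignment but slow convergence, which is one reason an adaptive choice can be convenient.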

Flow-DFI with FlowExplainer

FlowExplainer uses normalizing flows for non-Gaussian data, supporting both CPI (Conditional Permutation Importance) and SCPI (Sobol-CPI):

  • CPI: Average predictions first, then squared difference: $(Y - E[f(\tilde{X})])^2$
  • SCPI: Squared differences first, then average: $E[(Y - f(\tilde{X}_b))^2]$
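
The difference between the two orderings can be checked numerically. The sketch below uses synthetic numbers in place of real model predictions, so the arrays `y` and `preds` are stand-ins and nothing here calls FDFI. It evaluates both per-point loss terms and shows that they differ by the variance of the predictions across counterfactual draws.

```python
import numpy as np

rng = np.random.default_rng(0)
n, B = 200, 50                      # test points, counterfactual draws per point

y = rng.normal(size=n)              # stand-in for observed responses Y
# preds[i, b] stands in for f(X_tilde_b) at test point i (synthetic values).
preds = y[:, None] + rng.normal(scale=0.5, size=(n, B))

# CPI: average predictions over draws first, then square the residual.
cpi = (y - preds.mean(axis=1)) ** 2
# SCPI: square each residual first, then average over draws.
scpi = ((y[:, None] - preds) ** 2).mean(axis=1)

# The gap between the two is the variance of the predictions across draws.
gap = preds.var(axis=1)
```

Because the gap is a variance, the SCPI term is never smaller than the CPI term for the same draws.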

from fdfi.explainers import FlowExplainer

# Create explainer with CPI (default)
explainer = FlowExplainer(
    model.predict,
    X_background,
    fit_flow=True,
    method='cpi',     # 'cpi', 'scpi', or 'both'
    num_steps=200,    # flow training steps
    nsamples=50,      # counterfactual samples
    sampling_method='resample',  # 'resample', 'permutation', 'normal', 'condperm'
)

results = explainer(X_test)
# results['phi_Z']: Z-space importance
# results['phi_X']: same as phi_Z (Z-space methods)

# Confidence intervals
ci = explainer.conf_int(alpha=0.05, target="Z", alternative="two-sided")
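
The `sampling_method` names suggest different ways of refreshing one latent coordinate while leaving the others fixed. A minimal NumPy sketch of three of them follows. The function `refresh_coordinate` and the interpretation of each option are assumptions made for illustration, not FDFI's code, and 'condperm' is omitted because its behavior is not described here.

```python
import numpy as np

def refresh_coordinate(Z, j, method, rng):
    """Return a copy of Z with column j replaced by fresh values.

    Hypothetical helper; the meaning assigned to each method is an assumption.
    """
    Z_new = Z.copy()
    n = Z.shape[0]
    if method == "resample":
        # Draw with replacement from the observed values of column j.
        Z_new[:, j] = rng.choice(Z[:, j], size=n, replace=True)
    elif method == "permutation":
        # Shuffle the observed values of column j across rows.
        Z_new[:, j] = rng.permutation(Z[:, j])
    elif method == "normal":
        # Draw fresh standard normal values, ignoring the observed ones.
        Z_new[:, j] = rng.standard_normal(n)
    else:
        raise ValueError(f"unknown method: {method}")
    return Z_new

rng = np.random.default_rng(0)
Z = rng.standard_normal((100, 4))   # stand-in for latent (Z-space) features
Z_perm = refresh_coordinate(Z, j=1, method="permutation", rng=rng)
Z_res = refresh_coordinate(Z, j=1, method="resample", rng=rng)
Z_norm = refresh_coordinate(Z, j=1, method="normal", rng=rng)
```

In each case only column `j` changes, so comparing model outputs before and after isolates the contribution of that one coordinate.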

For advanced users, flow models can be trained separately:

from fdfi.models import FlowMatchingModel

# Train flow model externally
flow_model = FlowMatchingModel(X_background, dim=X_background.shape[1])
flow_model.fit(num_steps=500, verbose='final')

# Set pre-trained flow
explainer = FlowExplainer(model.predict, X_background, fit_flow=False)
explainer.set_flow(flow_model)

Project Structure

FDFI/
├── fdfi/                  # Main package directory
│   ├── __init__.py        # Package initialization
│   ├── explainers.py      # Explainer classes
│   ├── plots.py           # Visualization functions
│   └── utils.py           # Utility functions
├── tests/                 # Test suite
│   ├── test_explainers.py
│   ├── test_plots.py
│   └── test_utils.py
├── docs/                  # Documentation & tutorials
│   └── tutorials/         # Jupyter notebook tutorials
├── pyproject.toml         # Package configuration
└── README.md              # This file

Development Status

🚧 This is starter code for DFI development. The core structure and API are in place, but full implementations are coming soon.

Current status:

  • ✅ Package structure established
  • ✅ Base classes and interfaces defined
  • ✅ Testing framework set up
  • ✅ Documentation structure created
  • 🚧 Core algorithms (in development)
  • 🚧 Visualization functions (in development)

Testing

Run the test suite:

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov=fdfi --cov-report=html

Documentation

Full documentation and tutorials are available in the docs/ directory.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

References

FDFI is based on:

  • Du, J.-H., Roeder, K., & Wasserman, L. (2025). Disentangled Feature Importance. arXiv preprint arXiv:2507.00260.
  • Chen, X., Guo, Y., & Du, J.-H. (2026). Flow-Disentangled Feature Importance. In The Fourteenth International Conference on Learning Representations (ICLR).

Related work:

  • SHAP: A game theoretic approach to explain machine learning models

Citation

If you use DFI in your research, please cite:

@software{dfi2026,
  title={DFI: Python Library for Disentangled Feature Importance},
  author={DFI Team},
  year={2026},
  url={https://github.com/jaydu1/FDFI}
}

@article{du2025disentangled,
  title={Disentangled Feature Importance},
  author={Du, Jin-Hong and Roeder, Kathryn and Wasserman, Larry},
  journal={arXiv preprint arXiv:2507.00260},
  year={2025}
}

@inproceedings{chen2026flow,
  title={Flow-Disentangled Feature Importance},
  author={Chen, Xin and Guo, Yifan and Du, Jin-Hong},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}

Contact

For questions and issues, please use the GitHub issue tracker.
