
A package to visualize decision trees as Sankey diagrams using Plotly.


DecisionTree-to-Sankey

DecisionTree-to-Sankey is a Python library that visualizes decision trees (from scikit-learn) as interactive Sankey diagrams using Plotly. Decision trees are known for their interpretability, but large trees can become hard to read when plotted traditionally. This library presents decision trees as Sankey diagrams, where nodes can be dragged to adjust for overlapping labels, and conditions can be inspected interactively by hovering over the branches.

Features

  • Interactive Sankey Diagram: Visualize decision trees with adjustable nodes and hover-over conditions.
  • Improved Readability: Handles overlapping labels by allowing users to drag nodes.
  • Easy Integration: Use this tool with any decision tree created by scikit-learn.
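To make the tree-as-Sankey idea concrete, the sketch below turns a tree's parent/child arrays into Sankey links with a hover condition per branch. It is illustrative only and is not taken from this library's implementation. The input lists mirror scikit-learn's `tree_.children_left`, `tree_.children_right`, `tree_.n_node_samples`, `tree_.feature` and `tree_.threshold` arrays, but the values here are hand-written for a hypothetical five-node tree, and `tree_to_links` is a hypothetical helper name.

```python
# Illustrative sketch only: not the library's internal implementation.
TREE_LEAF = -1  # scikit-learn marks "no child" with -1


def tree_to_links(children_left, children_right, n_node_samples,
                  feature, threshold, feature_names):
    """Return (source, target, value, label) lists, one entry per branch."""
    source, target, value, label = [], [], [], []
    for parent, (left, right) in enumerate(zip(children_left, children_right)):
        if left == TREE_LEAF:  # leaf node: no outgoing links
            continue
        name = feature_names[feature[parent]]
        # In scikit-learn, the left branch is "feature <= threshold".
        for child, op in ((left, "<="), (right, ">")):
            source.append(parent)
            target.append(child)
            value.append(n_node_samples[child])  # link width = samples reaching child
            label.append(f"{name} {op} {threshold[parent]:.2f}")
    return source, target, value, label


# Hypothetical tree: node 0 splits into leaf 1 and node 2;
# node 2 splits into leaves 3 and 4.
children_left = [1, -1, 3, -1, -1]
children_right = [2, -1, 4, -1, -1]
n_node_samples = [100, 40, 60, 25, 35]
feature = [0, -2, 1, -2, -2]          # -2 marks a leaf
threshold = [2.5, -2.0, 6.5, -2.0, -2.0]
feature_names = ["Feature1", "Feature2"]

source, target, value, label = tree_to_links(
    children_left, children_right, n_node_samples,
    feature, threshold, feature_names,
)
for s, t, v, text in zip(source, target, value, label):
    print(f"{s} -> {t} ({v} samples): {text}")
```

The four lists have the shape that Plotly's `go.Sankey(link=dict(source=..., target=..., value=..., label=...))` expects, so each branch becomes one link whose width is the number of samples that follow it.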

Installation

  • Option 1: Install via pip. The package is published on PyPI:
pip install decisiontree-to-sankey
  • Option 2: Manual installation. Clone the repository and set up the environment with conda:
  1. Clone the repository and change into it:
git clone https://github.com/LukeADay/DecisionTree-to-Sankey.git
cd DecisionTree-to-Sankey
  2. Create the conda environment:
conda env create -f environment.yml
  3. Activate the environment:
conda activate tree-sankey-visualizer
  4. Install the package:
pip install .

Usage

Example Code

After installing the library, you can import and use it as follows:

from decisiontree_to_sankey import DecisionTree_to_Sankey
from sklearn.tree import DecisionTreeClassifier
import pandas as pd

# Sample data
data = pd.DataFrame({
    'Feature1': [1, 2, 3, 4],
    'Feature2': [5, 6, 7, 8]
})
target = [0, 1, 0, 1]

# Train a decision tree
clf = DecisionTreeClassifier()
clf.fit(data, target)

# Create and visualize the Sankey diagram
dt_sankey = DecisionTree_to_Sankey(clf, data)
dt_sankey.create_sankey()  # Displays the interactive Sankey diagram

Warning: check the complexity of the tree before plotting. A tree that is too complex will not work, and where that limit lies requires judgement. To see the complexity of the trained model:

try:
    n_leaves = clf.get_n_leaves()
    depth = clf.get_depth()
    print(f"Classifier is trained with {n_leaves} leaves and depth {depth}.")
except AttributeError:
    # scikit-learn's NotFittedError is a subclass of AttributeError
    print("The classifier is not trained (empty).")
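One way to keep a tree small enough to plot is to cap its size at training time. The sketch below is a minimal example using scikit-learn's standard `max_depth` and `max_leaf_nodes` parameters on the bundled iris dataset. The limits chosen (depth 3, 8 leaves) are arbitrary illustrations, not thresholds defined or recommended by this library.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Cap the tree size so the resulting diagram stays readable.
# These limits are illustrative assumptions, not library defaults.
clf = DecisionTreeClassifier(max_depth=3, max_leaf_nodes=8, random_state=0)
clf.fit(X, y)

n_leaves = clf.get_n_leaves()
depth = clf.get_depth()
print(f"Tree has {n_leaves} leaves and depth {depth}.")
```

If the diagram is still too crowded, lowering either limit and retraining gives a coarser tree at the cost of some accuracy.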

Output Example

The following is an example of the Sankey diagram output. Nodes overlap initially, but the interactive version allows you to drag nodes around for better readability:

Sankey Diagram

See examples/sankey_diagram.html for an interactive version.

Repository Structure

├── LICENSE
├── README.md
├── conda_requirements.txt
├── environment.yml
├── examples
│   ├── __init__.py
│   ├── penguine_dataset_example.ipynb
│   ├── penguine_dataset_example.py
│   ├── sankey_diagram.html
│   └── sankey_diagram.png
├── requirements.txt
├── setup.py
├── src
│   ├── DecisionTree_To_Sankey.py
│   ├── __init__.py
│   └── __pycache__
│       ├── DecisionTree_To_Sankey.cpython-310.pyc
│       └── __init__.cpython-310.pyc
└── tests
    ├── __pycache__
    │   └── test_decisiontree_to_sankey.cpython-310.pyc
    └── test_decisiontree_to_sankey.py
  • The src/DecisionTree_To_Sankey.py module contains the core DecisionTree_to_Sankey class.
  • Examples of how to use the library are available in the examples/ folder.
  • Unit tests are provided in the tests/ folder.

License

This project is licensed under the MIT License - see the LICENSE file for details.
