Custom Split Decision Tree (CSDT)

CSDT is a Python library for building and using decision trees with custom split criteria. Users define their own split logic, such as the prediction and evaluation functions applied at each node, and can visualize the resulting trees.

Explore the project on the official website: Custom Split Decision Tree


Features

  • Custom Split Logic: Easily define custom split criteria for decision tree nodes.
  • Tree Visualization: Generate high-quality tree visualizations using graphviz.
  • Flexible Splitting Criteria: Works with user-defined splitting functions and evaluation metrics.
  • Seamless Integration: Fully compatible with Python data science libraries like pandas and scikit-learn.
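
To make the idea of a custom split criterion concrete, here is a minimal, library-independent sketch of how a user-supplied criterion can drive the choice of a split threshold. It is not CSDT's implementation: the names best_split and mse_around_mean are hypothetical, and the search strategy (midpoints between distinct values, size-weighted score, lower is better) is an assumption made for illustration.

```python
import numpy as np


def best_split(x, y, criterion):
    """Return the (threshold, score) pair with the lowest size-weighted criterion."""
    best_threshold, best_score = None, np.inf
    # Candidate thresholds: midpoints between consecutive distinct feature values
    values = np.unique(x)
    for lo, hi in zip(values[:-1], values[1:]):
        threshold = (lo + hi) / 2.0
        left, right = y[x <= threshold], y[x > threshold]
        # Weight each child's score by the number of samples it receives
        score = (len(left) * criterion(left) + len(right) * criterion(right)) / len(y)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


def mse_around_mean(y):
    """Example criterion: mean squared error around the node mean."""
    return float(np.mean((y - y.mean()) ** 2))


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0])
threshold, score = best_split(x, y, mse_around_mean)
print(threshold, score)
```

Swapping mse_around_mean for any other function of the node's targets changes which split is preferred without touching the search loop, which is the kind of flexibility the features above describe.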

Installation

You can install the library via pip:

pip install csdt

Alternatively, you can clone the repository and install it locally:

git clone https://github.com/sibirbil/CSDT.git
cd CSDT
pip install -e .

Usage

Here's an example of how to use CSDT to create and visualize a decision tree:

import numpy as np
import pandas as pd
from csdt import CSDT, split_criteria_with_methods

# Sample data
data = pd.DataFrame({
    "feature1": [1, 2, 3, 4, 5],
    "feature2": [5, 4, 3, 2, 1],
    "target": [1, 0, 1, 0, 1]
})

X = data[["feature1", "feature2"]]
y = data[["target"]]


# Prediction function: the mean of the targets in a node
def return_mean(y, x):
    return y.mean(axis=0).astype(np.float64)


# Evaluation function: mean squared error of the predictions
def calculate_mse(y, predictions, initial_solutions):
    errors = y - predictions
    squared_errors = errors ** 2
    mse = np.mean(squared_errors)
    return np.float64(mse)


# Combine the prediction and evaluation functions into one split criterion
def split_criteria(y, x, initial_solutions):
    return split_criteria_with_methods(
        y, x,
        pred=return_mean,
        split_criteria=calculate_mse,
        initial_solutions=initial_solutions,
    )


# Initialize the tree
tree = CSDT(
    max_depth=3,
    min_samples_split=2,
    min_samples_leaf=1,
    verbose=True,
    split_criteria=split_criteria,
    use_hashmaps=True,
)

# Fit the tree
tree.fit(X, y)

# Visualize the tree
dot = tree.draw_tree()
dot.render("decision_tree", format="png")

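This code fits the tree and saves its visualization as decision_tree.png. Rendering with graphviz requires the Graphviz executables to be installed on your system, in addition to the graphviz Python package.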
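
The prediction and evaluation functions in the example can be swapped for others with the same signatures. The sketch below shows a median predictor scored by mean absolute error. The names return_median and calculate_mae are hypothetical, the signatures are copied from return_mean and calculate_mse above, and the functions are exercised here on plain NumPy arrays rather than inside CSDT, so how CSDT itself calls them is an assumption.

```python
import numpy as np


def return_median(y, x):
    """Predict the column-wise median of the targets in a node."""
    return np.median(np.asarray(y, dtype=np.float64), axis=0)


def calculate_mae(y, predictions, initial_solutions=None):
    """Score a node by the mean absolute error of its predictions."""
    errors = np.asarray(y, dtype=np.float64) - predictions
    return np.float64(np.mean(np.abs(errors)))


# Standalone check on the target column from the sample data
y_node = np.array([[1.0], [0.0], [1.0], [0.0], [1.0]])
pred = return_median(y_node, None)
score = calculate_mae(y_node, pred)
print(pred, score)
```

If these behave as intended, they could presumably be passed as pred=return_median and split_criteria=calculate_mae in the call to split_criteria_with_methods shown in the example.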


Requirements

To use CSDT, make sure you have the following dependencies installed:

  • matplotlib==3.8.1
  • numpy==2.1.0
  • pandas==2.2.3
  • scikit-learn==1.3.2
  • graphviz
  • gurobi=10.0.3
  • pip=23.3.1
  • python=3.11.6
  • seaborn=0.12.2
  • scipy=1.11.3
  • setuptools=68.2.2
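
To see whether an existing environment matches these pins, a short check with the standard library's importlib.metadata can help. This is a sketch only: PINNED and check_pins are hypothetical names, and only the entries written with "==" in the list above are included.

```python
from importlib import metadata

# Pins copied from the requirements list (the "==" entries only)
PINNED = {
    "matplotlib": "3.8.1",
    "numpy": "2.1.0",
    "pandas": "2.2.3",
    "scikit-learn": "1.3.2",
}


def check_pins(pins):
    """Map each package to (expected, installed); installed is None if absent."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


for name, (expected, installed) in check_pins(PINNED).items():
    status = "ok" if installed == expected else "mismatch"
    print(f"{name}: expected {expected}, found {installed} ({status})")
```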

Conda Environment Setup

Conda users can create an environment with all required dependencies from the provided csdt.yml file:

  1. Download or copy the csdt.yml file.
  2. Create the environment:
    conda env create -f csdt.yml

  3. Activate the environment:
    conda activate csdt


License

This project is licensed under the MIT License. See the LICENSE file for details.


Contributing

Contributions are welcome! If you'd like to contribute, follow these steps:

  1. Fork the repository.
  2. Create a new branch for your feature (git checkout -b feature-name).
  3. Commit your changes (git commit -m 'Add new feature').
  4. Push to the branch (git push origin feature-name).
  5. Open a Pull Request.

Acknowledgments

Special thanks to all contributors and the open-source community for their support in making this project possible.
