Modern decision trees in Python

scikit-tree

scikit-tree is a scikit-learn compatible API for building state-of-the-art decision trees. These include unsupervised trees, oblique trees, uncertainty trees, quantile trees and causal trees.

Tree models have withstood the test of time and remain widely used in modern data science and machine learning. They perform especially well when samples are limited, and they are flexible learners that apply to a wide variety of settings, including tabular data, images, time series, genomics, EEG data and more.
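
As a rough illustration of what scikit-learn compatibility means in practice, the sketch below fits an oblique forest with the usual fit/score calls. It assumes that `ObliqueRandomForestClassifier` can be imported from the top-level `sktree` package in the installed release and that it accepts `n_estimators` and `random_state` like scikit-learn forests do; check the documentation for the exact estimator names. The helper name `oblique_forest_demo` and the synthetic data are illustrative only.

```python
def oblique_forest_demo(seed=0):
    """Fit an oblique forest on synthetic data.

    Returns the held-out accuracy, or None when scikit-learn or
    scikit-tree (or the assumed class name) is not available.
    """
    try:
        from sklearn.datasets import make_classification
        from sklearn.model_selection import train_test_split
        # Assumption: this estimator is exported at the package top level.
        from sktree import ObliqueRandomForestClassifier
    except ImportError:
        return None

    X, y = make_classification(n_samples=200, n_features=10, random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=seed)

    # Same workflow as any scikit-learn classifier: construct, fit, score.
    clf = ObliqueRandomForestClassifier(n_estimators=50, random_state=seed)
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)


print(oblique_forest_demo())
```

Because the estimator follows the scikit-learn conventions, it should also work inside standard tooling such as pipelines and cross-validation helpers.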

Documentation

See here for the documentation for our dev version: https://docs.neurodata.io/scikit-tree/dev/index.html

Why oblique trees and why trees beyond those in scikit-learn?

In 2001, Leo Breiman proposed two types of Random Forests. The first, Forest-RI, is the traditional axis-aligned random forest. The second, Forest-RC, is the oblique random forest, which splits on random linear combinations of features. MORF (Manifold Oblique Random Forests) [1] builds upon Forest-RC by proposing additional functions to combine features. Other modern tree variants, such as Canonical Correlation Forests (CCF), Extended Isolation Forests, Quantile Forests, and unsupervised random forests, are also important for solving real-world problems with robust decision tree models.
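
The difference between the two split types can be sketched in a few lines of plain Python. This is a toy illustration, not the package's Cython implementation: the helper names and the tiny dataset are made up for the example, and the coefficient sampling (uniform on [-1, 1] over a random subset of features) follows Breiman's description of Forest-RC.

```python
import random


def best_split_accuracy(X, y, weights):
    """Accuracy of the best single threshold on the projection sum(w_j * x_j)."""
    proj = [sum(w * v for w, v in zip(weights, row)) for row in X]
    values = sorted(set(proj))
    # Candidate thresholds are midpoints between consecutive projected values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    # Baseline: no split at all, predict the majority class.
    base = sum(y) / len(y)
    best = max(base, 1 - base)
    for t in thresholds:
        correct = sum((p > t) == label for p, label in zip(proj, y))
        # Either side of the split may hold the positive class.
        best = max(best, correct / len(y), 1 - correct / len(y))
    return best


def sample_forest_rc_weights(rng, n_features, n_combine):
    """Pick n_combine features at random, each with a coefficient uniform on [-1, 1]."""
    weights = [0.0] * n_features
    for j in rng.sample(range(n_features), n_combine):
        weights[j] = rng.uniform(-1.0, 1.0)
    return weights


# Toy data: the class boundary is the diagonal x0 + x1 = 0.
GRID = [-2, -1, 1, 2]
X = [(a, b) for a in GRID for b in GRID if a + b != 0]
y = [a + b > 0 for a, b in X]

rng = random.Random(0)
candidates = [sample_forest_rc_weights(rng, n_features=2, n_combine=2) for _ in range(10)]

print("axis-aligned, feature 0:", best_split_accuracy(X, y, [1, 0]))
print("axis-aligned, feature 1:", best_split_accuracy(X, y, [0, 1]))
print("oblique, x0 + x1:       ", best_split_accuracy(X, y, [1, 1]))
print("best of 10 random combos:", max(best_split_accuracy(X, y, w) for w in candidates))
```

On this toy data no single-feature threshold separates the classes, whereas a threshold on the combination x0 + x1 does, which is the kind of boundary an oblique split can capture in one node.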

Installation

Our installation follows the scikit-learn installation as closely as possible, because this package contains Cython code that subclasses, or is inspired by, the scikit-learn tree submodule.

Dependencies

We minimally require:

* Python (>=3.9)
* numpy
* scipy
* scikit-learn >= 1.3
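
The requirements above can be checked from Python before installing. The following is a minimal sketch using only the standard library; the helper names are illustrative, and the version parsing is deliberately simplistic (it reads only the leading major and minor numbers), so treat it as a convenience check rather than a full version resolver.

```python
import sys
from importlib import metadata
from itertools import takewhile


def version_tuple(text, parts=2):
    """Parse the leading numeric components, e.g. '1.3.2' -> (1, 3)."""
    numbers = []
    for piece in text.split(".")[:parts]:
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < parts:
        numbers.append(0)
    return tuple(numbers)


# Minimums taken from the dependency list; None means "any version".
REQUIRED = {"numpy": None, "scipy": None, "scikit-learn": (1, 3)}


def check_requirements():
    """Return a list of human-readable problems (empty if everything is met)."""
    problems = []
    if sys.version_info < (3, 9):
        problems.append("Python >= 3.9 is required")
    for dist, minimum in REQUIRED.items():
        try:
            installed = metadata.version(dist)
        except metadata.PackageNotFoundError:
            problems.append(f"{dist} is not installed")
            continue
        if minimum is not None and version_tuple(installed) < minimum:
            wanted = ".".join(map(str, minimum))
            problems.append(f"{dist} {installed} is older than {wanted}")
    return problems


for problem in check_requirements():
    print(problem)
```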

Installation with pip (https://pypi.org/project/scikit-tree/)

Installing with pip in a conda environment is the recommended route.

pip install scikit-tree

Building locally with Meson (For developers)

Make sure you have the necessary packages installed:

# install build dependencies
pip install -r build_requirements.txt

# you may need these optional dependencies to build scikit-learn locally
conda install -c conda-forge joblib threadpoolctl pytest compilers llvm-openmp

We use the spin CLI to abstract away build details:

# run the build using Meson/Ninja
./spin build

# you can run the following command to see what other options there are
./spin --help
./spin build --help

# For example, you might want to start from a clean build
./spin build --clean

# or build in parallel for faster builds
./spin build -j 2

# you will need to double check the build-install has the proper path
# this might be different from machine to machine
export PYTHONPATH=${PWD}/build-install/usr/lib/python3.9/site-packages

# run specific unit tests
./spin test -- sktree/tree/tests/test_tree.py

# you can bring up the CLI menu
./spin --help
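
Since the `site-packages` location under `build-install` varies between machines and Python versions, one way to avoid hard-coding `python3.9` is to search for it. The snippet below is a small sketch using only the standard library; `find_site_packages` is a hypothetical helper name, and it assumes the build was installed into a `build-install` directory in the current working directory, as in the commands above.

```python
from pathlib import Path


def find_site_packages(install_root):
    """Return every site-packages directory below install_root, sorted."""
    root = Path(install_root)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("site-packages") if p.is_dir())


# Print a ready-to-paste export line for each candidate that was found.
for path in find_site_packages("build-install"):
    print(f"export PYTHONPATH={path.resolve()}")
```

If more than one line is printed, pick the directory that matches the Python interpreter you are running.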

You can also do the same thing using Meson/Ninja itself. Run the following to build the local files:

# configure the build directory (generates the Ninja build files)
meson setup build --prefix=$PWD/build

# compile
ninja -C build

# install scikit-tree package
meson install -C build

export PYTHONPATH=${PWD}/build/lib/python3.9/site-packages

# to check installation, you need to be in a different directory
cd docs
python -c "from sktree import tree"
python -c "import sklearn; print(sklearn.__version__);"
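
The reason for changing directory is presumably that, from the repository root, Python could resolve `sktree` to the source folder instead of the built package. A small sketch for confirming which copy would be imported is shown below; the helper names are illustrative and use only the standard library.

```python
import importlib.util
from pathlib import Path


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def imported_from_cwd(name):
    """True if the module resolves to a file under the current working directory."""
    origin = module_origin(name)
    # origin can be None (namespace package) or a label such as 'built-in'.
    if origin is None or not Path(origin).exists():
        return False
    return Path.cwd().resolve() in Path(origin).resolve().parents


for name in ("sktree", "sklearn"):
    print(name, "->", module_origin(name), "| from cwd:", imported_from_cwd(name))
```

If `sktree` is reported as coming from the current directory, the check is probably exercising the source tree and not the installed build.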

After building locally, you can use an editable install (warning: this only registers Python changes locally):

pip install --no-build-isolation --editable .

Or, if you have spin v0.8+ installed, you can run directly:

spin install

Development

We welcome contributions for modern tree-based algorithms. We use Cython to achieve fast C/C++ speeds, while abiding by a scikit-learn compatible (tested) API. Moreover, our Cython internals are easily extensible because they follow the internal Cython API of scikit-learn as well.

Due to the current state of scikit-learn's internal Cython code for trees, we instead rely on a fork of scikit-learn at https://github.com/neurodata/scikit-learn when extending scikit-learn's decision tree model API. Specifically, we extend the Python and Cython API of scikit-learn's tree submodule in our own submodule, which lets us introduce the tree models housed in this package. These models extend the functionality of decision-tree-based models in ways that are not yet possible in scikit-learn itself. As one example, we introduce an abstract API that allows users to implement their own oblique splits. Our plan is to benchmark these features and introduce them upstream to scikit-learn where applicable and where the inclusion criteria are met.
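
To give a feel for what an abstract API for user-defined oblique splits can look like, here is a pure-Python analogy. It is not the package's Cython interface: the class and method names (`ProjectionSampler`, `sample`, `project`) are hypothetical, and the contiguous-patch sampler is only one assumed example of an alternative way to combine features.

```python
import random
from abc import ABC, abstractmethod


class ProjectionSampler(ABC):
    """Hypothetical interface: produce one weight vector per candidate split."""

    @abstractmethod
    def sample(self, rng, n_features):
        """Return a list of n_features weights defining the projection."""

    def project(self, weights, row):
        """Value the split threshold is applied to for one sample."""
        return sum(w * v for w, v in zip(weights, row))


class AxisAlignedSampler(ProjectionSampler):
    """One feature with weight 1: the classic axis-aligned split."""

    def sample(self, rng, n_features):
        weights = [0.0] * n_features
        weights[rng.randrange(n_features)] = 1.0
        return weights


class ContiguousPatchSampler(ProjectionSampler):
    """Sum a random run of adjacent features (an illustrative user-defined rule)."""

    def __init__(self, max_width):
        self.max_width = max_width

    def sample(self, rng, n_features):
        width = rng.randint(1, min(self.max_width, n_features))
        start = rng.randrange(n_features - width + 1)
        return [1.0 if start <= j < start + width else 0.0 for j in range(n_features)]


rng = random.Random(0)
row = [0.5, 1.0, -2.0, 3.0, 0.0, 1.5]
for sampler in (AxisAlignedSampler(), ContiguousPatchSampler(max_width=3)):
    weights = sampler.sample(rng, len(row))
    print(type(sampler).__name__, weights, sampler.project(weights, row))
```

The design point is that the tree-building logic only needs the projected value, so new split types can be added by supplying a different way of sampling the weights.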

References

[1]: Li, Adam, et al. "Manifold Oblique Random Forests: Towards Closing the Gap on Convolutional Deep Networks." SIAM Journal on Mathematics of Data Science, 5(1), 77-96, 2023.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

* scikit_tree-0.7.0.tar.gz (16.4 MB): Source

Built Distributions

* scikit_tree-0.7.0-cp312-cp312-win_amd64.whl (5.0 MB): CPython 3.12, Windows x86-64
* scikit_tree-0.7.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.6 MB): CPython 3.12, manylinux: glibc 2.17+ x86-64
* scikit_tree-0.7.0-cp312-cp312-macosx_11_0_arm64.whl (2.1 MB): CPython 3.12, macOS 11.0+ ARM64
* scikit_tree-0.7.0-cp312-cp312-macosx_10_9_x86_64.whl (2.2 MB): CPython 3.12, macOS 10.9+ x86-64
* scikit_tree-0.7.0-cp311-cp311-win_amd64.whl (5.0 MB): CPython 3.11, Windows x86-64
* scikit_tree-0.7.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.7 MB): CPython 3.11, manylinux: glibc 2.17+ x86-64
* scikit_tree-0.7.0-cp311-cp311-macosx_11_0_arm64.whl (2.1 MB): CPython 3.11, macOS 11.0+ ARM64
* scikit_tree-0.7.0-cp311-cp311-macosx_10_9_x86_64.whl (2.2 MB): CPython 3.11, macOS 10.9+ x86-64
* scikit_tree-0.7.0-cp310-cp310-win_amd64.whl (5.0 MB): CPython 3.10, Windows x86-64
* scikit_tree-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.7 MB): CPython 3.10, manylinux: glibc 2.17+ x86-64
* scikit_tree-0.7.0-cp310-cp310-macosx_11_0_arm64.whl (2.1 MB): CPython 3.10, macOS 11.0+ ARM64
* scikit_tree-0.7.0-cp310-cp310-macosx_10_9_x86_64.whl (2.2 MB): CPython 3.10, macOS 10.9+ x86-64
* scikit_tree-0.7.0-cp39-cp39-win_amd64.whl (5.1 MB): CPython 3.9, Windows x86-64
* scikit_tree-0.7.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.7 MB): CPython 3.9, manylinux: glibc 2.17+ x86-64
* scikit_tree-0.7.0-cp39-cp39-macosx_11_0_arm64.whl (2.1 MB): CPython 3.9, macOS 11.0+ ARM64
* scikit_tree-0.7.0-cp39-cp39-macosx_10_9_x86_64.whl (2.2 MB): CPython 3.9, macOS 10.9+ x86-64
