Modern decision trees in Python
scikit-tree
scikit-tree is a scikit-learn compatible API for building state-of-the-art decision trees. These include unsupervised trees, oblique trees, uncertainty trees, quantile trees and causal trees.
Tree models have withstood the test of time and are widely used in modern data science and machine learning applications. They perform especially well when samples are limited, and they are flexible learners that can be applied to a wide variety of settings, such as tabular data, images, time series, genomics, EEG data and more.
We welcome contributions for modern tree-based algorithms. We use Cython to achieve fast C/C++ speeds, while abiding by a scikit-learn compatible (tested) API. Moreover, our Cython internals are easily extensible because they follow the internal Cython API of scikit-learn as well.
Submodule dependency on a fork of scikit-learn
Due to the current state of scikit-learn's internal Cython code for trees, we have to instead leverage a maintained fork of scikit-learn at https://github.com/neurodata/scikit-learn, where specifically, the fork branch is used to build and install this repo. We keep that fork well maintained and up to date with respect to the main sklearn repo. The only difference is the refactoring of the tree/ submodule. This fork is used internally under the namespace sktree._lib.sklearn. It is necessary to use this fork for anything related to:
- RandomForest*
- ExtraTrees*
- or any importable items from the tree/ submodule, whether it is a Cython or Python object
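As a rough illustration of the rule above, the sketch below resolves an estimator from the vendored namespace instead of from an upstream scikit-learn install. The helper name load_from_fork is hypothetical, and the ensemble submodule path is an assumption based on scikit-learn's usual layout; only the sktree._lib.sklearn namespace itself comes from the text above.

```python
import importlib

# Namespace of the vendored scikit-learn fork, as stated above.
FORK_NAMESPACE = "sktree._lib.sklearn"


def load_from_fork(submodule, name):
    """Return `name` from the vendored fork, or None if it cannot be imported.

    Hypothetical helper for illustration; it is not part of scikit-tree.
    """
    try:
        module = importlib.import_module(f"{FORK_NAMESPACE}.{submodule}")
    except ImportError:
        # scikit-tree (or the requested submodule) is not available here.
        return None
    return getattr(module, name, None)


# Assumed layout: the fork mirrors scikit-learn, so forests live in `ensemble`.
RandomForestClassifier = load_from_fork("ensemble", "RandomForestClassifier")
if RandomForestClassifier is None:
    print("vendored fork not importable in this environment")
else:
    print("loaded from", RandomForestClassifier.__module__)
```

Returning None instead of falling back to upstream scikit-learn is deliberate in this sketch, since the text above says the fork is required for these objects.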
If you are developing for scikit-tree, note that we always depend on the most up-to-date commit of https://github.com/neurodata/scikit-learn/submodulev2 as a submodule within scikit-tree. This branch is regularly updated to track upstream changes in the scikit-learn tree submodule, so that our fork stays consistent with upstream and benefits from its bug fixes and improvements.
Documentation
See here for the documentation for our dev version: https://docs.neurodata.io/scikit-tree/dev/index.html
Why oblique trees and why trees beyond those in scikit-learn?
In 2001, Leo Breiman proposed two types of Random Forests. One was known as Forest-RI, which is the traditional axis-aligned random forest. The other was known as Forest-RC, the random forest based on random oblique linear combinations, which leveraged random combinations of features to perform splits. MORF builds upon Forest-RC by proposing additional functions to combine features. Other modern tree variants, such as Canonical Correlation Forests (CCF) or unsupervised random forests, are also important for solving real-world problems using robust decision tree models.
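To make the difference between the two split types concrete, here is a minimal pure-Python sketch, not scikit-tree's implementation. It compares an axis-aligned split (Forest-RI style) with a split on a random linear combination of features (Forest-RC style) on toy data with a diagonal class boundary. All function names are hypothetical, Gini impurity is chosen only for illustration, and drawing the weights uniformly from [-1, 1] over all features is an assumption of this sketch.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Best split of 1-D `values`: returns (weighted Gini, threshold)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (gini(labels), None)  # baseline: no split at all
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot place a threshold between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[0]:
            best = (score, (lo + hi) / 2.0)
    return best


def project(X, w):
    """Linear combination of the features of each row with weights `w`."""
    return [sum(wi * xi for wi, xi in zip(w, row)) for row in X]


def axis_aligned_split(X, y):
    """Forest-RI style: threshold a single feature."""
    candidates = [best_threshold([row[j] for row in X], y) for j in range(len(X[0]))]
    return min(candidates, key=lambda c: c[0])


def oblique_split(X, y, n_candidates=20, seed=0):
    """Forest-RC style: threshold random linear combinations of features."""
    rng = random.Random(seed)
    best = (float("inf"), None, None)
    for _ in range(n_candidates):
        w = [rng.uniform(-1.0, 1.0) for _ in X[0]]
        score, thr = best_threshold(project(X, w), y)
        if score < best[0]:
            best = (score, thr, w)
    return best


# Toy data whose class boundary is the diagonal x0 + x1 = 0.
grid = [-2, -1, 1, 2]
X = [(a, b) for a in grid for b in grid]
y = [1 if a + b > 0 else 0 for a, b in X]

print("axis-aligned impurity:", axis_aligned_split(X, y)[0])
print("random oblique impurity:", oblique_split(X, y)[0])
print("oblique with w=(1, 1):", best_threshold(project(X, (1, 1)), y)[0])
```

On this data no single-feature threshold separates the classes, while the combination x0 + x1 does so with one split, which is the motivation for oblique trees.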
Installation
Our installation follows the scikit-learn installation as closely as possible, because we ship Cython code that subclasses, or is inspired by, the scikit-learn tree submodule.
As of now, scikit-tree is at a development stage, and installation is still finicky because upstream scikit-learn's PRs refactoring the tree submodule have stalled. Once those are merged, the installation will be simpler. The currently recommended approach is a local build with Meson.
Dependencies
We minimally require:
* Python (>=3.8)
* numpy
* scipy
* scikit-learn >= 1.3
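A quick way to check an environment against these minimums is sketched below using only the standard library. The helper names are hypothetical, and the version parsing is deliberately simple (it reads only the leading numeric components), so treat it as a convenience check, not a replacement for pip's own resolver.

```python
import sys
from importlib import metadata

# Minimum versions taken from the list above; numpy and scipy have no stated minimum.
MINIMUMS = {"scikit-learn": (1, 3)}
REQUIRED = ["numpy", "scipy", "scikit-learn"]


def version_tuple(text):
    """Parse the leading numeric components, e.g. '1.3.2' -> (1, 3, 2)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements():
    """Return a list of human-readable problems; empty if everything looks fine."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python >= 3.8 is required")
    for name in REQUIRED:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        minimum = MINIMUMS.get(name)
        if minimum and version_tuple(installed) < minimum:
            wanted = ".".join(map(str, minimum))
            problems.append(f"{name} {installed} is older than {wanted}")
    return problems


for problem in check_requirements():
    print("problem:", problem)
```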
Installation with Pip
pip install scikit-tree
Building locally with Meson (RECOMMENDED)
Make sure you have the necessary packages installed
# install build dependencies
pip install numpy scipy meson ninja meson-python Cython scikit-learn scikit-learn-tree
# you may need these optional dependencies to build scikit-learn locally
conda install -c conda-forge joblib threadpoolctl pytest compilers llvm-openmp
We use the spin CLI to abstract away build details:
# run the build using Meson/Ninja
./spin build
# you can run the following command to see what other options there are
./spin --help
./spin build --help
# For example, you might want to start from a clean build
./spin build --clean
# or build in parallel for faster builds
./spin build -j 2
# you will need to double check the build-install has the proper path
# this might be different from machine to machine
export PYTHONPATH=${PWD}/build-install/usr/lib/python3.9/site-packages
# run specific unit tests
./spin test -- sktree/tree/tests/test_tree.py
# you can bring up the CLI menu
./spin --help
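Because the PYTHONPATH above hard-codes python3.9, it has to be adjusted for other interpreters. The sketch below derives the directory name from the running interpreter. The function name is hypothetical, and the build-install/usr/lib/pythonX.Y/site-packages layout is simply copied from the export line above; it may differ on your machine, so the script only suggests a path when the directory actually exists.

```python
import sys
from pathlib import Path


def build_install_site_packages(root=".", prefix="build-install/usr"):
    """Guess the site-packages directory under the local install prefix.

    The layout is an assumption copied from the export line above;
    other platforms may place site-packages elsewhere.
    """
    version = f"python{sys.version_info.major}.{sys.version_info.minor}"
    return Path(root) / prefix / "lib" / version / "site-packages"


candidate = build_install_site_packages()
if candidate.is_dir():
    print(f"export PYTHONPATH={candidate.resolve()}")
else:
    print(f"{candidate} not found; inspect build-install/ for the actual path")
```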
You can also do the same thing using Meson/Ninja itself. Run the following to build the local files:
# configure the build directory and generate the Ninja build files
meson setup build --prefix=$PWD/build
# compile
ninja -C build
# install scikit-tree package
meson install -C build
export PYTHONPATH=${PWD}/build/lib/python3.9/site-packages
# to check installation, you need to be in a different directory
cd docs;
python -c "from sktree import tree"
python -c "import sklearn; print(sklearn.__version__);"
Alternatively, you can use an editable install:
pip install --no-build-isolation --editable .