Python bindings for C++ ranger random forests
skranger provides scikit-learn compatible Python bindings to the C++ random forest implementation, ranger, using Cython.
The latest release of skranger uses version 0.12.1 of ranger.
Installation
skranger is available on PyPI and can be installed via pip:
pip install skranger
Usage
skranger provides two scikit-learn compatible classes, RangerForestClassifier and RangerForestRegressor. It also provides the RangerForestSurvival class, which aims to be compatible with the scikit-survival API.
RangerForestClassifier
The RangerForestClassifier predictor uses ranger’s ForestProbability class to enable both predict and predict_proba methods.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from skranger.ensemble import RangerForestClassifier
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)
rfc = RangerForestClassifier()
rfc.fit(X_train, y_train)
predictions = rfc.predict(X_test)
print(predictions)
# [1 2 0 0 0 0 1 2 1 1 2 2 2 1 1 0 1 1 0 1 1 1 0 2 1 0 0 1 2 2 0 1 2 2 0 2 0 0]
probabilities = rfc.predict_proba(X_test)
print(probabilities)
# [[0.01333333 0.98666667 0. ]
# [0. 0. 1. ]
# ...
# [0.98746032 0.01253968 0. ]
# [0.99 0.01 0. ]]
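The two outputs above can be related to each other. The sketch below uses only the rows printed above (no skranger import) and picks the most probable class in each row. It assumes that predict returns the class with the highest predicted probability, which matches the printed results but is not stated by the library here; the helper name most_probable_class is made up for illustration. The column index equals the class label only because the iris labels are 0, 1 and 2.

```python
# Rows copied from the predict_proba output printed above (classes 0, 1, 2).
probabilities = [
    [0.01333333, 0.98666667, 0.0],
    [0.0, 0.0, 1.0],
    [0.98746032, 0.01253968, 0.0],
    [0.99, 0.01, 0.0],
]


def most_probable_class(row):
    """Hypothetical helper: index of the largest probability in one row."""
    return max(range(len(row)), key=row.__getitem__)


# Assumption: predict corresponds to the arg-max of predict_proba.
labels = [most_probable_class(row) for row in probabilities]
# Each row is a probability distribution over the classes.
row_sums = [sum(row) for row in probabilities]

print(labels)  # [1, 2, 0, 0]
```

These labels agree with the first two and last two entries of the predictions printed earlier.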
RangerForestRegressor
The RangerForestRegressor predictor uses ranger’s ForestRegression class. It also supports quantile regression using the predict_quantiles method.
# Note: load_boston was removed in scikit-learn 1.2, so this example needs an earlier release.
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from skranger.ensemble import RangerForestRegressor
X, y = load_boston(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)
rfr = RangerForestRegressor()
rfr.fit(X_train, y_train)
predictions = rfr.predict(X_test)
print(predictions)
# [18.39205325 21.41698333 14.29509221 35.34981667 27.64378333 20.98569135
# 21.15996673 14.0288093 9.44657947 29.99185 19.3774 11.88189465
# ...
# 11.08502822 36.80993636 18.29633154 12.90448354 20.94311667 11.45154934
# 41.44466667]
# enable quantile regression on instantiation
rfr = RangerForestRegressor(quantiles=True)
rfr.fit(X_train, y_train)
quantile_lower = rfr.predict_quantiles(X_test, quantiles=[0.1])
print(quantile_lower)
# [12.9 17. 8. 28. 22. 10.9 7. 8. 5. 20.8 16.9 7. 8. 18.
# 22. 19. 29. 21. 19. 19. 22. 10.9 20. 16. 14. 20. 9.8 22.9
# ...
# 16. 17. 12. 20. 13. 26. 19. 21.9 7. 14.9 13. 8. 17.9 7.9
# 29. ]
quantile_upper = rfr.predict_quantiles(X_test, quantiles=[0.9])
print(quantile_upper)
# [23. 27. 21. 44. 32.1 50. 50. 18.2 12. 43. 22. 17. 17. 24.
# 31.1 25. 37. 28. 23. 24. 28. 18. 28. 23. 23. 26. 17.1 43.
# ...
# 22. 24. 20. 28. 18. 44.2 24. 33.4 15.1 50. 21. 17. 25. 13.
# 50. ]
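The 0.1 and 0.9 quantiles can be paired into an interval around each point prediction. The sketch below uses only the first four values printed above (no skranger import). It assumes the three outputs are row-aligned because they were all computed on the same X_test; reading the pair as a roughly 80% interval is an interpretation, not a guarantee made by the library.

```python
# First four values copied from the outputs printed above.
point = [18.39205325, 21.41698333, 14.29509221, 35.34981667]  # rfr.predict
lower = [12.9, 17.0, 8.0, 28.0]  # predict_quantiles, quantiles=[0.1]
upper = [23.0, 27.0, 21.0, 44.0]  # predict_quantiles, quantiles=[0.9]

# Pair the lower and upper quantiles per test sample.
intervals = list(zip(lower, upper))
# Interval width per sample, rounded to the precision of the printed values.
widths = [round(hi - lo, 1) for lo, hi in intervals]
# Check whether each point prediction falls inside its interval.
inside = [lo <= p <= hi for p, (lo, hi) in zip(point, intervals)]

print(widths)  # [10.1, 10.0, 13.0, 16.0]
print(inside)  # [True, True, True, True]
```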
RangerForestSurvival
The RangerForestSurvival predictor uses ranger’s ForestSurvival class, and has an interface similar to the RandomSurvivalForest found in the scikit-survival package.
from sksurv.datasets import load_veterans_lung_cancer
from sklearn.model_selection import train_test_split
from skranger.ensemble import RangerForestSurvival
X, y = load_veterans_lung_cancer()
# select the numeric columns as features
X = X[["Age_in_years", "Karnofsky_score", "Months_from_Diagnosis"]]
X_train, X_test, y_train, y_test = train_test_split(X, y)
rfs = RangerForestSurvival()
rfs.fit(X_train, y_train)
predictions = rfs.predict(X_test)
print(predictions)
# [107.99634921 47.41235714 88.39933333 91.23566667 61.82104762
# 61.15052381 90.29888492 47.88706349 21.25111508 85.5768254
# ...
# 56.85498016 53.98227381 48.88464683 95.58649206 48.9142619
# 57.68516667 71.96549206 101.79123016 58.95402381 98.36299206]
chf = rfs.predict_cumulative_hazard_function(X_test)
print(chf)
# [[0.04233333 0.0605 0.24305556 ... 1.6216627 1.6216627 1.6216627 ]
# [0.00583333 0.00583333 0.00583333 ... 1.55410714 1.56410714 1.58410714]
# ...
# [0.12933333 0.14766667 0.14766667 ... 1.64342857 1.64342857 1.65342857]
# [0.00983333 0.0112619 0.04815079 ... 1.79304365 1.79304365 1.79304365]]
survival = rfs.predict_survival_function(X_test)
print(survival)
# [[0.95855021 0.94129377 0.78422794 ... 0.19756993 0.19756993 0.19756993]
# [0.99418365 0.99418365 0.99418365 ... 0.21137803 0.20927478 0.20513086]
# ...
# [0.87868102 0.86271864 0.86271864 ... 0.19331611 0.19331611 0.19139258]
# [0.99021486 0.98880127 0.95299007 ... 0.16645277 0.16645277 0.16645277]]
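The survival and cumulative hazard outputs above are consistent with the standard relationship S(t) = exp(-H(t)). The sketch below checks this on the first row of each printed array, using only the values shown (no skranger import). Whether skranger derives the survival function this way internally is an assumption; the printed numbers merely agree with it.

```python
import math

# Visible entries of the first row of the cumulative hazard output above
# (first three columns and the last column; the elided columns are skipped).
chf_row = [0.04233333, 0.0605, 0.24305556, 1.6216627]
# Matching entries from the first row of the survival output above.
survival_row = [0.95855021, 0.94129377, 0.78422794, 0.19756993]

# Standard relationship between survival and cumulative hazard.
derived = [math.exp(-h) for h in chf_row]
# Largest absolute difference between derived and printed survival values.
max_gap = max(abs(d - s) for d, s in zip(derived, survival_row))

print(max_gap < 1e-6)  # True
```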
License
skranger is licensed under GPLv3.
Development
To develop locally, it is recommended to have asdf, make and a C++ compiler already installed. After cloning, run make setup. This will set up the ranger submodule, install Python and Poetry from .tool-versions, install dependencies using Poetry, copy the ranger source code into skranger, and then build and install skranger in the local virtualenv.
To format code, run make fmt. This will run isort and black against the .py files.
To run tests and inspect coverage, run make test.
To rebuild in place after making changes, run make build.
To create python package artifacts, run make dist.
To build and view documentation, run make docs.