Random forest estimator
This is an experimental fork of Rforestry. For the package repository, see https://github.com/forestry-labs/Rforestry
Rforestry: Random Forests, Linear Trees, and Gradient Boosting for Inference and Interpretability
Sören Künzel, Theo Saarinen, Simon Walter, Sam Antonyan, Edward Liu, Allen Tang, Jasjeet Sekhon
Introduction
Rforestry is a fast implementation of Honest Random Forests, Gradient Boosting, and Linear Random Forests, with an emphasis on inference and interpretability.
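As a rough illustration of what "honest" usually means in the honest-forest literature: the observations that decide where a tree splits are kept separate from the observations that produce the leaf predictions. The sketch below is a single-feature, single-split toy in plain Python. The function `fit_honest_stump` and its arguments are hypothetical names for this illustration, and nothing here is taken from the Rforestry implementation, which this README does not describe.

```python
from statistics import mean


def fit_honest_stump(x, y, split_idx, avg_idx):
    """Fit a one-split regression stump on one feature, honestly.

    split_idx: indices used only to choose the threshold (splitting set).
    avg_idx:   indices used only to compute leaf means (averaging set).
    Returns (threshold, left_mean, right_mean); threshold is None when
    the splitting set admits no valid split.
    """

    def sse(values):
        # Sum of squared deviations from the mean of `values`.
        m = mean(values)
        return sum((v - m) ** 2 for v in values)

    # Step 1: choose the threshold using the splitting set only.
    best_threshold, best_loss = None, float("inf")
    for t in sorted({x[i] for i in split_idx}):
        left = [y[i] for i in split_idx if x[i] <= t]
        right = [y[i] for i in split_idx if x[i] > t]
        if not left or not right:
            continue  # a split must leave observations on both sides
        loss = sse(left) + sse(right)
        if loss < best_loss:
            best_threshold, best_loss = t, loss

    # Step 2: estimate leaf values using the averaging set only.
    overall = mean(y[i] for i in avg_idx)
    if best_threshold is None:
        return None, overall, overall
    left = [y[i] for i in avg_idx if x[i] <= best_threshold]
    right = [y[i] for i in avg_idx if x[i] > best_threshold]
    # Fall back to the overall mean if a leaf received no averaging data.
    left_mean = mean(left) if left else overall
    right_mean = mean(right) if right else overall
    return best_threshold, left_mean, right_mean


# Toy data: the response jumps from 0 to 10 at x = 10.
x = list(range(20))
y = [0.0] * 10 + [10.0] * 10
split_idx = [i for i in range(20) if i % 2 == 0]  # even rows choose the split
avg_idx = [i for i in range(20) if i % 2 == 1]    # odd rows fill the leaves
threshold, left_mean, right_mean = fit_honest_stump(x, y, split_idx, avg_idx)
```

In this toy, the right-leaf estimate comes out below 10 because the averaging set contains a point (x = 9) that the splitting set never saw. The split is not allowed to tune itself to the same data that produces the predictions, which is the property honesty is meant to provide.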
How to install - R Package
- The GFortran compiler has to be up to date. GFortran binaries can be found here.
- The devtools package has to be installed. You can install it using
install.packages("devtools")
- The package contains compiled code, and you must have a development environment to install the development version. You can use
devtools::has_devel()
to check whether you do. If no development environment exists, Windows users should download and install Rtools, and macOS users should download and install Xcode.
- The latest development version can then be installed using
devtools::install_github("forestry-labs/Rforestry")
- Windows users need to skip 64-bit compilation due to an outstanding gcc issue:
devtools::install_github("forestry-labs/Rforestry", INSTALL_opts = c('--no-multiarch'))
How to install - Python Package
The Python package must be compiled before it can be used. Compiling and linking the C++ version of forestry requires macOS or Linux and an installed C++ compiler. For example, one can run:
mkdir build
cd build
# Configure from the parent directory; this assumes CMakeLists.txt sits one level above build/
cmake ..
make
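Before running the build, it can help to confirm that the platform and tools mentioned above are present. The following is a minimal sketch using only the Python standard library. The helper name `check_build_prerequisites` is hypothetical, and the executable names `cmake`, `make`, and `c++` are assumptions about a typical setup, not requirements stated by the project.

```python
import platform
import shutil


def check_build_prerequisites(tools=("cmake", "make", "c++")):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    # The README states that the C++ build needs macOS (reported as "Darwin") or Linux.
    system = platform.system()
    if system not in ("Darwin", "Linux"):
        problems.append(f"unsupported platform: {system}")
    # Look up each required executable on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"missing tool: {tool}")
    return problems


# Usage: print whatever is missing before attempting the build.
for problem in check_build_prerequisites():
    print(problem)
```

An empty result only shows that the executables were found on PATH. It does not check compiler versions or that the build itself will succeed.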
Python Package Usage
Once the build succeeds, the package can be used from Python:
import pandas as pd
# Module name as given in this README; the wheels listed below are named
# random_forestry, so the import path may differ between versions.
from Rforestry import RandomForest
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the iris data into a DataFrame
data = load_iris()
df = pd.DataFrame(data['data'], columns=data['feature_names'])
df['target'] = data['target']

# Regress sepal length on the remaining columns (including the class label)
X = df.loc[:, df.columns != 'sepal length (cm)']
y = df['sepal length (cm)']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Fit a forest of 500 trees and predict on the held-out third
fr = RandomForest(ntree=500)
print("Fitting the forest")
fr.fit(X_train, y_train)
print("Predicting with the forest")
forest_preds = fr.predict(X_test)
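The example above stops at prediction. One common way to judge the result is the root mean squared error on the held-out rows. The helper below is a small sketch in plain Python; `rmse` is a hypothetical name introduced here and is not part of the package.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences of numbers."""
    y_true = list(y_true)
    y_pred = list(y_pred)
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sum(squared_errors) / len(y_true))


# With the variables from the example above, one would call:
#   rmse(y_test, forest_preds)
```

Because the response here is sepal length in centimetres, the resulting number is also in centimetres, which makes it easy to compare against the spread of `y_test`.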
Plotting the forest
To visualize the trees, first install the dtreeviz Python library.
from dtreeviz.trees import *
from forestry_shadow import ShadowForestryTree
shadow_forestry = ShadowForestryTree(fr, X, y, X.columns.values, 'sepal length (cm)', tree_id=0)
viz = dtreeviz(shadow_forestry,
               scale=3.0,
               target_name='sepal length (cm)',
               feature_names=X.columns.values)
viz.view()
R Package Usage
library(Rforestry)

set.seed(292315)
test_idx <- sample(nrow(iris), 3)
x_train <- iris[-test_idx, -1]
y_train <- iris[-test_idx, 1]
x_test <- iris[test_idx, -1]

rf <- forestry(x = x_train, y = y_train, nthread = 2)
predict(rf, x_test)
Download files
Built Distributions
Hashes for random_forestry-0.10.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | a77ef818eff36b37e3aa37670f14edd8b4de5721817ca8cde40037cb68f53f70
MD5 | 47b762e92e33c2951e5200af3eca3944
BLAKE2b-256 | c32d15a299e79b7463f3987bcccdcc2312cedfcb9968b83f2c75b91b0c40d060
Hashes for random_forestry-0.10.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | 21af63d2b34a5168d116c46f9ae9c09784e25ae32a14023bfc732dd6c06c96eb
MD5 | bf16cb615a2f4cad49262c46e5ecae7a
BLAKE2b-256 | cfc606ddb088592b767a907be02d59d7f313b7f3559456495b1e3965a49830d1
Hashes for random_forestry-0.10.0-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | bedc1aa1cb55ec582accc35bcf9ef1bbedd3e0deaf01d9ddc051972fed7aed63
MD5 | cbb16058911989e0178a70fedfae9149
BLAKE2b-256 | 1f07aeafb55e26fbe0ff72d1d1bb66cdecdf4e85373817aed850c1a107849927
Hashes for random_forestry-0.10.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 68a0121a00c813abb5942e3d824a0e198c90b6278bf43beb3837156480ac0202
MD5 | c643cb7098605cb29c662924e7d14e05
BLAKE2b-256 | cb0bb6d919cc69e5d6018831f57ca5cb51f173bad206992e192968a5dfc5e7c8
Hashes for random_forestry-0.10.0-cp310-cp310-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | 607a221c1527b6b5ad91797fa7bab4151b696dc0bd818664ff4488ffa0aa1dc0
MD5 | 92fdc58f4cae5c8ba2b693f9a4b598b8
BLAKE2b-256 | 651f8705ab265b0e6a2aab50c70eaf080971e6179bcea71bff6de77734b66d06
Hashes for random_forestry-0.10.0-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | bc37bee1ae41c0eb587404c57da61393cb815694b8ad6b6701ec62fcf4e8aa7e
MD5 | 99ec4b595d4129705f3a43611b63920f
BLAKE2b-256 | 6f115a051d1be3cce4f0f9fe4362151ae6cad94b9c1f0764888fc2deeab33f23
Hashes for random_forestry-0.10.0-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | b3d81376ac0c88668e276e04d51d7d589d077b8f88b7623b98ec0c40cc2ccf87
MD5 | 8689b8adf1ea9ad336f11609d761c345
BLAKE2b-256 | fbeb893c71d96b2b6d2f81c7c289263f07c1b1e209f3d6ed3386b79f492f4325
Hashes for random_forestry-0.10.0-cp39-cp39-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | 94ce03790a7976f99504c6a567eb22c2affc30a8f11bc7f5c754a53839a28ca0
MD5 | 156e263c10cba638139427884d971250
BLAKE2b-256 | a9522f88df93ef7a5b44ac2cdf144c426cbada7ad15c0ddc11d2e0105eedef0e
Hashes for random_forestry-0.10.0-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 3c33d4dd3980645c126d62cf427c08e1a18a2015a229d69f75282e3c15b9e741
MD5 | 32a1ad2c6f39bc99a7a6320d281df5b8
BLAKE2b-256 | 46bac59ee0c8fa2b1729b676eafc3ecfcecf3b3f66f089d259d6fbdef0302792
Hashes for random_forestry-0.10.0-cp38-cp38-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 42a98b786b1b75740c339e6f61cec452ad5deed6c800614c5a8b0bc9156a8b18
MD5 | 7e4ab1aa03a750aca214535d417aa692
BLAKE2b-256 | 1c0babf7153fc9592e01485f711974e8cb5f603669b2fca231b807c9da877a85
Hashes for random_forestry-0.10.0-cp38-cp38-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | 1043a2f08cf6c52c07518caf5141728a9872ac1854a5ab52ff04454708c2761d
MD5 | c3e8920a07e6cfc510b10e3552edcfbb
BLAKE2b-256 | 68fa4f70d966087f39af9fa57c8717e5cf9a6b4cbfea501144280ecfe3b3c4e6
Hashes for random_forestry-0.10.0-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 7ff1221bef9bcef84ed05511ae204c9d06f648195fadd88ba08e6fe287702166
MD5 | 29c4773d45c030c84a057f0c064b565c
BLAKE2b-256 | 3fbf74c12efecdb9220ddb8e24cfd596f7a80ef2b6d380fbcd5b95fc6ef8bdf1
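The digests listed above can be checked against a downloaded wheel. The sketch below uses only Python's standard `hashlib` module. The helper names `file_digests` and `matches_published` are hypothetical, and it assumes that "BLAKE2b-256" denotes BLAKE2b with a 32-byte digest.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Compute SHA256, MD5, and BLAKE2b-256 hex digests of a file in one pass."""
    hashers = {
        "SHA256": hashlib.sha256(),
        "MD5": hashlib.md5(),
        # Assumption: BLAKE2b-256 means BLAKE2b with a 32-byte (256-bit) digest.
        "BLAKE2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as fh:
        # Read in chunks so large wheels do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_published(path, algorithm, expected_hex):
    """Return True if the file's digest equals the published hex string."""
    return file_digests(path)[algorithm] == expected_hex.strip().lower()


# Usage, assuming the wheel has been downloaded to the current directory:
#   matches_published(
#       "random_forestry-0.10.0-cp311-cp311-macosx_11_0_arm64.whl",
#       "SHA256",
#       "21af63d2b34a5168d116c46f9ae9c09784e25ae32a14023bfc732dd6c06c96eb",
#   )
```

A SHA256 match is normally the one to rely on. MD5 appears in the tables but is not collision-resistant, so it is only useful for detecting accidental corruption.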