Implementation of Reconstruction-based Anomaly Detection with Completely Random Forest
This is the implementation of RecForest, the anomaly detection method proposed in the paper “Reconstruction-based Anomaly Detection with Completely Random Forest” (SIAM International Conference on Data Mining (SDM), 2021). It is highly optimized and provides scikit-learn-like APIs.
Installation
RecForest is available on PyPI:
$ pip install recforest
Build from Source
Alternatively, you can build and install the package from source:
$ git clone https://github.com/xuyxu/RecForest.git
$ cd RecForest
$ python setup.py install
Note that a C compiler is required to compile the .pyx files (e.g., GCC on Linux or MSVC on Windows). Please refer to the Cython installation instructions for details.
Example
The code snippet below shows a minimal example of how to use RecForest for anomaly detection. Scripts for reproducing the experimental results in the paper are available in the examples directory.
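from recforest import RecForest

# X_train and X_test are NumPy arrays of shape (n_samples, n_features)
model = RecForest()
model.fit(X_train)
# predict returns the anomaly scores for X_test
y_pred = model.predict(X_test)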
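The minimal example assumes that X_train and X_test already exist. As a rough sketch for a first trial run, the toy arrays below have the required shape (n_samples, n_features). The helper name make_toy_data, the Gaussian data, and the shift of 8.0 used to inject outliers are illustrative assumptions, not part of the package.

```python
import numpy as np


def make_toy_data(n_train=500, n_test=100, n_features=4, n_outliers=10, seed=0):
    """Hypothetical helper: build toy arrays of shape (n_samples, n_features)."""
    rng = np.random.RandomState(seed)
    # Training data: samples from a standard normal distribution.
    X_train = rng.normal(size=(n_train, n_features))
    # Test data: mostly samples from the same distribution ...
    X_test = rng.normal(size=(n_test, n_features))
    # ... with the last n_outliers rows shifted far away from the bulk.
    X_test[-n_outliers:] += 8.0
    # Ground-truth labels for the test set (1 = injected outlier), for evaluation only.
    y_test = np.zeros(n_test, dtype=int)
    y_test[-n_outliers:] = 1
    return X_train, X_test, y_test


X_train, X_test, y_test = make_toy_data()
print(X_train.shape, X_test.shape, int(y_test.sum()))

# With the package installed, these arrays can be passed to the model:
# model = RecForest(); model.fit(X_train); y_pred = model.predict(X_test)
```

The labels in y_test are never passed to the model; they are only useful for checking afterwards whether the injected rows received distinct scores.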
Documentation
RecForest has only two hyper-parameters: n_estimators and max_depth. The constructor additionally accepts n_jobs and random_state, which control parallelism and reproducibility. All input parameters are described below.
n_estimators: Specify the number of decision trees in RecForest;
max_depth: Specify the maximum depth of decision trees in RecForest;
n_jobs: Specify the number of workers for joblib parallelization. -1 means using all processors;
random_state: Specify the random state for reproducibility.
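To make the meaning of negative n_jobs values concrete, here is a small sketch of the convention joblib uses to turn n_jobs into a worker count. The helper resolve_n_jobs is hypothetical and written for illustration; it assumes RecForest passes n_jobs to joblib unchanged, which the text above suggests but does not spell out.

```python
import os


def resolve_n_jobs(n_jobs, cpu_count=None):
    """Hypothetical helper: map an n_jobs value to a worker count (joblib convention)."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    if n_jobs is None:
        # No value given: run sequentially with a single worker.
        return 1
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # -1 -> all processors, -2 -> all but one, and so on; never below 1.
        return max(cpu_count + 1 + n_jobs, 1)
    return n_jobs


# On a machine with 8 processors:
print(resolve_n_jobs(-1, cpu_count=8))  # 8
print(resolve_n_jobs(-2, cpu_count=8))  # 7
print(resolve_n_jobs(4, cpu_count=8))   # 4
```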
RecForest has three public methods, described below. Note that all methods expect the input X to be a NumPy array of shape (n_samples, n_features).
fit(X): Fit a RecForest using the input data X;
apply(X): Return the leaf node ID of input data X in each decision tree;
predict(X): Return the anomaly score on the input data X.
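Two small helpers can sit around these methods in practice: one that checks the input format before calling fit or predict, and one that turns anomaly scores into binary labels. The sketch below is not part of the package. The names check_2d and flag_top_fraction are hypothetical, and the thresholding rule assumes that a higher score means a more anomalous sample and that the expected fraction of anomalies is known.

```python
import numpy as np


def check_2d(X):
    """Hypothetical helper: return X as a float array of shape (n_samples, n_features)."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError(
            "expected shape (n_samples, n_features), got %d dimension(s); "
            "use X.reshape(1, -1) for a single sample" % X.ndim
        )
    return X


def flag_top_fraction(scores, fraction=0.1):
    """Hypothetical helper: label the highest-scoring fraction of samples as 1, the rest 0.

    Assumes that a higher score means a more anomalous sample.
    """
    scores = np.asarray(scores, dtype=float)
    # Scores strictly above the (1 - fraction) percentile are flagged.
    threshold = np.percentile(scores, 100.0 * (1.0 - fraction))
    return (scores > threshold).astype(int)


X = check_2d([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
print(X.shape)  # (3, 2)

# Stand-in scores; in practice these would come from model.predict(X_test).
scores = np.array([0.10, 0.20, 0.15, 0.90, 0.12, 0.11, 0.13, 0.14, 0.16, 0.18])
labels = flag_top_fraction(scores, fraction=0.1)
print(labels)
```

If ground-truth labels are available, a threshold-free metric such as ROC AUC avoids having to choose a fraction at all.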
Package Dependencies
numpy >= 1.13.3
scipy >= 0.19.1
joblib >= 0.12
cython >= 0.28.5
scikit-learn >= 0.22
A Python environment installed with conda is highly recommended. In this case, there is no need to install any of the packages listed above separately.
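For environments not managed by conda, the minimum versions above can be checked with a short script. This is a rough sketch using only the standard library (importlib.metadata, which requires Python 3.8 or later). The helper names are hypothetical, and the version comparison is deliberately crude: it only looks at leading numeric components and ignores pre-release tags.

```python
from importlib import metadata

# Minimum versions copied from the dependency list above.
MIN_VERSIONS = {
    "numpy": "1.13.3",
    "scipy": "0.19.1",
    "joblib": "0.12",
    "cython": "0.28.5",
    "scikit-learn": "0.22",
}


def version_tuple(version):
    """Turn '1.13.3' into (1, 13, 3); stops at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted version strings numerically, padding with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies():
    """Return a {package: status} mapping for every package in MIN_VERSIONS."""
    report = {}
    for name, minimum in MIN_VERSIONS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok (%s)" % installed
        else:
            report[name] = "too old (%s < %s)" % (installed, minimum)
    return report


for package, status in check_dependencies().items():
    print(package, "->", status)
```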