Implementation of Reconstruction-based Anomaly Detection with Completely Random Forest

Project description

This is the implementation of RecForest for anomaly detection, proposed in the paper “Reconstruction-based Anomaly Detection with Completely Random Forest,” SIAM International Conference on Data Mining (SDM), 2021. The implementation is highly optimized and provides a scikit-learn-like API.

Installation

RecForest is available on PyPI:

$ pip install recforest

Build from Source

Alternatively, you can build and install the package from source:

$ git clone https://github.com/xuyxu/RecForest.git
$ cd RecForest
$ python setup.py install

Note that a C compiler is required to compile the pyx files (e.g., GCC on Linux or MSVC on Windows). Please refer to Cython Installation for details.

Example

The code snippet below shows a minimal example of how to use RecForest for anomaly detection. Scripts for reproducing the experiment results in the paper are available in the directory examples.

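from recforest import RecForest

# X_train and X_test: numpy arrays of shape (n_samples, n_features)
model = RecForest()
model.fit(X_train)
y_pred = model.predict(X_test)  # anomaly score for each sample in X_test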

Documentation

RecForest has only two hyper-parameters, n_estimators and max_depth; the remaining parameters, n_jobs and random_state, control parallelization and reproducibility. All input parameters are described below.

  • n_estimators: Specify the number of decision trees in RecForest;

  • max_depth: Specify the maximum depth of decision trees in RecForest;

  • n_jobs: Specify the number of workers for joblib parallelization. -1 means using all processors;

  • random_state: Specify the random state for reproducibility.

RecForest has three public methods, described below. Note that for all methods, the input X must be a NumPy array of shape (n_samples, n_features).

  • fit(X): Fit a RecForest using the input data X;

  • apply(X): Return the leaf node ID of input data X in each decision tree;

  • predict(X): Return the anomaly score on the input data X.
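Because predict(X) returns anomaly scores rather than labels, a common follow-up step is to flag the highest-scoring samples. The sketch below shows one way to do this. It assumes that a larger score means a more anomalous sample, which the descriptions above do not state explicitly. The helper flag_anomalies and its contamination parameter are hypothetical and are not part of RecForest; the scores in the example are made-up values.

```python
import math


def flag_anomalies(scores, contamination=0.1):
    """Return a list of booleans marking the top `contamination` fraction of scores.

    Assumption: larger scores indicate more anomalous samples.
    """
    if not 0.0 < contamination <= 1.0:
        raise ValueError("contamination must be in (0, 1]")
    scores = [float(s) for s in scores]
    if not scores:
        return []
    # Number of samples to flag (at least one).
    n_flag = max(1, math.floor(contamination * len(scores)))
    # The n_flag-th largest score is the cut-off; ties at the cut-off are all flagged.
    threshold = sorted(scores, reverse=True)[n_flag - 1]
    return [s >= threshold for s in scores]


# With a fitted model, the scores would come from: scores = model.predict(X_test)
scores = [0.12, 0.08, 0.95, 0.10, 0.11, 0.09, 0.13, 0.07, 0.10, 0.14]
flags = flag_anomalies(scores, contamination=0.1)
print(flags)
```

A fixed fraction is only one possible rule. If labeled validation data is available, choosing the cut-off on that data is usually preferable.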

Package Dependencies

  • numpy >= 1.13.3

  • scipy >= 0.19.1

  • joblib >= 0.12

  • cython >= 0.28.5

  • scikit-learn >= 0.22

A Python environment installed from conda is highly recommended. In this case, there is no need to install any of the packages listed above.
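To check an existing environment against the minimum versions listed above, something like the following sketch may help. It uses only the standard library (importlib.metadata, Python 3.8+). The helper names version_tuple, meets_minimum, and check_dependencies are hypothetical, and the version comparison is deliberately simple: it looks only at the leading numeric components and ignores pre-release suffixes.

```python
from importlib import metadata  # available in Python 3.8+

# Minimum versions as listed in the Package Dependencies section.
MINIMUM_VERSIONS = {
    "numpy": "1.13.3",
    "scipy": "0.19.1",
    "joblib": "0.12",
    "cython": "0.28.5",
    "scikit-learn": "0.22",
}


def version_tuple(version):
    """Convert '1.13.3' to (1, 13, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = version_tuple(installed), version_tuple(minimum)
    # Pad the shorter tuple with zeros so that '0.22' compares equal to '0.22.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Map each package name to 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


for name, status in check_dependencies().items():
    print(f"{name}: {status}")
```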

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release.

Built Distributions

RecForest-0.1.0-cp38-cp38-win_amd64.whl (76.2 kB)

Uploaded CPython 3.8 Windows x86-64

RecForest-0.1.0-cp38-cp38-manylinux2010_x86_64.whl (370.9 kB)

Uploaded CPython 3.8 manylinux: glibc 2.12+ x86-64

RecForest-0.1.0-cp38-cp38-manylinux1_x86_64.whl (370.9 kB)

Uploaded CPython 3.8

RecForest-0.1.0-cp37-cp37m-win_amd64.whl (75.1 kB)

Uploaded CPython 3.7m Windows x86-64

RecForest-0.1.0-cp37-cp37m-manylinux2010_x86_64.whl (335.6 kB)

Uploaded CPython 3.7m manylinux: glibc 2.12+ x86-64

RecForest-0.1.0-cp37-cp37m-manylinux1_x86_64.whl (335.6 kB)

Uploaded CPython 3.7m

RecForest-0.1.0-cp36-cp36m-win_amd64.whl (75.1 kB)

Uploaded CPython 3.6m Windows x86-64

RecForest-0.1.0-cp36-cp36m-manylinux2010_x86_64.whl (335.5 kB)

Uploaded CPython 3.6m manylinux: glibc 2.12+ x86-64

RecForest-0.1.0-cp36-cp36m-manylinux1_x86_64.whl (335.5 kB)

Uploaded CPython 3.6m
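The wheel file names above follow the standard wheel naming convention, name-version-pythontag-abitag-platformtag.whl, with an optional build tag after the version. The sketch below splits a file name into these fields, which can help when deciding which file matches a given interpreter and platform. The WheelTags type and the parse_wheel_filename helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel file name into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelTags(*parts)


for wheel in (
    "RecForest-0.1.0-cp38-cp38-win_amd64.whl",
    "RecForest-0.1.0-cp37-cp37m-manylinux2010_x86_64.whl",
    "RecForest-0.1.0-cp36-cp36m-manylinux1_x86_64.whl",
):
    tags = parse_wheel_filename(wheel)
    print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```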

File details

Details for the file RecForest-0.1.0-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: RecForest-0.1.0-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 76.2 kB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for RecForest-0.1.0-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 8324ff1188c4871f3d5877a638bfac10a79fef5ce7fdb668d553807b42521e9a
MD5 502b26d2ef6c38a42ad1e83fb7698e92
BLAKE2b-256 9821bc4faa118ea67a45897d9551087ea70152850ec372bb35217a2a89ed2f17

See more details on using hashes here.
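A downloaded wheel can be checked against the SHA256 digest listed for it. The sketch below uses only the standard library module hashlib. The helper names sha256_of_file and verify_download are hypothetical, and the example assumes that the wheel was saved to the current directory under its original file name; the check is skipped if the file is not there.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Return True if the file's SHA256 digest matches the expected value."""
    return sha256_of_file(path) == expected_sha256.strip().lower()


# File name and digest as listed above for the CPython 3.8 Windows wheel.
WHEEL = "RecForest-0.1.0-cp38-cp38-win_amd64.whl"
EXPECTED = "8324ff1188c4871f3d5877a638bfac10a79fef5ce7fdb668d553807b42521e9a"

if os.path.exists(WHEEL):
    print("hash matches:", verify_download(WHEEL, EXPECTED))
else:
    print(f"{WHEEL} not found in the current directory; nothing to verify")
```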

File details

Details for the file RecForest-0.1.0-cp38-cp38-manylinux2010_x86_64.whl.

File metadata

  • Download URL: RecForest-0.1.0-cp38-cp38-manylinux2010_x86_64.whl
  • Upload date:
  • Size: 370.9 kB
  • Tags: CPython 3.8, manylinux: glibc 2.12+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for RecForest-0.1.0-cp38-cp38-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 19416788bd2cc6ec73b65dd74a5bb23138606b8ad8c512e71bcdaffe6286658b
MD5 bcbd849caeabb316068cfe839c71c76c
BLAKE2b-256 9b1965c1fefb8024c497f4dbd400d6bb0774fd68bf9474efe9982b6980975f61

File details

Details for the file RecForest-0.1.0-cp38-cp38-manylinux1_x86_64.whl.

File metadata

  • Download URL: RecForest-0.1.0-cp38-cp38-manylinux1_x86_64.whl
  • Upload date:
  • Size: 370.9 kB
  • Tags: CPython 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for RecForest-0.1.0-cp38-cp38-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 a5c3b556982c45903740f239c98126fec4178e0289e02f83c00ff262a4578067
MD5 b3da7f60d6316e635cbc272a9ab59626
BLAKE2b-256 4bfb00f6a0095b4209416323f3fbccc4dcaa47488be76fe558593b94da4aa832

File details

Details for the file RecForest-0.1.0-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: RecForest-0.1.0-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 75.1 kB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for RecForest-0.1.0-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 ba9e55319e54a242527b345af7deef1d98ace4087a9dd7b8e4c44fac6cc7b1c4
MD5 4621dc32263deba48002a07fbb978f11
BLAKE2b-256 65a4d0174f1b6e4cdb3f4bef9d3851469198569874766f785f2de7a62dafda9b

File details

Details for the file RecForest-0.1.0-cp37-cp37m-manylinux2010_x86_64.whl.

File metadata

  • Download URL: RecForest-0.1.0-cp37-cp37m-manylinux2010_x86_64.whl
  • Upload date:
  • Size: 335.6 kB
  • Tags: CPython 3.7m, manylinux: glibc 2.12+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for RecForest-0.1.0-cp37-cp37m-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 42d6243541b98f3ec4c2e93680689b606b2144853c0b30731b2f1ac74568b306
MD5 51e5a96d6cc1226c6b873481e7486a9a
BLAKE2b-256 1b4ff87c621a7a8269569e33b361096ddf1f47d5d918d9a5ffc918cf0d4a594b

File details

Details for the file RecForest-0.1.0-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

  • Download URL: RecForest-0.1.0-cp37-cp37m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 335.6 kB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for RecForest-0.1.0-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 4beec717cdfdd9483d4da506f910e3582376c654c9bc345aa538969c600e22e1
MD5 666d1fd9f3978af29e3fe6836fbf079e
BLAKE2b-256 9d0fdb37102805f764fbd3ff06c4989f2386c2b9ae39467ff5176dd370bcaefb

File details

Details for the file RecForest-0.1.0-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: RecForest-0.1.0-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 75.1 kB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for RecForest-0.1.0-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 865462add2483722dc3b281a2b0473ceefed443589f9e9a774b9738c77627a81
MD5 363b4b46e1e77da37b45a60191bc921e
BLAKE2b-256 33bdd1431eff315e61e7039131b605575510b94bf735688136d3cb0e71738338

File details

Details for the file RecForest-0.1.0-cp36-cp36m-manylinux2010_x86_64.whl.

File metadata

  • Download URL: RecForest-0.1.0-cp36-cp36m-manylinux2010_x86_64.whl
  • Upload date:
  • Size: 335.5 kB
  • Tags: CPython 3.6m, manylinux: glibc 2.12+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for RecForest-0.1.0-cp36-cp36m-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 162570715e4fc1ccda5ebc97cfb8502d1405b873cf5b85448739c886999f1cbe
MD5 a7c4835ec588806c72ab23e80267e838
BLAKE2b-256 24c29fc728bd5fba1460082a4b8886fa3973b22c21362b490e995bddb705fe60

File details

Details for the file RecForest-0.1.0-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

  • Download URL: RecForest-0.1.0-cp36-cp36m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 335.5 kB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.1.post20201107 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for RecForest-0.1.0-cp36-cp36m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 1725d3291f081678ad8649aff0e88b62f880d10065d2ddc6656a33b9b458a22d
MD5 944e8aa968eb17db5d84ae3979cb52e3
BLAKE2b-256 ff0c6ffc6b8456a433e17c3bed01d7217d277bcc4f3aac2f789e9570e976a5de
