
outlier_detection, detects outliers.


Outlier Data Detection Systems - ODDS

As used in the paper "Simple Models are Effective in Anomaly Detection in Multi-variate Time Series".

This work was funded by the grant "Early detection of contact distress for enhanced performance monitoring and predictive inspection of machines" (EP/S005463/1) from the Engineering and Physical Sciences Research Council (EPSRC), UK.

pip install odds

The work is done by the OD object. Import the 'OD' object as follows:

from odds import OD

Instantiate the object with the 'algo' argument, where a short string represents the algorithm you wish to use. In this case, 'VAR' refers to vector autoregression, a simple linear multidimensional regression algorithm. Other implemented algorithms are listed below.

od = OD('VAR')
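For intuition about what a vector-autoregression detector does, here is a minimal sketch using only NumPy. It is an illustration under assumptions, not the implementation inside odds: it fits a lag-1 model by least squares and scores each sample by the size of its one-step prediction error. The function name `var1_scores`, the lag of 1 and the synthetic data are all hypothetical.

```python
import numpy as np

def var1_scores(X):
    """Score samples by the one-step prediction error of a lag-1 VAR model."""
    X = np.asarray(X, dtype=float)
    past, future = X[:-1], X[1:]
    # Add an intercept column, then solve future ~ [past, 1] @ coef by least squares.
    design = np.hstack([past, np.ones((len(past), 1))])
    coef, *_ = np.linalg.lstsq(design, future, rcond=None)
    residual = future - design @ coef
    scores = np.zeros(len(X))
    # The first sample has no predecessor, so its score stays at 0.
    scores[1:] = np.linalg.norm(residual, axis=1)
    return scores

# Synthetic two-feature series with a single injected spike at sample 150.
rng = np.random.default_rng(0)
t = np.linspace(0, 20, 300)
X = np.column_stack([np.sin(t), np.cos(t)]) + rng.normal(scale=0.05, size=(300, 2))
X[150] += 5.0

scores = var1_scores(X)
print(int(np.argmax(scores)))
```

The spike is hard to predict from its predecessor, and it also makes the following sample hard to predict, so the largest score lands on or right after the injected sample.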

To use the object, call its 'get_os()' method with 'X' as the argument, where X is a data matrix of n samples by p features. Many of the algorithms require p to be 2 or greater. The call returns a vector of n scores, one for each sample.

outlier_scores = od.get_os(X)
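The expected layout of X can be sketched as follows. The helper `as_data_matrix` and the sensor readings are hypothetical, and the sketch assumes the input is accepted as a NumPy array: each feature is one column, each sample one row.

```python
import numpy as np

def as_data_matrix(columns):
    """Stack equal-length 1-D feature series into an (n, p) matrix."""
    X = np.column_stack([np.asarray(c, dtype=float) for c in columns])
    n, p = X.shape
    if p < 2:
        # Many of the algorithms need at least two features.
        raise ValueError(f"expected at least 2 features, got {p}")
    return X

# Made-up readings: five samples of two features.
temperature = [20.1, 20.3, 20.2, 35.0, 20.4]
vibration = [0.51, 0.49, 0.50, 0.52, 0.48]

X = as_data_matrix([temperature, vibration])
print(X.shape)  # n = 5 samples, p = 2 features
```

A matrix built this way has the n by p shape that 'get_os()' is described as taking.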

Higher scores are more outlying. You can then set a threshold if you wish, or just look at the ranking. Scores have not been sanitised: they may contain 'nan' values, particularly from the 'VAE' if the input data has not been scaled. However, other algorithms seem to work better without scaling, so inputs are not automatically scaled.
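One way to rank and threshold the scores while tolerating 'nan' values is sketched below. The score values are made up and stand in for the output of 'get_os()'. The 90th-percentile cut-off is an arbitrary choice for illustration, not a recommendation from the paper.

```python
import numpy as np

# Stand-in for the output of od.get_os(X); values are made up for illustration.
outlier_scores = np.array([0.2, 0.1, np.nan, 3.5, 0.3, 0.25])

# Replace nan with -inf so unscored samples sort last in the ranking.
clean = np.where(np.isnan(outlier_scores), -np.inf, outlier_scores)

# Sample indices ordered from most to least outlying.
ranking = np.argsort(clean)[::-1]

# Flag samples above the 90th percentile of the non-nan scores.
threshold = np.nanpercentile(outlier_scores, 90)
flagged = np.flatnonzero(clean > threshold)

print(ranking, flagged)
```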

Hyperparameters for each of these algorithms are currently fixed to the values in my paper. At some point I will finish implementing a pass-through that allows you to specify the hyperparameters at instantiation. This is on my ToDo list.

To get normalised (between 0 and 1) scores, use the 'norm' keyword argument. This may result in errors if the data is not normalised, as there may be infinite values in the scores (usually only from the 'VAE').

normalised_scores = od.get_os(X, norm=True)
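How the package normalises internally is not stated here. Assuming plain min-max scaling, the sketch below shows why an infinite score causes trouble and one possible workaround: normalising over the finite scores only. The helper `minmax_finite` is hypothetical and not part of odds.

```python
import numpy as np

def minmax_finite(scores):
    """Min-max scale scores to [0, 1] using only the finite values.

    Non-finite entries (nan, inf) are returned as nan.
    """
    scores = np.asarray(scores, dtype=float)
    finite = np.isfinite(scores)
    out = np.full(scores.shape, np.nan)
    if not finite.any():
        return out
    lo, hi = scores[finite].min(), scores[finite].max()
    if hi > lo:
        out[finite] = (scores[finite] - lo) / (hi - lo)
    else:
        out[finite] = 0.0  # all finite scores are equal
    return out

# Made-up raw scores with one infinite value.
raw = np.array([1.0, 3.0, np.inf, 2.0])

# Plain min-max would divide by (inf - 1.0): every finite score collapses
# to 0 and the infinite entry becomes nan.
normalised = minmax_finite(raw)
print(normalised)
```

Scaling the input data before scoring, as suggested above for the 'VAE', avoids the problem at its source. This helper only cleans up afterwards.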

Valid strings for outlier algorithms:

  • 'VAR' Vector Autoregression
  • 'FRO' Ordinary Feature Regression
  • 'FRL' LASSO Feature Regression
  • 'FRR' Ridge Feature Regression
  • 'GMM' Gaussian Mixture Model
  • 'IF' Isolation Forest
  • 'DBSCAN' Density-Based Spatial Clustering of Applications with Noise
  • 'OCSVM' One Class Support Vector Machine
  • 'LSTM' Long Short Term Memory
  • 'GRU' Gated Recurrent Unit
  • 'AE' Autoencoder
  • 'VAE' Variational Autoencoder
  • 'OP' Outlier Pursuit
  • 'GOP' Graph Regularised Outlier Pursuit
  • 'RAND' Random scoring (for baseline comparison)
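The list above can be kept as a lookup table so that an algorithm string is checked before it is passed to 'OD'. This is a sketch only: the names `ALGORITHMS` and `check_algo` are hypothetical, and it assumes the strings are case-sensitive exactly as listed.

```python
# Algorithm strings from the list above, mapped to their descriptions.
ALGORITHMS = {
    'VAR': 'Vector Autoregression',
    'FRO': 'Ordinary Feature Regression',
    'FRL': 'LASSO Feature Regression',
    'FRR': 'Ridge Feature Regression',
    'GMM': 'Gaussian Mixture Model',
    'IF': 'Isolation Forest',
    'DBSCAN': 'Density-Based Spatial Clustering of Applications with Noise',
    'OCSVM': 'One Class Support Vector Machine',
    'LSTM': 'Long Short Term Memory',
    'GRU': 'Gated Recurrent Unit',
    'AE': 'Autoencoder',
    'VAE': 'Variational Autoencoder',
    'OP': 'Outlier Pursuit',
    'GOP': 'Graph Regularised Outlier Pursuit',
    'RAND': 'Random scoring (for baseline comparison)',
}

def check_algo(name):
    """Return the description for a valid algorithm string, or raise ValueError."""
    try:
        return ALGORITHMS[name]
    except KeyError:
        valid = ', '.join(sorted(ALGORITHMS))
        raise ValueError(f"unknown algorithm {name!r}; choose from: {valid}") from None

print(check_algo('IF'))
```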

Hyperparameter table (image)
