Using Nearest Neighbour-Variance Norm with Path Signatures for anomaly detection of streams

SigMahaKNN - Signature Mahalanobis KNN method

Anomaly detection on multivariate streams with Variance Norm and Path Signature

SigMahaKNN (signature_mahalanobis_knn) combines the variance norm (a generalisation of the Mahalanobis distance) with path signatures to detect anomalies in multivariate streams. The signature_mahalanobis_knn library is a Python implementation of the SigMahaKNN method. The key contributions of this library are:

  • A simple and efficient implementation of the variance norm distance as provided by the signature_mahalanobis_knn.Mahalanobis class. The class has two main methods:
    • The fit method to fit the variance norm distance to a training dataset
    • The distance method to compute the distance between two numpy arrays x1 and x2
  • A simple and efficient implementation of the SigMahaKNN method as provided by the signature_mahalanobis_knn.SigMahaKNN class. The class has two main methods:
    • The fit method to fit a model to a training dataset
      • The fit method accepts either a corpus of streams (whose path signatures are computed using the sktime library with esig or iisignature) or a corpus of precomputed path signatures. This also opens up the possibility of using other feature representations, and other applications of the variance norm distance to anomaly detection
      • Currently, the library uses either sklearn's NearestNeighbors class or pynndescent's NNDescent class to efficiently compute the nearest neighbour distances from a new data point to the training corpus
    • The conformance method to compute the conformance score for a set of new data points
      • Like the fit method, the conformance method accepts either a corpus of streams (whose path signatures are computed using the sktime library with esig or iisignature) or a corpus of precomputed path signatures
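
To give a feel for the path signature features mentioned above, here is a minimal NumPy sketch that computes the depth-2 signature of a piecewise-linear stream using Chen's identity. It is only an illustration: the function name signature_depth2 is hypothetical, and the library itself relies on sktime with esig or iisignature rather than on code like this.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path (hypothetical helper).

    path: array of shape (n_points, dim).
    Returns level 1 (dim values) followed by level 2 (dim * dim values).
    """
    path = np.asarray(path, dtype=float)
    dim = path.shape[1]
    level1 = np.zeros(dim)
    level2 = np.zeros((dim, dim))
    for step in np.diff(path, axis=0):
        # Chen's identity: append a straight segment with increment `step`,
        # whose own signature is (step, outer(step, step) / 2).
        level2 += np.outer(level1, step) + 0.5 * np.outer(step, step)
        level1 += step
    return np.concatenate([level1, level2.ravel()])


# A stream that moves right by one unit, then up by one unit.
stream = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
features = signature_depth2(stream)
```

Level 1 is the total increment of the stream, while the antisymmetric part of level 2 captures the signed area swept out by each pair of channels, which is what makes the signature sensitive to the order of events.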

Installation

The SigMahaKNN library is available on PyPI and can be installed with pip:

pip install signature_mahalanobis_knn

Usage

As noted above, the signature_mahalanobis_knn library has two main classes: Mahalanobis, a class for computing the variance norm distance, and SigMahaKNN, a class for computing the conformance score for a set of new data points.

Computing the variance norm distance
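
As a rough illustration of the idea, the sketch below computes a variance norm distance with plain NumPy. It mirrors the fit / distance split described above, but it is not the library's implementation: the function names fit_variance_norm and variance_norm_distance and the tolerance parameters are assumptions made for this example. The sketch treats the variance norm as the Mahalanobis distance taken along the directions in which the corpus actually varies, and as infinite for differences that leave those directions.

```python
import numpy as np


def fit_variance_norm(corpus, rcond=1e-10):
    """Estimate the directions of variation of a corpus (hypothetical helper).

    corpus: array of shape (n_samples, n_features).
    Returns (basis, variances): orthonormal directions as rows, and the
    sample variance of the corpus along each of them.
    """
    corpus = np.asarray(corpus, dtype=float)
    centred = corpus - corpus.mean(axis=0)
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    keep = s > rcond * s.max()  # drop directions with (numerically) no variance
    variances = s[keep] ** 2 / (corpus.shape[0] - 1)
    return vt[keep], variances


def variance_norm_distance(x1, x2, basis, variances, tol=1e-8):
    """Distance between x1 and x2 under the fitted variance norm."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    coords = basis @ diff
    residual = diff - basis.T @ coords
    # A difference outside the span of the corpus has infinite variance norm.
    if np.linalg.norm(residual) > tol * max(1.0, np.linalg.norm(diff)):
        return np.inf
    return float(np.sqrt(np.sum(coords**2 / variances)))


# Illustrative data only: a random corpus with 5 features.
rng = np.random.default_rng(0)
train = rng.normal(size=(100, 5))
basis, variances = fit_variance_norm(train)
d = variance_norm_distance(train[0], train[1], basis, variances)
```

When the sample covariance of the corpus is invertible, this agrees with the usual Mahalanobis distance; the difference only shows up for rank-deficient corpora, which is common with high-dimensional signature features.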

Using the SigMahaKNN method for anomaly detection
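
The following sketch shows the general shape of a nearest-neighbour conformance score under the variance norm, using only NumPy. It is an assumption-laden illustration rather than the SigMahaKNN class itself: the name conformance_scores is hypothetical, the inputs are taken to be precomputed feature vectors (for example path signatures), and the neighbour search is brute force, whereas the library uses sklearn's NearestNeighbors or pynndescent's NNDescent.

```python
import numpy as np


def conformance_scores(corpus, new_points, rcond=1e-10, tol=1e-8):
    """Variance norm distance from each new point to its nearest corpus point.

    corpus: array (n_corpus, n_features); new_points: array (n_new, n_features).
    Hypothetical helper; larger scores suggest more anomalous points.
    """
    corpus = np.asarray(corpus, dtype=float)
    new_points = np.asarray(new_points, dtype=float)
    mean = corpus.mean(axis=0)
    _, s, vt = np.linalg.svd(corpus - mean, full_matrices=False)
    keep = s > rcond * s.max()
    basis = vt[keep]
    scale = s[keep] / np.sqrt(corpus.shape[0] - 1)  # std along each direction

    # Whitening: Euclidean distance in these coordinates equals the
    # variance norm distance for differences inside the corpus span.
    white_corpus = (corpus - mean) @ basis.T / scale
    centred_new = new_points - mean
    white_new = centred_new @ basis.T / scale

    diffs = white_new[:, None, :] - white_corpus[None, :, :]
    nearest = np.sqrt((diffs**2).sum(axis=-1)).min(axis=1)

    # Points with a component outside the corpus span are infinitely far away.
    residual = centred_new - (centred_new @ basis.T) @ basis
    limit = tol * np.maximum(1.0, np.linalg.norm(centred_new, axis=1))
    outside = np.linalg.norm(residual, axis=1) > limit
    return np.where(outside, np.inf, nearest)


# Illustrative data only: a random corpus and two test points.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 3))
test_points = np.array([[0.1, 0.0, 0.0], [50.0, 50.0, 50.0]])
scores = conformance_scores(train, test_points)
```

A threshold on these scores, chosen for instance from the scores of held-out normal data, would then separate ordinary points from anomalies.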

Repo structure

The core implementation of the SigMahaKNN method is in the src/signature_mahalanobis_knn folder:

  • mahal_distance.py contains the implementation of the Mahalanobis class to compute the variance norm distance
  • sig_maha_knn.py contains the implementation of the SigMahaKNN class to compute the conformance scores for a set of new data points against a corpus of training data
  • utils.py contains some utility functions that are useful for the library
  • baselines/ is a folder containing some of the baseline methods we look at in the paper - see paper-examples/README.md for more details

Examples

There are various examples in the examples and paper-examples folders:

  • examples contains small examples using randomly generated data for illustration purposes
  • paper-examples contains the examples used in the paper (link available soon!) where we compare the SigMahaKNN method to other baseline approaches (e.g. Isolation Forest and Local Outlier Factor) on real-world datasets
    • There are notebooks for downloading and preprocessing the datasets for the examples - see paper-examples/README.md for more details

Contributing

To use pre-commit, which automatically formats your code and runs some basic checks before each commit, run:

pip install pre-commit  # or brew install pre-commit on macOS
pre-commit install  # will install a pre-commit hook into the git repo

After doing this, each time you commit, some linters will be applied to format the codebase. You can also run pre-commit run --all-files to run the checks on demand.

See CONTRIBUTING.md for more information on running the test suite using nox.
