Using Nearest Neighbour-Variance Norm with Path Signatures for anomaly detection of streams
SigMahaKNN - Signature Mahalanobis KNN method
Anomaly detection on multivariate streams with Variance Norm and Path Signature
SigMahaKNN (signature_mahalanobis_knn) combines the variance norm (a
generalisation of the Mahalanobis distance) with path signatures for anomaly
detection for multivariate streams. The signature_mahalanobis_knn library is a
Python implementation of the SigMahaKNN method. The key contributions of this
library are:
- A simple and efficient implementation of the variance norm distance, provided by the signature_mahalanobis_knn.Mahalanobis class. The class has two main methods:
  - The fit method, which fits the variance norm distance to a training dataset
  - The distance method, which computes the distance between two numpy arrays x1 and x2
- A simple and efficient implementation of the SigMahaKNN method, provided by the signature_mahalanobis_knn.SigMahaKNN class. The class has two main methods:
  - The fit method, which fits a model to a training dataset
    - The fit method can take either a corpus of streams as its input (in which case path signatures are computed using the sktime library with esig or iisignature) or a corpus of path signatures. This also opens up the possibility of using other feature representations, and of applying the variance norm distance to other anomaly detection problems
    - Currently, the library uses either sklearn's NearestNeighbors class or pynndescent's NNDescent class to efficiently compute the nearest neighbour distances of a new data point to the corpus of training data
  - The conformance method, which computes the conformance score for a set of new data points
    - Similarly to the fit method, the conformance method can take either a corpus of streams as its input (in which case path signatures are computed using the sktime library with esig or iisignature) or a corpus of path signatures
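To give a feel for the path signature features mentioned above, here is a minimal NumPy sketch of the signature of a piecewise-linear stream truncated at depth 2. It is for illustration only: the library itself relies on sktime with esig or iisignature, and the function name signature_depth2 is hypothetical, not part of signature_mahalanobis_knn.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path (hypothetical helper).

    path: array of shape (n_points, d), one row per observation of the stream.
    Returns (level1, level2) with shapes (d,) and (d, d).
    """
    path = np.asarray(path, dtype=float)
    increments = np.diff(path, axis=0)  # one increment per linear segment
    # Level 1: the total displacement of the path.
    level1 = increments.sum(axis=0)
    # Level 2 (via Chen's identity): for each segment, the displacement
    # accumulated before the segment tensored with the segment's increment,
    # plus half the increment tensored with itself.
    start_offsets = path[:-1] - path[0]
    level2 = start_offsets.T @ increments + 0.5 * (increments.T @ increments)
    return level1, level2


# A stream that moves right by 1, then up by 1.
example_path = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
level1, level2 = signature_depth2(example_path)
# Flatten the levels into one feature vector for downstream distance computations.
features = np.concatenate([level1, level2.ravel()])
```

For this example path, level1 is [1, 1] and level2 is [[0.5, 1], [0, 0.5]]; the antisymmetric part of level2 encodes the signed (Lévy) area swept out by the path.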
Installation
The SigMahaKNN library is available on PyPI and can be installed with pip:
pip install signature_mahalanobis_knn
Usage
As noted above, the signature_mahalanobis_knn library has two main classes:
Mahalanobis, a class for computing the variance norm distance, and
SigMahaKNN, a class for computing the conformance score for a set of new data
points.
Computing the variance norm distance
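The following is a minimal NumPy sketch of the idea behind a variance norm distance. It is not the library's Mahalanobis class and makes no claim about that class's signatures. It assumes the distance is the Mahalanobis distance restricted to the directions in which the corpus varies, and is infinite if the difference has a component outside those directions. The function names and tolerances (fit_variance_norm, variance_norm_distance, rtol, atol) are hypothetical.

```python
import numpy as np


def fit_variance_norm(corpus, rtol=1e-10):
    """Estimate the directions of variation of a corpus (hypothetical helper).

    corpus: array of shape (n_samples, n_features).
    Returns (basis, variances): orthonormal directions with non-negligible
    variance, and the variance of the corpus along each of them.
    """
    corpus = np.asarray(corpus, dtype=float)
    centred = corpus - corpus.mean(axis=0)
    covariance = centred.T @ centred / corpus.shape[0]
    eigvals, eigvecs = np.linalg.eigh(covariance)
    keep = eigvals > rtol * eigvals.max()  # drop numerically null directions
    return eigvecs[:, keep], eigvals[keep]


def variance_norm_distance(x1, x2, basis, variances, atol=1e-8):
    """Distance between x1 and x2 under the fitted variance norm."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    coords = basis.T @ diff
    residual = diff - basis @ coords
    # Any component outside the span of the corpus is infinitely anomalous.
    if np.linalg.norm(residual) > atol * max(1.0, np.linalg.norm(diff)):
        return np.inf
    return float(np.sqrt(np.sum(coords**2 / variances)))


# A corpus that varies along the first two axes only (variances 1 and 4).
corpus = np.array(
    [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 4.0, 0.0], [2.0, 4.0, 0.0]]
)
basis, variances = fit_variance_norm(corpus)
d_in_span = variance_norm_distance([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], basis, variances)
d_off_span = variance_norm_distance([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], basis, variances)
```

Here d_in_span is 1.0 (one standard deviation along the first axis), while d_off_span is infinite because the corpus never moves along the third axis. This handling of degenerate covariance is what distinguishes the sketch from a plain Mahalanobis distance with an inverted covariance matrix.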
Using the SigMahaKNN method for anomaly detection
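As a rough illustration of how a conformance score can flag anomalies, the sketch below whitens feature vectors with the corpus covariance and scores each new point by its distance to the nearest corpus points. It assumes a brute-force NumPy search rather than the sklearn or pynndescent backends the library uses, it ignores components outside the span of the corpus, and the names whiten, conformance_scores and n_neighbours are hypothetical rather than the SigMahaKNN API. The inputs here are random vectors standing in for path signatures.

```python
import numpy as np


def whiten(corpus, rtol=1e-10):
    """Return (mean, transform) mapping features to unit-variance coordinates."""
    mean = corpus.mean(axis=0)
    centred = corpus - mean
    eigvals, eigvecs = np.linalg.eigh(centred.T @ centred / len(corpus))
    keep = eigvals > rtol * eigvals.max()  # drop numerically null directions
    return mean, eigvecs[:, keep] / np.sqrt(eigvals[keep])


def conformance_scores(corpus, new_points, n_neighbours=1):
    """Mean whitened distance from each new point to its nearest corpus points."""
    corpus = np.asarray(corpus, dtype=float)
    new_points = np.asarray(new_points, dtype=float)
    mean, transform = whiten(corpus)
    corpus_w = (corpus - mean) @ transform
    new_w = (new_points - mean) @ transform
    # Brute-force pairwise Euclidean distances in whitened coordinates.
    dists = np.linalg.norm(new_w[:, None, :] - corpus_w[None, :, :], axis=-1)
    nearest = np.sort(dists, axis=1)[:, :n_neighbours]
    return nearest.mean(axis=1)


rng = np.random.default_rng(0)
corpus = rng.normal(size=(200, 3))  # stand-in for signatures of normal streams
inliers = rng.normal(size=(5, 3))   # new points drawn like the corpus
outliers = inliers + 10.0           # the same points shifted far away
inlier_scores = conformance_scores(corpus, inliers)
outlier_scores = conformance_scores(corpus, outliers)
```

Points with a large score are far from everything in the corpus and are candidates for anomalies; a threshold on the score would typically be chosen from held-out normal data.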
Repo structure
The core implementation of the SigMahaKNN method is in the src/signature_mahalanobis_knn folder:
- mahal_distance.py contains the implementation of the Mahalanobis class to compute the variance norm distance
- sig_maha_knn.py contains the implementation of the SigMahaKNN class to compute the conformance scores for a set of new data points against a corpus of training data
- utils.py contains some utility functions that are useful for the library
- baselines/ is a folder containing some of the baseline methods we look at in the paper - see paper-examples/README.md for more details
Examples
There are various examples in the examples and paper-examples folders:
- examples contains small examples using randomly generated data for illustration purposes
- paper-examples contains the examples used in the paper (link available soon!) where we compare the SigMahaKNN method to other baseline approaches (e.g. Isolation Forest and Local Outlier Factor) on real-world datasets
  - There are notebooks for downloading and preprocessing the datasets for the examples - see paper-examples/README.md for more details
Contributing
To take advantage of pre-commit, which will automatically format your code and
run some basic checks before you commit, run:
pip install pre-commit # or brew install pre-commit on macOS
pre-commit install # will install a pre-commit hook into the git repo
After doing this, each time you commit, some linters will be applied to format
the codebase. You can also/alternatively run pre-commit run --all-files to run
the checks.
See CONTRIBUTING.md for more information on running the test suite using nox.