betsi


Behaviour Extraction for Time-Series Investigation using Deep Learning

Deep-learning module for event detection in time series through behaviour extraction.


What is it all about?

Anomalies are events outside the nominal behaviour of a system. They sometimes appear as sudden, large swings in the value of a single parameter and sometimes as distributed changes across multiple parameters, and they can be catastrophic to the system. Traditionally, anomalies were detected by monitoring each parameter, superimposing the graphs for each sub-system and manually spotting "out-of-normal" behaviour.

This project implements a state-of-the-art model based on the paper "Time Series Segmentation through Automatic Feature Learning" by Lee, Wei-Han, et al. to detect anomalies automatically, without manual intervention. To do so, it provides tools to train deep-learning models on TensorFlow and to run post-processing "prediction" steps, which use the condensed representations produced by the model to detect changes in behaviour and, from those, anomalous events.

The project has three main steps which together ensure good anomaly-detection performance.

  1. Preprocessing: The input time-series data (a resampled version of some continuous data) is first normalized and then grouped into sets of window_size timesteps, each set separated from the next by a stride movement in time. This essentially increases the amount of data available and allows us to capture interactions between consecutive timesteps.

    Consider a case where you have 11 sensors providing readings. If you take a window_size of 3, the readings at time indices 1, 2 and 3 for all 11 sensors will be stacked into one vector, giving 33 columns per group. If your stride is 2, your second group (vector) starts at timestep 1+2=3 (covering 3, 4, 5), the third at timestep 5 (covering 5, 6, 7), and so on (see the sketch after this list).

  2. Model: The model we use to create the concise representation is called an autoencoder. An autoencoder is a neural network trained to reproduce its input at its output, with intermediate layers that gradually reduce the amount of information. The input data is compressed to the smallest representation possible, at the bottleneck layer, that still allows the following layers to reproduce the original input with good fidelity. It is essentially a model with an encoder (input -> bottleneck layer) which "encodes" the data into a smaller representation and a decoder (bottleneck -> output) which "decodes" the encoded representation to recover the original data.

  3. Predictors: The bottleneck-layer representations are then compared across consecutive rows (groups of timesteps) by computing a single-value distance using the L2 norm (the square root of the sum of squared differences). The distances are then compared against their average to detect events (possible anomalies). A key parameter here is the threshold, or noise_margin_per, which defines how far above the average a distance needs to be to be called an event. This helps filter out random fluctuations in the data, since the distance is not constant even for nominal behaviour.
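
As a rough illustration of the windowing described in step 1 (plain NumPy, not the betsi API), here is how the 11-sensor, window_size=3, stride=2 example plays out:

# Illustrative sketch only -- plain numpy, not part of betsi
>>> import numpy as np
>>> data = np.random.rand(100, 11)  # 100 timesteps, 11 sensors
>>> starts = range(0, len(data) - 3 + 1, 2)  # window start indices, stride of 2
>>> windows = np.stack([data[s:s + 3].flatten() for s in starts])
>>> windows.shape  # each row stacks 3 timesteps x 11 sensors = 33 columns
(49, 33)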


Project Structure

src/
    betsi/
        models.py // To build and train the Tensorflow model
        predictors.py // To predict/detect anomalies based on the output from the Tensorflow model
        preprocessors.py // To preprocess the input data through normalization and filtering
tests/
    test_models.py // Test for models
    test_predictors.py // Test for predictors
    test_preprocessors.py // Test for preprocessors

Installation through pip

pip install betsi-ml

It is recommended that you install the project in a virtual environment as it is still under development.

To create a virtual environment and install in it, run:

python -m venv .venv
source .venv/bin/activate
python -m pip install betsi-ml

Installation from source

# Clone from source
$ git clone https://gitlab.com/librespacefoundation/polaris/betsi.git

# Switch to the directory
$ cd betsi

# Create and switch to a virtual environment
$ python3 -m venv .venv
$ source .venv/bin/activate

# To install a non-editable version
(.venv) $ python3 setup.py install

# To install an editable version
(.venv) $ python3 -m pip install -e .

Usage

Preprocessing the input data

(.venv) $ python3
>>> from betsi import models, preprocessors, predictors

# To apply preprocessing on data
# Step 1: Normalize the data
>>> normalizer, normalized_data = preprocessors.normalize_all_data(data)
# Step 2: Convert it to columns using fixed stride and window size
>>> converted_data = preprocessors.convert_to_column(normalized_data, window_size=3, stride=2)

A few remarks regarding preprocessing:

normalizer is an instance of an sklearn transformer. It has an inverse method, normalizer.inverse_transform, which can be used to "un-normalize" the data!

preprocessors also has a convert_from_column method to undo the change made by convert_to_column.

A combination of these two methods can be used to remove the preprocessing from the data (or from the model predictions) as follows:

# To remove preprocessing from data
>>> normalized_data = preprocessors.convert_from_column(converted_data, window_size=3, stride=2)
>>> recovered_data = normalizer.inverse_transform(normalized_data)
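
To sanity-check the round trip, you can compare the recovered data against the original. This is a minimal sketch only: depending on window_size and stride, the last few timesteps may not be covered by any window, so only the overlapping prefix is compared.

# Minimal sanity check (sketch): the recovered data should match the start of the original
>>> import numpy as np
>>> n_rows = len(recovered_data)
>>> np.allclose(np.asarray(data)[:n_rows], np.asarray(recovered_data))  # expect True (up to floating-point error)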

Creating the model

Before we create a model, we need to decide the structure of the autoencoder, i.e. the layer sizes and the activation for each layer.

Since the architecture is symmetric and the number of layers (n) is assumed to be odd, we only need to specify the layer dimensions for the first (n+1)/2 layers.

The activations for the last n-1 layers (all but the input layer) need to be specified. If no activations are given, ReLU is assumed for every layer.

This is summed up with a simple diagram:

layer_dims[0]
    o                            o
    o    o  layer_dims[-1]  o    o
    o    o       o          o    o
    o    o       o          o    o
    o    o                  o    o
    o activations[0]             o
                                activations[-1]

You also need to decide on your optimizer ("adam" is preferred), the loss ("mean_squared_error") and the metrics to monitor (e.g. ["MSE"]).
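
For example, for the 11-sensor, window_size=3 case described earlier, the variables could look like the following. These values are purely illustrative; tune them for your own data.

# Purely illustrative values for the 11-sensor, window_size=3 example
>>> layer_dims = [33, 16, 8]    # first (n+1)/2 layer sizes: input -> hidden -> bottleneck
>>> activations = ["relu"] * 4  # one activation per layer after the input (n - 1 = 4)
>>> optimizer = "adam"
>>> loss = "mean_squared_error"
>>> metrics = ["MSE"]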

The Python code to create and train the model (assuming you have the layer_dims and activations variables ready, have preprocessed your data to get converted_data, and have decided on your optimizer, loss and metrics) is as follows:

# ae_model = auto_encoder_model
# en_model = encoder_model
# de_model = decoder_model
# Both the encoder and decoder models are extracted from the autoencoder
# model and need not be trained separately.
>>> ae_model, en_model, de_model = models.custom_autoencoder(layer_dims, activations=activations)
# Compile the model for training
>>> ae_model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
>>> en_model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
>>> de_model.compile(optimizer=optimizer, loss=loss, metrics=metrics)

>>> from sklearn.model_selection import train_test_split
>>> train_data, test_data = train_test_split(
        converted_data,
        test_size=0.33, # 33% of the data is held out for testing
        shuffle=False,  # disable shuffling since order matters (time)
    )

# You can also play around with the batch_size and epochs and enable
# early_stopping based on your needs
>>> history = ae_model.fit(train_data, train_data, batch_size=32, epochs=20)

# To test the model to check if it has overfit, you can run:
>>> ae_model.evaluate(test_data, test_data, batch_size=32)
# This returns the test loss and metrics, which can be compared against
# the corresponding training values
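
As a rough sketch of that comparison (using the standard Keras History object returned by fit above):

>>> train_loss = history.history["loss"][-1]
>>> test_loss = ae_model.evaluate(test_data, test_data, batch_size=32)[0]
>>> print(f"train loss: {train_loss:.4f}  test loss: {test_loss:.4f}")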

Predicting anomalies

Now that we have our trained model and its input data (along with the normalizer, window_size and stride needed to create new input data whenever we want), we can predict anomalies.

# Step 1: Predict the "bottleneck" layer representation for the input data
>>> data_rep = en_model.predict(test_data)

# Step 2: Measure the distance between consecutive timestamps. This will be
# the metric to detect anomalies. Distance here refers to the L2 norm.
>>> distance_list = []
>>> for row_no in range(data_rep.shape[0] - 1):
...     distance_list.append(
...         predictors.distance_measure(data_rep[row_no], data_rep[row_no + 1]))

# Step 3: Detect the events. The noise_margin_per variable says how much
# (as a percentage) the value should be above the average to be considered
# an event. Try playing around with this to find the best value!
>>> noise_margin_per = 150  # 150% above the average, i.e. 2.5x the average
>>> events = predictors.get_events(distance_list, threshold=noise_margin_per)
# events contains all the indices where events occurred
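
Each index in events refers to a row of converted_data (a window). As a rough sketch, under the windowing described earlier row i starts at original timestep i * stride, so the event indices can be mapped back to approximate positions in the original time series:

# Sketch: map event indices back to original timesteps (assuming row i starts at i * stride)
>>> stride = 2
>>> event_timesteps = [index * stride for index in events]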

Reference

This project is based on the paper ‘Time Series Segmentation through Automatic Feature Learning’ by Lee, Wei-Han, et al.

Refer: arXiv:1801.05394 [cs, stat], Jan. 2018. arXiv.org, http://arxiv.org/abs/1801.05394
