betsi
Behaviour Extraction for Time-Series Investigation using Deep Learning
Deep-learning module for event detection in time series through behavioural extraction.
What is it all about?
Anomalies are any events outside the nominal behaviour of a system. They are sometimes characterised by sudden large swings in the values of a particular parameter, sometimes by distributed changes across multiple parameters, and they can be catastrophic to the system. Traditionally, anomalies were detected by monitoring each parameter, superimposing the graphs for each sub-system and finding "out-of-normal" behaviour manually.
This project aims to implement a state-of-the-art model based on the paper "Time Series Segmentation through Automatic Feature Learning" by Lee, Wei-Han, et al. to automatically detect anomalies without any manual intervention. To do so, the project provides tools to train deep-learning models on TensorFlow and to run post-processing "prediction" steps which use the condensed representations produced by the deep-learning model to detect changes in behaviour and, from those, anomalous events.
The project has three main steps which together ensure good anomaly-detection performance:
- Preprocessing: The input time-series data (a resampled version of some continuous data) is first normalized and then grouped into sets of window_size timesteps separated by a stride movement in time. This essentially increases the amount of data we have and allows us to capture interactions between consecutive timesteps. Consider a case where you have 11 sensors providing readings. If you take a window_size of 3, the readings at time indices 1, 2 and 3 for all 11 sensors are stacked into one vector, so each group has 33 columns. If your stride is 2, your second group (vector) would start at the 1+2=3rd timestep (3, 4, 5), the third group at the 5th timestep (5, 6, 7), and so on (see the sketch after this list).
- Model: The model we use to create the concise representation is called an autoencoder. An autoencoder is a neural network whose inputs and outputs are the exact same values, but whose intermediate layers gradually reduce the amount of information. The input data is compressed to the smallest representation possible, at the bottleneck layer, that still permits the following layers to reproduce the original input with good fidelity. It is essentially a model with an encoder (input -> bottleneck layer) which "encodes" the data into a smaller representation and a decoder (bottleneck -> output) which "decodes" the encoded representation to recover the original data.
- Predictors: The predictions from the bottleneck layer are then compared across consecutive rows (groups of timestamps) using a single-value distance, the L2 norm (the square root of the sum of squared differences). The distances are then compared against their average to detect events (possible anomalies). A key parameter here is the threshold, or noise_margin_per, which defines how far above the average the distance needs to be to be called an event. This helps filter out random fluctuations in the data, since the distance value is not constant even for nominal cases.
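To make the windowing arithmetic concrete, here is a minimal NumPy sketch of the idea. The helper make_windows is hypothetical and for illustration only; betsi's own convert_to_column (shown in the Usage section below) is what you would actually call.

import numpy as np

def make_windows(data, window_size=3, stride=2):
    """Stack window_size consecutive rows into one flat vector, advancing stride rows each time."""
    windows = []
    for start in range(0, len(data) - window_size + 1, stride):
        windows.append(data[start:start + window_size].ravel())
    return np.array(windows)

readings = np.random.rand(100, 11)                      # 100 timesteps from 11 sensors
windowed = make_windows(readings, window_size=3, stride=2)
print(windowed.shape)                                   # (49, 33): each row stacks 3 timesteps x 11 sensors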
Project Structure
src/
    betsi/
        models.py          // To build and train the TensorFlow model
        predictors.py      // To predict/detect anomalies based on the output from the TensorFlow model
        preprocessors.py   // To preprocess the input data through normalization and filtering
    tests/
        test_models.py         // Tests for models
        test_predictors.py     // Tests for predictors
        test_preprocessors.py  // Tests for preprocessors
Installation through pip
pip install betsi-ml
It is recommended that you install the project in a virtual environment as it is still under development.
To create a virtual environment and install in it, run:
python -m venv .venv
source .venv/bin/activate
python -m pip install betsi-ml
Installation from source
# Clone from source
$ git clone https://gitlab.com/librespacefoundation/polaris/betsi.git
# Switch to the directory
$ cd betsi
# Create and switch to a virtual environment
$ python3 -m venv .venv
$ source .venv/bin/activate
# To install a non-editable version
(.venv) $ python3 setup.py install
# To install an editable version
(.venv) $ python3 -m pip install -e .
Usage
Preprocessing the input data
(.venv) $ python3
>>> from betsi import models, preprocessors, predictors
# To apply preprocessing on data
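# Note: "data" is assumed here to be your resampled time series as a 2-D
# array-like (e.g. a NumPy array or pandas DataFrame), with rows as timesteps
# and columns as parameters; loading it is up to you.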
# Step 1: Normalize the data
>>> normalizer, normalized_data = preprocessors.normalize_all_data(data)
# Step 2: Convert it to columns using fixed stride and window size
>>> converted_data = preprocessors.convert_to_column(normalized_data, window_size=3, stride=2)
A few remarks regarding preprocessing:
- normalizer is an instance of an sklearn transformer. It has an inverse method, normalizer.inverse_transform, which can be used to "un-normalize" the data!
- preprocessors also has a convert_from_column method to undo the change made by convert_to_column.
A combination of these two methods can be used to remove the preprocessing from the data (or from the model predictions) as follows:
# To remove preprocessing from data
>>> normalized_data = preprocessors.convert_from_column(converted_data, window_size=3, stride=2)
>>> recovered_data = normalizer.inverse_transform(normalized_data)
Creating the model
Before we create a model, we need to decide the structure of the autoencoder, i.e. the layer sizes and the activation for each layer.
Since the architecture is symmetric and the number of layers (n) is assumed to be odd, we only need to specify the layer dimensions for the first (n+1)/2 layers.
The activations for the last n-1 layers (every layer except the input layer) need to be specified. If the activations are not specified, ReLU is assumed for every layer.
This is summed up with a simple diagram (each column of "o"s is a layer; the outermost layers have layer_dims[0] units, the bottleneck has layer_dims[-1] units, and activations[0] ... activations[-1] belong to every layer after the input):

layer_dims[0]
o                                     o
o      o     layer_dims[-1]     o     o
o      o           o            o     o
o      o           o            o     o
o      o                        o     o
o        activations[0]               o
                 activations[-1]
You also need to decide on your optimizer ("adam" is preferred), the loss ("mean_squared_error") and the metrics to monitor (["MSE"]).
The Python code to create and train the model (assuming you have the layer_dims and activations variables ready, have preprocessed your data to get converted_data, and have decided on your optimizer, loss and metrics) is as follows.
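If you want concrete starting values for those variables, a minimal set of definitions could look like the sketch below. The layer sizes follow the 33-column windows from the preprocessing example and the activation choices are illustrative assumptions, not values prescribed by betsi or the paper.
>>> layer_dims = [33, 16, 8]  # (n+1)/2 = 3 sizes for a symmetric 5-layer autoencoder (33-16-8-16-33)
>>> activations = ["relu", "relu", "relu", "linear"]  # n-1 = 4 activations, one per non-input layer
>>> optimizer = "adam"
>>> loss = "mean_squared_error"
>>> metrics = ["MSE"]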
# ae_model = auto_encoder_model
# en_model = encoder_model
# de_model = decoder_model
# Both the encoder and decoder models are extracted from the autoencoder
# model and need not be trained separately.
>>> ae_model, en_model, de_model = models.custom_autoencoder(layer_dims, activations=activations)
# Compile the model for training
>>> ae_model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
>>> en_model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
>>> de_model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
>>> from sklearn.model_selection import train_test_split
>>> train_data, test_data = train_test_split(
...     converted_data,
...     test_size=0.33,  # 33% of data is for testing
...     shuffle=False,   # We disable shuffling since order matters (time)
... )
# You can also play around with the batch_size and epochs and enable
# early_stopping based on your needs
>>> history = ae_model.fit(train_data, train_data, batch_size=32, epochs=20)
# To test the model to check if it has overfit, you can run:
>>> ae_model.evaluate(test_data, test_data, batch_size=32)
# This will return the test loss and the metrics which can be compared against
# the training values for the same
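If you do want early stopping, as mentioned in the comment above, a sketch using the standard Keras callback could look like this; the monitored quantity and patience are illustrative choices, not betsi defaults.
>>> from tensorflow.keras.callbacks import EarlyStopping
>>> early_stop = EarlyStopping(monitor="loss", patience=3, restore_best_weights=True)
>>> history = ae_model.fit(train_data, train_data, batch_size=32, epochs=20,
...                        callbacks=[early_stop])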
Predicting anomalies
Now that we have our trained model and its input data (along with the normalizer, window_size and stride to create new input data whenever we want), we can predict anomalies.
# Step 1: Predict the "bottleneck" layer representation for the input data
>>> data_rep = en_model.predict(test_data)
# Step 2: Measure the distance between consecutive timestamps. This will be
# the metric to detect anomalies. Distance here refers to the L2 norm.
>>> distance_list = []
>>> for row_no in range(data_rep.shape[0] - 1):
...     distance_list.append(
...         predictors.distance_measure(data_rep[row_no], data_rep[row_no + 1]))
# Step 3: Detect the events. We have a noise_margin_per variable to say
# how much (in percentage) the value should be above the average to be
# considered an event. Try playing around with this to find the best value!
>>> noise_margin_per = 150  # 150% above the average, i.e. 2.5x the average
>>> events = predictors.get_events(distance_list, threshold=noise_margin_per)
# events contains all the indices where events occurred
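The indices in events refer to rows of the windowed (and encoded) test data rather than to original timesteps. As a rough, purely illustrative sketch (assuming the same window_size=3 and stride=2 used above, and counting from the start of test_data), you could map an event index back to the approximate timestep where the flagged window begins:
>>> window_size, stride = 3, 2
>>> approx_start_timesteps = [idx * stride for idx in events]
>>> # Each value is the approximate timestep (within the test split) at which
>>> # the corresponding anomalous window starts.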
Reference
This project is based on the paper ‘Time Series Segmentation through Automatic Feature Learning’ by Lee, Wei-Han, et al.
Refer: arXiv:1801.05394 [cs, stat], Jan. 2018. arXiv.org, http://arxiv.org/abs/1801.05394