fABBA

An efficient aggregation based symbolic representation for temporal data

fABBA is a fast and accurate symbolic representation method for temporal data that supports data compression and mining. By replacing the k-means clustering used in ABBA with a sorting-based aggregation technique, fABBA avoids repeated within-cluster sum-of-squares computations and significantly reduces the computational complexity. In contrast to ABBA, fABBA does not require the number of time series symbols to be specified in advance, while achieving performance competitive with ABBA and other symbolic methods.
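
To illustrate the idea of sorting-based aggregation, here is a minimal sketch and not fABBA's own implementation. It assumes points are sorted by their 2-norm and that each still-unassigned point, taken in sorted order, starts a new group that absorbs every unassigned point within distance `alpha`. The function name `aggregate_sketch` and its return values are hypothetical.

```python
import numpy as np

def aggregate_sketch(points, alpha):
    """Greedy sorting-based grouping (illustrative only, not the library code).

    points : array of shape (n, d)
    alpha  : grouping radius
    Returns (labels, starts): one group label per point and the indices
    of the points that started each group.
    """
    points = np.asarray(points, dtype=float)
    # Visit points in order of increasing 2-norm.
    order = np.argsort(np.linalg.norm(points, axis=1), kind="stable")
    labels = np.full(len(points), -1)
    starts = []
    for i in order:
        if labels[i] != -1:
            continue  # already absorbed by an earlier group
        # Point i starts a new group and absorbs nearby unassigned points.
        dist = np.linalg.norm(points - points[i], axis=1)
        members = (dist <= alpha) & (labels == -1)
        labels[members] = len(starts)
        starts.append(int(i))
    return labels, starts

pts = [[0.0, 0.0], [0.05, 0.0], [1.0, 1.0], [1.02, 1.0], [5.0, 5.0]]
labels, starts = aggregate_sketch(pts, alpha=0.1)
print(labels.tolist(), starts)
```

Note that the number of groups is an outcome of `alpha` rather than an input, and no cluster sums of squares are recomputed along the way.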

Install

To install the current release:

pip install fABBA

Apply series compression

>>> import numpy as np
>>> from fABBA.symbolic_representation import fabba_model
>>> np.random.seed(1)
>>> N = 100
>>> ts = np.random.rand(N)
>>> fabba = fabba_model(tol=0.1, alpha=0.1, sorting='2-norm', scl=1, verbose=1, max_len=np.inf, string_form=True)
>>> print(fabba)
fABBA({'_alpha': 0.1, '_sorting': '2-norm', '_tol': 0.1, '_scl': 1, '_verbose': 1, '_max_len': inf, '_string_form': True, '_n_jobs': 1})

>>> string = fabba.fit_transform(ts)
>>> print(string)
"-#!.%&/#0'"12(#34$&%5!67)$*(+8*9:";!<'+=>!)$?@A!B!

>>> inverse_ts = fabba.inverse_transform(string, ts[0]) # reconstructed time series

Plot the time series and its reconstruction

>>> import matplotlib.pyplot as plt
>>> plt.plot(ts, label='time series', c='olive')
>>> plt.plot(inverse_ts, label='reconstruction', c='darkblue')
>>> plt.legend()
>>> plt.grid(True, axis='y')
>>> plt.show()

Apply adaptive polygonal chain approximation

>>> from fABBA.chainApproximation import compress
>>> from fABBA.chainApproximation import inverse_compress
>>> np.random.seed(1)
>>> N = 100
>>> ts = np.random.rand(N)
>>> pieces = compress(ts, tol=0.1)
>>> inverse_ts = inverse_compress(pieces, ts[0])
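
For readers who want a feel for what a polygonal chain approximation does, the following is a simplified sketch and not the code behind `compress`. It assumes each piece is stored as a (length, increment) pair and that a piece is extended greedily while the squared error of the straight line stays within `(length - 1) * tol**2`. The names `compress_sketch` and `inverse_compress_sketch` are hypothetical.

```python
import numpy as np

def compress_sketch(ts, tol):
    """Greedy piecewise-linear compression into (length, increment) pieces."""
    ts = np.asarray(ts, dtype=float)
    pieces = []
    start = 0
    while start < len(ts) - 1:
        end = start + 1  # a piece of length 1 always fits exactly
        while end + 1 < len(ts):
            cand = end + 1
            # Straight line from ts[start] to ts[cand].
            line = np.linspace(ts[start], ts[cand], cand - start + 1)
            err = np.sum((line - ts[start:cand + 1]) ** 2)
            if err > (cand - start - 1) * tol ** 2:
                break  # extending further would violate the tolerance
            end = cand
        pieces.append((end - start, ts[end] - ts[start]))
        start = end
    return pieces

def inverse_compress_sketch(pieces, first_value):
    """Rebuild a series by stitching the linear pieces together."""
    out = [float(first_value)]
    for length, inc in pieces:
        # Interpolate along the piece, skipping its first (shared) point.
        out.extend(out[-1] + inc * np.arange(1, length + 1) / length)
    return np.array(out)

ramp = np.arange(5.0)
print(compress_sketch(ramp, tol=0.1))  # a straight ramp needs a single piece
```

Because every piece starts and ends on a data point in this sketch, the reconstruction agrees with the original series at all piece boundaries.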

Apply aggregated digitization

>>> from fABBA.digitization import digitize
>>> from fABBA.digitization import inverse_digitize
>>> string, parameters = digitize(pieces, alpha=0.1, sorting='2-norm', scl=1) # pieces from aforementioned compression
>>> print(''.join(string))
,"-#!.%&/#0'"12(#34$&%5!67)$*(+8*9:";!<'+=>!)$?@A!B!

>>> inverse_pieces = inverse_digitize(string, parameters)
>>> inverse_ts = inverse_compress(inverse_pieces, ts[0])
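
The round trip between pieces and symbols can be pictured with the sketch below. It is not the implementation of `digitize` or `inverse_digitize`. It assumes group labels are already available, that each group is represented by the mean of its pieces, and that symbols are consecutive printable characters starting at `'!'`. All function names are hypothetical.

```python
import numpy as np

def symbols_and_centers(pieces, labels, first_char="!"):
    """Map group labels to characters and compute one center per group."""
    pieces = np.asarray(pieces, dtype=float)
    labels = np.asarray(labels)
    n_groups = int(labels.max()) + 1
    # Assumption: a group's center is the mean (length, increment) of its members.
    centers = np.array([pieces[labels == k].mean(axis=0) for k in range(n_groups)])
    string = [chr(ord(first_char) + int(k)) for k in labels]
    return string, centers

def inverse_symbols(string, centers, first_char="!"):
    """Replace every symbol by the center of its group."""
    idx = [ord(s) - ord(first_char) for s in string]
    return centers[idx]

pieces = [[2, 1.0], [2, 1.2], [5, -3.0]]
string, centers = symbols_and_centers(pieces, labels=[0, 0, 1])
print("".join(string))
print(inverse_symbols(string, centers))
```

The recovered pieces are group centers rather than the original pieces, which is where the lossy part of the digitization comes from.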

Image compression

>>> import matplotlib.pyplot as plt
>>> from fABBA.load_datasets import load_images
>>> from fABBA.symbolic_representation import image_compress
>>> from fABBA.symbolic_representation import image_decompress
>>> from fABBA.symbolic_representation import fabba_model
>>> from cv2 import resize
>>> img_samples = load_images(shape=(100,100)) # load fABBA image test samples
>>> img = resize(img_samples[0], (100, 100)) # select the first image for test
>>> fabba = fabba_model(tol=0.1, alpha=0.01, sorting='2-norm', scl=1, verbose=1, max_len=np.inf, string_form=True)
>>> strings = image_compress(fabba, img)
>>> inverse_img = image_decompress(fabba, strings)
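
One plausible way to treat an image as temporal data is to flatten it into a one-dimensional series and restore the shape afterwards. The sketch below shows only that reshaping step and is an assumption about the general approach, not a description of what `image_compress` and `image_decompress` do internally. The helper names are hypothetical.

```python
import numpy as np

def image_to_series(img):
    """Flatten an image into a 1-D float series and remember its shape."""
    img = np.asarray(img)
    return img.reshape(-1).astype(float), img.shape

def series_to_image(series, shape):
    """Round and clip a reconstructed series back into an 8-bit image."""
    series = np.asarray(series, dtype=float)
    return np.clip(np.rint(series), 0, 255).astype(np.uint8).reshape(shape)

img = np.arange(12, dtype=np.uint8).reshape(3, 4)
series, shape = image_to_series(img)
restored = series_to_image(series, shape)
print(series.shape, restored.shape)
```

In a real pipeline the series would pass through a symbolic transform and its inverse between these two calls, so the restored image would only approximate the original.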

Plot the original image

>>> plt.imshow(img)
>>> plt.show()

Plot the reconstructed image

>>> plt.imshow(inverse_img)
>>> plt.show()

Experiment

The "experiments" folder contains all code required to reproduce the experiments in the manuscript "An efficient aggregation method for the symbolic representation of temporal data".

Overview and dependencies

The "experiments" folder is self-contained and includes all scripts needed to reproduce the experimental data in the paper. The only exception is the UCRArchive2018 data, which is not included.

The UCRArchive2018 datasets can be downloaded from UCR Time Series Classification Archive.

There are a number of dependencies listed below. Most of these modules, except perhaps the final ones, are part of any standard Python installation. We list them for completeness:

os, csv, time, pickle, numpy, warnings, matplotlib, math, collections, copy, sklearn, pandas, tqdm, tslearn

Please ensure that these modules are available before running the code.
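
A quick way to check availability is sketched below using only the standard library. The function name `missing_modules` is hypothetical, and the module list is copied from the dependency list above.

```python
import importlib.util

REQUIRED = [
    "os", "csv", "time", "pickle", "numpy", "warnings", "matplotlib",
    "math", "collections", "copy", "sklearn", "pandas", "tqdm", "tslearn",
]

def missing_modules(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Note that the `sklearn` module is installed with pip under the package name `scikit-learn`.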

Software Contributors

Stefan Guettel stefan.guettel@manchester.ac.uk

Xinye Chen xinye.chen@manchester.ac.uk

Reference

If you have used this software in a scientific publication and wish to cite it, please use the following citation.

    @article{fABBAarticle,
      title={An efficient aggregation method for the symbolic representation of temporal data},
      author={Chen, Xinye and Guettel, Stefan},
      journal={},
      volume={},
      number={},
      pages={},
      year={2021}
    }
