
Longest Common Subsequence (LCSS) computation to measure similarity for time series.


Introduction

lcsspy is a Python package to compute the longest common subsequence (also known as LCSS) similarity measure for time series data.

LCSS reconstructs a common subsequence by matching similar elements in the two series. Two elements are matched if they are sufficiently close in time and also have similar values.
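The matching rule is commonly written as a recurrence. The version below is a sketch under assumptions, not a transcription of lcsspy's source: $a$ and $b$ denote the two series with lengths $n$ and $m$, $\epsilon$ bounds the value difference, $\delta$ bounds the distance in time, and the strict and inclusive inequalities are chosen here for illustration.

```latex
L(i, j) =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0, \\
1 + L(i - 1, j - 1) & \text{if } |a_i - b_j| < \epsilon \text{ and } |i - j| \le \delta, \\
\max\bigl(L(i - 1, j),\, L(i, j - 1)\bigr) & \text{otherwise.}
\end{cases}
```

With this notation, the length of the common subsequence is $L(n, m)$, and dividing it by $\min(n, m)$ gives a similarity value in $[0, 1]$.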

Getting Started

Installation

lcsspy supports Python 3.9 and later, and is OS-independent.

Regular installation can be performed by running the following command.

pip install lcsspy

This will install lcsspy and its required dependencies.

To also install pytest as an optional dependency, request the tests extra. The quotes keep shells such as zsh from interpreting the square brackets.

pip install "lcsspy[tests]"

This allows you to run existing tests, or write your own. Refer to the pytest documentation to learn more about running tests.

Usage

Discrete LCSS

Discrete LCSS measures similarity between time series with discrete time indexes. It follows the formulation introduced by Vlachos et al.

The following script matches elements that have a value difference smaller than 1.1, and that are at most 1 index position apart.

import matplotlib.pyplot as plt
import numpy as np
from lcsspy.lcss import discrete_lcss

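# Two series of different lengths, indexed by position.
ts1 = np.array([1, 2, 2, 9, 14, 11, 19, 18])
ts2 = np.array([2, 1, 4, 7, 10, 15, 12, 8, 17])
# epsilon bounds the value difference, delta bounds the index distance.
result = discrete_lcss(ts1=ts1, ts2=ts2, epsilon=1.1, delta=1, plot=True)
print(result.lcss_measure)
# Display the figures created by plot=True.
plt.show()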

Running this code displays two figures. The first one plots the input time series and connects each pair of matched elements with a green line.

Discrete LCSS Series Plot

The second figure plots only the elements from the input series that are part of the common subsequence. In this case, the common subsequence has length 6 since it contains six pairs of elements that were matched.

Discrete LCSS Sequence Plot

This value is divided by the length of the shorter series to obtain the LCSS measure, which lies in the range $[0, 1]$.

Here the shorter series has 8 elements, so the measure equals $6/8$, which is printed to the console.

0.75
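To make the arithmetic above concrete, the following is a minimal dynamic-programming sketch that recomputes the result for the same two series without lcsspy. The function name `lcss_length` is hypothetical, and the matching rule (value difference strictly below epsilon, index distance at most delta) is an assumption based on the description above, not lcsspy's actual implementation.

```python
def lcss_length(ts1, ts2, epsilon, delta):
    """Length of the longest common subsequence under the assumed rule."""
    n, m = len(ts1), len(ts2)
    # table[i][j] is the LCSS length of the first i elements of ts1
    # and the first j elements of ts2.
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed rule: strict bound on values, inclusive bound on indexes.
            is_match = (
                abs(ts1[i - 1] - ts2[j - 1]) < epsilon
                and abs(i - j) <= delta
            )
            table[i][j] = max(
                table[i - 1][j],                      # skip an element of ts1
                table[i][j - 1],                      # skip an element of ts2
                table[i - 1][j - 1] + int(is_match),  # pair the two elements
            )
    return table[n][m]


ts1 = [1, 2, 2, 9, 14, 11, 19, 18]
ts2 = [2, 1, 4, 7, 10, 15, 12, 8, 17]

length = lcss_length(ts1, ts2, epsilon=1.1, delta=1)
measure = length / min(len(ts1), len(ts2))
print(length, measure)  # prints: 6 0.75
```

The table has one extra row and column of zeros so that the empty prefix needs no special case. The sketch runs in time proportional to the product of the two lengths and ignores the delta window as a way to prune work.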

Continuous LCSS

Continuous LCSS handles time series that have continuous time indexes (timestamps). It's particularly useful when the two series are sampled irregularly and contain many gaps.

In the following example, elements are matched if they have a value difference smaller than 0.9, and their timestamps are at most 1 minute apart.

import matplotlib.pyplot as plt
import pandas as pd
from lcsspy.lcss import continuous_lcss

ts1 = pd.Series(
    [12.4, 13.7, 15.8, 8.7],
    index=pd.DatetimeIndex(
        [
            "2023-11-17 08:42:23",
            "2023-11-17 08:43:35",
            "2023-11-17 08:45:06",
            "2023-11-17 08:50:23",
        ]
    ),
)

ts2 = pd.Series(
    [13.2, 13.0, 19.0, 9.0, 9.2],
    index=pd.DatetimeIndex(
        [
            "2023-11-17 08:42:39",
            "2023-11-17 08:44:02",
            "2023-11-17 08:45:32",
            "2023-11-17 08:49:37",
            "2023-11-17 08:51:12",
        ]
    ),
)

result = continuous_lcss(
    ts1=ts1, ts2=ts2, epsilon=0.9, delta=pd.Timedelta(minutes=1), plot=True
)

print(result.lcss_measure)
plt.show()

The LCSS measure equals $3/4$, and plots analogous to those in the discrete LCSS example are displayed.
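The value $3/4$ can be checked by hand with a small sketch that replaces the index window with a time window. It uses only the standard library, so each series is a list of (timestamp, value) pairs with plain `datetime` objects rather than a `pandas.Series`. The names `continuous_lcss_length` and `at` are hypothetical, and the matching rule (value difference strictly below epsilon, timestamps at most delta apart) is an assumption, not lcsspy's actual implementation.

```python
from datetime import datetime, timedelta


def continuous_lcss_length(series1, series2, epsilon, delta):
    """LCSS length for two lists of (timestamp, value) pairs."""
    n, m = len(series1), len(series2)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        t1, v1 = series1[i - 1]
        for j in range(1, m + 1):
            t2, v2 = series2[j - 1]
            # Assumed rule: close in value and close in time.
            is_match = abs(v1 - v2) < epsilon and abs(t1 - t2) <= delta
            table[i][j] = max(
                table[i - 1][j],
                table[i][j - 1],
                table[i - 1][j - 1] + int(is_match),
            )
    return table[n][m]


def at(clock):
    """Build a timestamp on 2023-11-17 from an HH:MM:SS string."""
    return datetime.fromisoformat(f"2023-11-17 {clock}")


ts1 = [(at("08:42:23"), 12.4), (at("08:43:35"), 13.7),
       (at("08:45:06"), 15.8), (at("08:50:23"), 8.7)]
ts2 = [(at("08:42:39"), 13.2), (at("08:44:02"), 13.0),
       (at("08:45:32"), 19.0), (at("08:49:37"), 9.0),
       (at("08:51:12"), 9.2)]

length = continuous_lcss_length(ts1, ts2, epsilon=0.9, delta=timedelta(minutes=1))
measure = length / min(len(ts1), len(ts2))
print(length, measure)  # prints: 3 0.75
```

In this data the third element of the first series (15.8) has no partner within 0.9 of its value, which is why only three of the four elements end up in the common subsequence.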

Continuous LCSS Series Plot

Continuous LCSS Sequence Plot

Refer to the documentation for more details.

Testing

This package uses the pytest framework to run tests. The provided test folder achieves 100% code coverage.

Copyright and License

All source code is Copyright (c) 2023 Francesco Lafratta.

lcsspy is licensed under the MIT License.

