Skip to main content

Distance measures for time series (Dynamic Time Warping, fast C implementation)

Project description

PyPi Version Conda Version Documentation Status DOI

Time Series Distances

Library for time series distances (e.g. Dynamic Time Warping) used in the DTAI Research Group. The library offers a pure Python implementation and a fast implementation in C. The C implementation has only Cython as a dependency. It is compatible with Numpy and Pandas and implemented such that unnecessary data copy operations are avoided.

Documentation: http://dtaidistance.readthedocs.io

Example:

from dtaidistance import dtw
import numpy as np
s1 = np.array([0.0, 0, 1, 2, 1, 0, 1, 0, 0])
s2 = np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0])
d = dtw.distance_fast(s1, s2)

Citing this work:

Wannes Meert, Kilian Hendrickx, Toon Van Craenendonck, Pieter Robberechts, Hendrik Blockeel & Jesse Davis.
DTAIDistance (Version v2). Zenodo.
http://doi.org/10.5281/zenodo.5901139

New in v2:

  • Numpy is now an optional dependency, also to compile the C library (only Cython is required).
  • Small optimizations throughout the C code to improve speed.
  • The consistent use of ssize_t instead of int allows for larger data structures on 64 bit machines and be more compatible with Numpy.
  • The parallelization is now implemented directly in C (included if OpenMP is installed).
  • The max_dist argument turned out to be similar to Silva and Batista's work on PrunedDTW [7]. The toolbox now implements a version that is equal to PrunedDTW since it prunes more partial distances. Additionally, a use_pruning argument is added to automatically set max_dist to the Euclidean distance, as suggested by Silva and Batista, to speed up the computation (a new method ub_euclidean is available).
  • Support in the C library for multi-dimensional sequences in the dtaidistance.dtw_ndim package.
  • DTW Barycenter Averaging for clustering (v2.2).
  • Subsequence search and local concurrences (v2.3).
  • Support for N-dimensional time series (v2.3.7).

Installation

$ pip install dtaidistance

or

$ conda install -c conda-forge dtaidistance

The pip installation requires Numpy as a dependency to compile Numpy-compatible C code (using Cython). However, this dependency is optional and can be removed.

The source code is available at github.com/wannesm/dtaidistance.

If you encounter any problems during compilation (e.g. the C-based implementation or OpenMP is not available), see the documentation for more options.

Usage

Dynamic Time Warping (DTW) Distance Measure

from dtaidistance import dtw
from dtaidistance import dtw_visualisation as dtwvis
import numpy as np
s1 = np.array([0., 0, 1, 2, 1, 0, 1, 0, 0, 2, 1, 0, 0])
s2 = np.array([0., 1, 2, 3, 1, 0, 0, 0, 2, 1, 0, 0, 0])
path = dtw.warping_path(s1, s2)
dtwvis.plot_warping(s1, s2, path, filename="warp.png")

Dynamic Time Warping (DTW) Example

DTW Distance Measure Between Two Series

Only the distance measure based on two sequences of numbers:

from dtaidistance import dtw
s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]
s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]
distance = dtw.distance(s1, s2)
print(distance)

The fastest version (30-300 times) uses c directly but requires an array as input (with the double type), and (optionally) also prunes computations by setting max_dist to the Euclidean upper bound:

from dtaidistance import dtw
import array
s1 = array.array('d',[0, 0, 1, 2, 1, 0, 1, 0, 0])
s2 = array.array('d',[0, 1, 2, 0, 0, 0, 0, 0, 0])
d = dtw.distance_fast(s1, s2, use_pruning=True)

Or you can use a numpy array (with dtype double or float):

from dtaidistance import dtw
import numpy as np
s1 = np.array([0, 0, 1, 2, 1, 0, 1, 0, 0], dtype=np.double)
s2 = np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0])
d = dtw.distance_fast(s1, s2, use_pruning=True)

Check the __doc__ for information about the available arguments:

print(dtw.distance.__doc__)

A number of options are foreseen to early stop some paths the dynamic programming algorithm is exploring or tune the distance measure computation:

  • window: Only allow for shifts up to this amount away from the two diagonals.
  • max_dist: Stop if the returned distance measure will be larger than this value.
  • max_step: Do not allow steps larger than this value.
  • max_length_diff: Return infinity if difference in length of two series is larger.
  • penalty: Penalty to add if compression or expansion is applied (on top of the distance).
  • psi: Psi relaxation to ignore begin and/or end of sequences (for cylical sequences) [2].
  • use_pruning: Prune computations based on the Euclidean upper bound.

DTW Distance Measure all warping paths

If, next to the distance, you also want the full matrix to see all possible warping paths:

from dtaidistance import dtw
s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]
s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]
distance, paths = dtw.warping_paths(s1, s2)
print(distance)
print(paths)

The matrix with all warping paths can be visualised as follows:

from dtaidistance import dtw
from dtaidistance import dtw_visualisation as dtwvis
import random
import numpy as np
x = np.arange(0, 20, .5)
s1 = np.sin(x)
s2 = np.sin(x - 1)
random.seed(1)
for idx in range(len(s2)):
    if random.random() < 0.05:
        s2[idx] += (random.random() - 0.5) / 2
d, paths = dtw.warping_paths(s1, s2, window=25, psi=2)
best_path = dtw.best_path(paths)
dtwvis.plot_warpingpaths(s1, s2, paths, best_path)

DTW Example

Notice the psi parameter that relaxes the matching at the beginning and end. In this example this results in a perfect match even though the sine waves are slightly shifted.

DTW Distance Measures Between Set of Series

To compute the DTW distance measures between all sequences in a list of sequences, use the method dtw.distance_matrix. You can set variables to use more or less c code (use_c and use_nogil) and parallel or serial execution (parallel).

The distance_matrix method expects a list of lists/arrays:

from dtaidistance import dtw
import numpy as np
series = [
    np.array([0, 0, 1, 2, 1, 0, 1, 0, 0], dtype=np.double),
    np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0]),
    np.array([0.0, 0, 1, 2, 1, 0, 0, 0])]
ds = dtw.distance_matrix_fast(series)

or a matrix (in case all series have the same length):

from dtaidistance import dtw
import numpy as np
series = np.matrix([
    [0.0, 0, 1, 2, 1, 0, 1, 0, 0],
    [0.0, 1, 2, 0, 0, 0, 0, 0, 0],
    [0.0, 0, 1, 2, 1, 0, 0, 0, 0]])
ds = dtw.distance_matrix_fast(series)

DTW Distance Measures Between Set of Series, limited to block

You can instruct the computation to only fill part of the distance measures matrix. For example to distribute the computations over multiple nodes, or to only compare source series to target series.

from dtaidistance import dtw
import numpy as np
series = np.matrix([
     [0., 0, 1, 2, 1, 0, 1, 0, 0],
     [0., 1, 2, 0, 0, 0, 0, 0, 0],
     [1., 2, 0, 0, 0, 0, 0, 1, 1],
     [0., 0, 1, 2, 1, 0, 1, 0, 0],
     [0., 1, 2, 0, 0, 0, 0, 0, 0],
     [1., 2, 0, 0, 0, 0, 0, 1, 1]])
ds = dtw.distance_matrix_fast(series, block=((1, 4), (3, 5)))

The output in this case will be:

#  0     1    2    3       4       5
[[ inf   inf  inf     inf     inf  inf]    # 0
 [ inf   inf  inf  1.4142  0.0000  inf]    # 1
 [ inf   inf  inf  2.2360  1.7320  inf]    # 2
 [ inf   inf  inf     inf  1.4142  inf]    # 3
 [ inf   inf  inf     inf     inf  inf]    # 4
 [ inf   inf  inf     inf     inf  inf]]   # 5

Clustering

A distance matrix can be used for time series clustering. You can use existing methods such as scipy.cluster.hierarchy.linkage or one of two included clustering methods (the latter is a wrapper for the SciPy linkage method).

from dtaidistance import clustering
# Custom Hierarchical clustering
model1 = clustering.Hierarchical(dtw.distance_matrix_fast, {})
cluster_idx = model1.fit(series)
# Augment Hierarchical object to keep track of the full tree
model2 = clustering.HierarchicalTree(model1)
cluster_idx = model2.fit(series)
# SciPy linkage clustering
model3 = clustering.LinkageTree(dtw.distance_matrix_fast, {})
cluster_idx = model3.fit(series)

For models that keep track of the full clustering tree (HierarchicalTree or LinkageTree), the tree can be visualised:

model.plot("myplot.png")

Dynamic Time Warping (DTW) hierarchical clusteringt

Dependencies

Optional:

Development:

Contact

References

  1. T. K. Vintsyuk, Speech discrimination by dynamic programming. Kibernetika, 4:81–88, 1968.
  2. H. Sakoe and S. Chiba, Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, 26(1):43–49, 1978.
  3. C. S. Myers and L. R. Rabiner, A comparative study of several dynamic time-warping algorithms for connected-word recognition. The Bell System Technical Journal, 60(7):1389–1409, Sept 1981.
  4. Mueen, A and Keogh, E, Extracting Optimal Performance from Dynamic Time Warping, Tutorial, KDD 2016
  5. D. F. Silva, G. E. A. P. A. Batista, and E. Keogh. On the effect of endpoints on dynamic time warping, In SIGKDD Workshop on Mining and Learning from Time Series, II. Association for Computing Machinery-ACM, 2016.
  6. C. Yanping, K. Eamonn, H. Bing, B. Nurjahan, B. Anthony, M. Abdullah and B. Gustavo. The UCR Time Series Classification Archive, 2015.
  7. D. F. Silva and G. E. Batista. Speeding up all-pairwise dynamic time warping matrix calculation, In Proceedings of the 2016 SIAM International Conference on Data Mining, pages 837–845. SIAM, 2016.

License

DTAI distance code.

Copyright 2016-2022 KU Leuven, DTAI Research Group

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dtaidistance-2.3.11.tar.gz (9.0 MB view details)

Uploaded Source

Built Distributions

dtaidistance-2.3.11-cp311-cp311-macosx_14_0_arm64.whl (1.2 MB view details)

Uploaded CPython 3.11 macOS 14.0+ ARM64

dtaidistance-2.3.11-cp310-cp310-win_amd64.whl (1.1 MB view details)

Uploaded CPython 3.10 Windows x86-64

dtaidistance-2.3.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.9 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64

dtaidistance-2.3.11-cp310-cp310-macosx_11_0_x86_64.whl (1.2 MB view details)

Uploaded CPython 3.10 macOS 11.0+ x86-64

dtaidistance-2.3.11-cp39-cp39-win_amd64.whl (1.1 MB view details)

Uploaded CPython 3.9 Windows x86-64

dtaidistance-2.3.11-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.9 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.17+ x86-64

dtaidistance-2.3.11-cp39-cp39-macosx_11_0_x86_64.whl (1.2 MB view details)

Uploaded CPython 3.9 macOS 11.0+ x86-64

dtaidistance-2.3.11-cp38-cp38-win_amd64.whl (1.1 MB view details)

Uploaded CPython 3.8 Windows x86-64

dtaidistance-2.3.11-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.9 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.17+ x86-64

dtaidistance-2.3.11-cp38-cp38-macosx_11_0_x86_64.whl (1.2 MB view details)

Uploaded CPython 3.8 macOS 11.0+ x86-64

dtaidistance-2.3.11-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.8 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ x86-64

File details

Details for the file dtaidistance-2.3.11.tar.gz.

File metadata

  • Download URL: dtaidistance-2.3.11.tar.gz
  • Upload date:
  • Size: 9.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for dtaidistance-2.3.11.tar.gz
Algorithm Hash digest
SHA256 a3e471dd1ed5b22eff3f5dd4b285e6df1ba5a28be3322f51eef9e6571ea7090b
MD5 2e307831a39fca90fed4003430907d37
BLAKE2b-256 2bc1311c764d9ee62bfa0faf8979108267f9922cd0d270ab496820643477e9df

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.11-cp311-cp311-macosx_14_0_arm64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.11-cp311-cp311-macosx_14_0_arm64.whl
Algorithm Hash digest
SHA256 bbb15caee33f175c37346fa303cfb66ad1ae1bfcc6192840ba8b8b7f75c71937
MD5 d3f89e1eddef422e386c48d6b959ea96
BLAKE2b-256 dbb970a8432879181c3565b9df3a84e5be6535ef60d233533d755076490e0a8a

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.11-cp310-cp310-win_amd64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.11-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 32bf75498fec2e89de294309698ab6e4a420a21bb030628b04e5b124648c2a66
MD5 1040cdd6ab1f9735e230e6728f2c0065
BLAKE2b-256 a555e277c222c785d25bcac548b4fba020a7e3b1ba4e57378e13b186eb44de62

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 f489bcf5776303b52207807eb173bff11b8b5e7a355830c6fd80be36f007fd8b
MD5 7ec4db443a1b847a0bf2ea6a597a939c
BLAKE2b-256 1edd6ee2a37cdc4a4c15e1a4d69281ed71ae4ce662c6644d4a3e98c9e98e9d69

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.11-cp310-cp310-macosx_11_0_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.11-cp310-cp310-macosx_11_0_x86_64.whl
Algorithm Hash digest
SHA256 ae49e46a84b8ab5563b13871cdb6fa78fe05991ab11a15dde36939f5463a4ca0
MD5 7d9166852410e77da133f8aeced6c1f6
BLAKE2b-256 2825ccb00399c28f6298ec7b9ac075a231197496665484c2b721d59e9924d7cd

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.11-cp39-cp39-win_amd64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.11-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 b7b056399c31aa2d27228fc995c2a3b52425e1c5f69a6fcb03a23556cdd0870f
MD5 e03a34c783d4c641180064afa81bdefc
BLAKE2b-256 c57f21ac5145e25b939049a07f62ec6c245363c1fab19d214026c2004b1d5234

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.11-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.11-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 44f6934f7458b97411c958238806db853c246f6a0e8e056f0a412ae879921533
MD5 7fa140827f3e027a3e3c2f08d6021a09
BLAKE2b-256 4612329f63deaeb4001b889e58ca09f9dcadb0236d4e071c7f1b2cccf6ea965b

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.11-cp39-cp39-macosx_11_0_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.11-cp39-cp39-macosx_11_0_x86_64.whl
Algorithm Hash digest
SHA256 6deec549e9aa14cc222d9eb73811b291ea4501183db6d1ee8476dc2a2234b989
MD5 bdab5fd697f1ff36a4e2746c88002fcb
BLAKE2b-256 2278254592eb1da7b4311766395491233721efc8c930edfd58d0416925b2d934

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.11-cp38-cp38-win_amd64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.11-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 e1011080168078e80952c96003629ce8048dc5d451801a1d38012ee09ae90fd8
MD5 702de5a9541baa869c27d4b1be76f272
BLAKE2b-256 9d53c9294975a16458fafaeffc0d3814b0c89f06747a18a45795556d0a20f5ac

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.11-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.11-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 ed866da77b2377c5820350d767fd7d8097a84701d43c23e66b7e255bd6833c90
MD5 672f5db249a11e6b00768c777af799b7
BLAKE2b-256 c20084a1f1af04a5644b6362e782f8a53c9483d646c33bbd0ba7a6d7904ccacb

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.11-cp38-cp38-macosx_11_0_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.11-cp38-cp38-macosx_11_0_x86_64.whl
Algorithm Hash digest
SHA256 06fa79fed8430e243859afa13c80b5c6e323d962cf4fd17c7c34da91edb79b71
MD5 b46f48cabe5deeb2da23cf524d15e4a4
BLAKE2b-256 8712814f1c792e8782481181da6035581c09bf732eb8cf2e18cf7dc950dca990

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.11-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.11-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 d84e738ab3ce12f912554dc5777232547443fbd1005d53a2309726f7525c9ead
MD5 10954d712ada4ffe6fc42ee8b32e5ac3
BLAKE2b-256 a1973761bbc4c6fc39d625a3592990d7f7bd80e8b9504e50564d8149aed078dc

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page