Skip to main content

Distance measures for time series (Dynamic Time Warping, fast C implementation)

Project description

PyPi Version Conda Version Documentation Status DOI

Time Series Distances

Library for time series distances (e.g. Dynamic Time Warping) used in the DTAI Research Group. The library offers a pure Python implementation and a fast implementation in C. The C implementation has only Cython as a dependency. It is compatible with Numpy and Pandas and implemented such that unnecessary data copy operations are avoided.

Documentation: http://dtaidistance.readthedocs.io

Example:

from dtaidistance import dtw
import numpy as np
s1 = np.array([0.0, 0, 1, 2, 1, 0, 1, 0, 0])
s2 = np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0])
d = dtw.distance_fast(s1, s2)

Citing this work:

Wannes Meert, Kilian Hendrickx, Toon Van Craenendonck & Pieter Robberechts.
DTAIDistance (Version v2). Zenodo.
http://doi.org/10.5281/zenodo.5901139

New in v2:

  • Numpy is now an optional dependency, also to compile the C library (only Cython is required).
  • Small optimizations throughout the C code to improve speed.
  • The consistent use of ssize_t instead of int allows for larger data structures on 64 bit machines and be more compatible with Numpy.
  • The parallelization is now implemented directly in C (included if OpenMP is installed).
  • The max_dist argument turned out to be similar to Silva and Batista's work on PrunedDTW [7]. The toolbox now implements a version that is equal to PrunedDTW since it prunes more partial distances. Additionally, a use_pruning argument is added to automatically set max_dist to the Euclidean distance, as suggested by Silva and Batista, to speed up the computation (a new method ub_euclidean is available).
  • Support in the C library for multi-dimensional sequences in the dtaidistance.dtw_ndim package.
  • DTW Barycenter Averaging for clustering (v2.2).
  • Subsequence search and local concurrences (v2.3).
  • Support for N-dimensional time series (v2.3.7).

Installation

$ pip install dtaidistance

or

$ conda install -c conda-forge dtaidistance

The pip installation requires Numpy as a dependency to compile Numpy-compatible C code (using Cython). However, this dependency is optional and can be removed.

The source code is available at github.com/wannesm/dtaidistance.

If you encounter any problems during compilation (e.g. the C-based implementation or OpenMP is not available), see the documentation for more options.

Usage

Dynamic Time Warping (DTW) Distance Measure

from dtaidistance import dtw
from dtaidistance import dtw_visualisation as dtwvis
import numpy as np
s1 = np.array([0., 0, 1, 2, 1, 0, 1, 0, 0, 2, 1, 0, 0])
s2 = np.array([0., 1, 2, 3, 1, 0, 0, 0, 2, 1, 0, 0, 0])
path = dtw.warping_path(s1, s2)
dtwvis.plot_warping(s1, s2, path, filename="warp.png")

Dynamic Time Warping (DTW) Example

DTW Distance Measure Between Two Series

Only the distance measure based on two sequences of numbers:

from dtaidistance import dtw
s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]
s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]
distance = dtw.distance(s1, s2)
print(distance)

The fastest version (30-300 times) uses c directly but requires an array as input (with the double type), and (optionally) also prunes computations by setting max_dist to the Euclidean upper bound:

from dtaidistance import dtw
import array
s1 = array.array('d',[0, 0, 1, 2, 1, 0, 1, 0, 0])
s2 = array.array('d',[0, 1, 2, 0, 0, 0, 0, 0, 0])
d = dtw.distance_fast(s1, s2, use_pruning=True)

Or you can use a numpy array (with dtype double or float):

from dtaidistance import dtw
import numpy as np
s1 = np.array([0, 0, 1, 2, 1, 0, 1, 0, 0], dtype=np.double)
s2 = np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0])
d = dtw.distance_fast(s1, s2, use_pruning=True)

Check the __doc__ for information about the available arguments:

print(dtw.distance.__doc__)

A number of options are foreseen to early stop some paths the dynamic programming algorithm is exploring or tune the distance measure computation:

  • window: Only allow for shifts up to this amount away from the two diagonals.
  • max_dist: Stop if the returned distance measure will be larger than this value.
  • max_step: Do not allow steps larger than this value.
  • max_length_diff: Return infinity if difference in length of two series is larger.
  • penalty: Penalty to add if compression or expansion is applied (on top of the distance).
  • psi: Psi relaxation to ignore begin and/or end of sequences (for cylical sequences) [2].
  • use_pruning: Prune computations based on the Euclidean upper bound.

DTW Distance Measure all warping paths

If, next to the distance, you also want the full matrix to see all possible warping paths:

from dtaidistance import dtw
s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]
s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]
distance, paths = dtw.warping_paths(s1, s2)
print(distance)
print(paths)

The matrix with all warping paths can be visualised as follows:

from dtaidistance import dtw
from dtaidistance import dtw_visualisation as dtwvis
import random
import numpy as np
x = np.arange(0, 20, .5)
s1 = np.sin(x)
s2 = np.sin(x - 1)
random.seed(1)
for idx in range(len(s2)):
    if random.random() < 0.05:
        s2[idx] += (random.random() - 0.5) / 2
d, paths = dtw.warping_paths(s1, s2, window=25, psi=2)
best_path = dtw.best_path(paths)
dtwvis.plot_warpingpaths(s1, s2, paths, best_path)

DTW Example

Notice the psi parameter that relaxes the matching at the beginning and end. In this example this results in a perfect match even though the sine waves are slightly shifted.

DTW Distance Measures Between Set of Series

To compute the DTW distance measures between all sequences in a list of sequences, use the method dtw.distance_matrix. You can set variables to use more or less c code (use_c and use_nogil) and parallel or serial execution (parallel).

The distance_matrix method expects a list of lists/arrays:

from dtaidistance import dtw
import numpy as np
series = [
    np.array([0, 0, 1, 2, 1, 0, 1, 0, 0], dtype=np.double),
    np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0]),
    np.array([0.0, 0, 1, 2, 1, 0, 0, 0])]
ds = dtw.distance_matrix_fast(series)

or a matrix (in case all series have the same length):

from dtaidistance import dtw
import numpy as np
series = np.matrix([
    [0.0, 0, 1, 2, 1, 0, 1, 0, 0],
    [0.0, 1, 2, 0, 0, 0, 0, 0, 0],
    [0.0, 0, 1, 2, 1, 0, 0, 0, 0]])
ds = dtw.distance_matrix_fast(series)

DTW Distance Measures Between Set of Series, limited to block

You can instruct the computation to only fill part of the distance measures matrix. For example to distribute the computations over multiple nodes, or to only compare source series to target series.

from dtaidistance import dtw
import numpy as np
series = np.matrix([
     [0., 0, 1, 2, 1, 0, 1, 0, 0],
     [0., 1, 2, 0, 0, 0, 0, 0, 0],
     [1., 2, 0, 0, 0, 0, 0, 1, 1],
     [0., 0, 1, 2, 1, 0, 1, 0, 0],
     [0., 1, 2, 0, 0, 0, 0, 0, 0],
     [1., 2, 0, 0, 0, 0, 0, 1, 1]])
ds = dtw.distance_matrix_fast(series, block=((1, 4), (3, 5)))

The output in this case will be:

#  0     1    2    3       4       5
[[ inf   inf  inf     inf     inf  inf]    # 0
 [ inf   inf  inf  1.4142  0.0000  inf]    # 1
 [ inf   inf  inf  2.2360  1.7320  inf]    # 2
 [ inf   inf  inf     inf  1.4142  inf]    # 3
 [ inf   inf  inf     inf     inf  inf]    # 4
 [ inf   inf  inf     inf     inf  inf]]   # 5

Clustering

A distance matrix can be used for time series clustering. You can use existing methods such as scipy.cluster.hierarchy.linkage or one of two included clustering methods (the latter is a wrapper for the SciPy linkage method).

from dtaidistance import clustering
# Custom Hierarchical clustering
model1 = clustering.Hierarchical(dtw.distance_matrix_fast, {})
cluster_idx = model1.fit(series)
# Augment Hierarchical object to keep track of the full tree
model2 = clustering.HierarchicalTree(model1)
cluster_idx = model2.fit(series)
# SciPy linkage clustering
model3 = clustering.LinkageTree(dtw.distance_matrix_fast, {})
cluster_idx = model3.fit(series)

For models that keep track of the full clustering tree (HierarchicalTree or LinkageTree), the tree can be visualised:

model.plot("myplot.png")

Dynamic Time Warping (DTW) hierarchical clusteringt

Dependencies

Optional:

Development:

Contact

References

  1. T. K. Vintsyuk, Speech discrimination by dynamic programming. Kibernetika, 4:81–88, 1968.
  2. H. Sakoe and S. Chiba, Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, 26(1):43–49, 1978.
  3. C. S. Myers and L. R. Rabiner, A comparative study of several dynamic time-warping algorithms for connected-word recognition. The Bell System Technical Journal, 60(7):1389–1409, Sept 1981.
  4. Mueen, A and Keogh, E, Extracting Optimal Performance from Dynamic Time Warping, Tutorial, KDD 2016
  5. D. F. Silva, G. E. A. P. A. Batista, and E. Keogh. On the effect of endpoints on dynamic time warping, In SIGKDD Workshop on Mining and Learning from Time Series, II. Association for Computing Machinery-ACM, 2016.
  6. C. Yanping, K. Eamonn, H. Bing, B. Nurjahan, B. Anthony, M. Abdullah and B. Gustavo. The UCR Time Series Classification Archive, 2015.
  7. D. F. Silva and G. E. Batista. Speeding up all-pairwise dynamic time warping matrix calculation, In Proceedings of the 2016 SIAM International Conference on Data Mining, pages 837–845. SIAM, 2016.

License

DTAI distance code.

Copyright 2016-2022 KU Leuven, DTAI Research Group

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dtaidistance-2.3.10.tar.gz (847.2 kB view details)

Uploaded Source

Built Distributions

dtaidistance-2.3.10-cp310-cp310-win_amd64.whl (849.4 kB view details)

Uploaded CPython 3.10 Windows x86-64

dtaidistance-2.3.10-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.4 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64

dtaidistance-2.3.10-cp310-cp310-macosx_12_0_arm64.whl (1.0 MB view details)

Uploaded CPython 3.10 macOS 12.0+ ARM64

dtaidistance-2.3.10-cp310-cp310-macosx_10_15_x86_64.whl (947.7 kB view details)

Uploaded CPython 3.10 macOS 10.15+ x86-64

dtaidistance-2.3.10-cp39-cp39-win_amd64.whl (854.6 kB view details)

Uploaded CPython 3.9 Windows x86-64

dtaidistance-2.3.10-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.4 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.17+ x86-64

dtaidistance-2.3.10-cp39-cp39-macosx_10_15_x86_64.whl (954.6 kB view details)

Uploaded CPython 3.9 macOS 10.15+ x86-64

dtaidistance-2.3.10-cp38-cp38-win_amd64.whl (860.3 kB view details)

Uploaded CPython 3.8 Windows x86-64

dtaidistance-2.3.10-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.4 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.17+ x86-64

dtaidistance-2.3.10-cp38-cp38-macosx_10_15_x86_64.whl (940.3 kB view details)

Uploaded CPython 3.8 macOS 10.15+ x86-64

dtaidistance-2.3.10-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.3 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ x86-64

File details

Details for the file dtaidistance-2.3.10.tar.gz.

File metadata

  • Download URL: dtaidistance-2.3.10.tar.gz
  • Upload date:
  • Size: 847.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.14

File hashes

Hashes for dtaidistance-2.3.10.tar.gz
Algorithm Hash digest
SHA256 0da309e34a4a8cc47ba3c84cffa072361917bb411a4b61cf7d3ffc2d457cbff2
MD5 eb136c80a3a63084fa1f5d2e69792471
BLAKE2b-256 e71804170840514a0d135bcff8f58198e06cf51cfe9a08c38e666e350e7f3f25

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.10-cp310-cp310-win_amd64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.10-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 acc47a6fa6fabf49a1ed3265f41b00c850c8d2971a2273352a0b113fbbaf1e8d
MD5 3341708ec6a9f6f233ffb6769dad8531
BLAKE2b-256 5909da08b320f92dbbe09aa4c215c268e6c0c12c1ebb1ead9555de529d684748

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.10-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.10-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 7e3c21b28a71215ca4e794d37bf0535858f7a99fda712716f2dabb597849b7bf
MD5 e1106501e31f97bcc66965f0630da79e
BLAKE2b-256 6a4963b4ddfd179bdc7e9f85629a1eb1bd8477c645d7fbb38d4b95107f8aeaa6

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.10-cp310-cp310-macosx_12_0_arm64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.10-cp310-cp310-macosx_12_0_arm64.whl
Algorithm Hash digest
SHA256 3fb80a4ff415a96a927766a465db0640299a8942e1ca1586ef906cd811a419d7
MD5 81731ffdac7975e46e7c0e1f8fdd1aec
BLAKE2b-256 2960bb2b432f6c70035375fb1584541f104ea03f098623601d27006cee8ef7f2

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.10-cp310-cp310-macosx_10_15_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.10-cp310-cp310-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 85c052dfd13d34c619e3682846b10f7ff247e908fad8a174424d20a362e2a164
MD5 47388b4f8d49c526aa7cfa2061b82d81
BLAKE2b-256 5b33d3ee9c50f2ef35d77247bc216ef7900ea79cc9c0b39679fe1e2a8ef16ad1

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.10-cp39-cp39-win_amd64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.10-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 f7318a482e0b6f756ded9cbce13f127d1b7104e04eb744a2aafed3c997851368
MD5 6537808bc555716cc770600c6314f2e2
BLAKE2b-256 33147e68a31e44b832130fb0e7f233e08494035d9626cdc86d9810232ff3637c

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.10-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.10-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 6205737b0e7d04a1added8ae5a95d88b5e17054a814f30c0ac17c8b917311aef
MD5 b56bf3db0397fa08cd5edcdff5cee56a
BLAKE2b-256 8078c6224c498d37f052c16997c9670c26d986c68e5d92c292a20a59df33baca

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.10-cp39-cp39-macosx_10_15_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.10-cp39-cp39-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 58b8dbde8acd9b983913df1ff0caac4cd4a8d5fa9d97aa350e97186729db0c0f
MD5 57f06d072c950cb83285f964a665138d
BLAKE2b-256 ef49341d1a7980c52dbb233f7bd8a424134e1043f11918e780dd676402e154cb

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.10-cp38-cp38-win_amd64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.10-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 8bf00ee46ee305db842dfdcdb3ba932f4f4a283ee3f75eaa73fe0ab29b73fdee
MD5 fa30ecaad5518830fc643ca1658ca0bd
BLAKE2b-256 0871a2cc9942fa29fac49cd6fa839b9e0a8da7afdc1f0199dd74acea3ac5f274

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.10-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.10-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 4f1e27773d4c534160036a8a08f4009731a6c4e16a95b52403c2d67e1d288bb3
MD5 a9fea920ddf3507e8c15925758459d50
BLAKE2b-256 90f196141eb3e2a428fa1086ca06887e1ca74b3046a05b4e4e80d2a6daa17464

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.10-cp38-cp38-macosx_10_15_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.10-cp38-cp38-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 e1f96f0f4e6e4759447aa64c509a48bb38b5189631a559144869fd0f0262a906
MD5 fe5e1a9db9ab44c96f03dff855cd64d6
BLAKE2b-256 e7b16b8c3e096f93f37b688e1b64586732798ea89a391ab044fd9b7c44dabd9c

See more details on using hashes here.

File details

Details for the file dtaidistance-2.3.10-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for dtaidistance-2.3.10-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 addbe46dace5897fd7186a32ec850d39db2ce501e76c2b52bcf5b0d1048bfeaa
MD5 139111268e183754f10c53568fd0c766
BLAKE2b-256 7f41f1103a14b665d74add49fd0a342a1ea4bed2c7199c5b3ec5eadb42a55835

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page