Distance measures for time series (Dynamic Time Warping, fast C implementation)

Time Series Distances

Library for time series distances (e.g. Dynamic Time Warping) used in the DTAI Research Group. The library offers a pure Python implementation and a fast implementation in C. The C implementation has only Cython as a dependency. It is compatible with Numpy and Pandas and implemented to avoid unnecessary data copy operations.

Documentation: http://dtaidistance.readthedocs.io

Citing this work:

Wannes Meert, Kilian Hendrickx, & Toon Van Craenendonck. wannesm/dtaidistance (Version v2.0.0). Zenodo. http://doi.org/10.5281/zenodo.3981067

New in v2:

  • Numpy is now an optional dependency, including for compiling the C library (only Cython is required).
  • Small optimizations throughout the C code to improve speed.
  • The consistent use of ssize_t instead of int allows for larger data structures on 64-bit machines and is more compatible with Numpy.
  • The parallelization is now implemented directly in C (included if OpenMP is installed).
  • The max_dist argument turned out to be similar to Silva and Batista's work on PrunedDTW [7]. The toolbox now implements a version that is equal to PrunedDTW since it prunes more partial distances. Additionally, a use_pruning argument is added to automatically set max_dist to the Euclidean distance, as suggested by Silva and Batista, to speed up the computation (a new method ub_euclidean is available).
  • Support in the C library for multi-dimensional sequences in the dtaidistance.dtw_ndim package.
  • DTW Barycenter Averaging for clustering (v2.2).
  • Subsequence search and local concurrences (v2.3).
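
The Euclidean upper bound mentioned in the max_dist / use_pruning item above can be illustrated with a small pure-Python sketch. This is not the library's ub_euclidean code: the name ub_euclidean_sketch is hypothetical, and the handling of series of different lengths (comparing the tail of the longer series to the last element of the shorter one) is an assumption made for illustration. Both series are assumed to be non-empty.

```python
import math

def ub_euclidean_sketch(s1, s2):
    """Euclidean-style upper bound on the DTW distance (illustrative only)."""
    n, m = len(s1), len(s2)
    shared = min(n, m)
    # Cost of the diagonal alignment over the shared prefix.
    total = sum((s1[i] - s2[i]) ** 2 for i in range(shared))
    # Assumption: the tail of the longer series is matched to the last
    # element of the shorter one, which is still a valid warping path.
    if n > m:
        total += sum((s1[i] - s2[-1]) ** 2 for i in range(shared, n))
    elif m > n:
        total += sum((s1[-1] - s2[j]) ** 2 for j in range(shared, m))
    return math.sqrt(total)

s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]
s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]
print(ub_euclidean_sketch(s1, s2))
```

Because the diagonal alignment is itself a valid warping path, the DTW distance can never exceed this value, which is why it is a safe choice for max_dist.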

Installation

$ pip install dtaidistance

or

$ conda install -c conda-forge dtaidistance

If the C-based version is not available, see the documentation for alternative installation options. If OpenMP is not available on your system, add the --noopenmp global option.

The library does not depend on Numpy, but some additional functionality is provided when Numpy is available. To make sure Numpy is installed as well, use:

$ pip install dtaidistance[all]

The source code is available at github.com/wannesm/dtaidistance.

If you encounter any problems during compilation, see the documentation for more options.

Usage

Dynamic Time Warping (DTW) Distance Measure

from dtaidistance import dtw
from dtaidistance import dtw_visualisation as dtwvis
import numpy as np
s1 = np.array([0., 0, 1, 2, 1, 0, 1, 0, 0, 2, 1, 0, 0])
s2 = np.array([0., 1, 2, 3, 1, 0, 0, 0, 2, 1, 0, 0, 0])
path = dtw.warping_path(s1, s2)
dtwvis.plot_warping(s1, s2, path, filename="warp.png")

DTW Distance Measure Between Two Series

To compute only the distance measure between two sequences of numbers:

from dtaidistance import dtw
s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]
s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]
distance = dtw.distance(s1, s2)
print(distance)
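
To make clear what such a distance represents, here is a minimal pure-Python sketch of the classic DTW recurrence. It is not the library's implementation. It assumes squared point-wise differences accumulated along the cheapest path, followed by a square root; the function name dtw_distance_sketch is hypothetical.

```python
import math

def dtw_distance_sketch(s1, s2):
    """Plain DTW: squared point-wise costs, square root at the end."""
    inf = float("inf")
    m = len(s2)
    # prev[j] is the accumulated cost of aligning the processed
    # prefix of s1 with the first j elements of s2.
    prev = [0.0] + [inf] * m
    for a in s1:
        curr = [inf] * (m + 1)
        for j, b in enumerate(s2, start=1):
            cost = (a - b) ** 2
            # Extend the cheapest of: diagonal step, step in s1, step in s2.
            curr[j] = cost + min(prev[j - 1], prev[j], curr[j - 1])
        prev = curr
    return math.sqrt(prev[m])

s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]
s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]
print(dtw_distance_sketch(s1, s2))  # about 1.4142 (the square root of 2)
```

Only two rows of the dynamic programming table are kept, so memory use is linear in the length of the second series.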

The fastest version (30-300 times faster) uses C directly but requires an array as input (with the double type), and (optionally) also prunes computations by setting max_dist to the Euclidean upper bound:

from dtaidistance import dtw
import array
s1 = array.array('d',[0, 0, 1, 2, 1, 0, 1, 0, 0])
s2 = array.array('d',[0, 1, 2, 0, 0, 0, 0, 0, 0])
d = dtw.distance_fast(s1, s2, use_pruning=True)

Or you can use a numpy array (with dtype double or float):

from dtaidistance import dtw
import numpy as np
s1 = np.array([0, 0, 1, 2, 1, 0, 1, 0, 0], dtype=np.double)
s2 = np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0])
d = dtw.distance_fast(s1, s2, use_pruning=True)
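
If your data starts out as plain Python lists, it has to be copied into a buffer of doubles first. A small sketch using only the standard library is shown below; the helper name as_double_array is hypothetical and not part of dtaidistance.

```python
import array

def as_double_array(values):
    """Copy any iterable of numbers into a contiguous buffer of C doubles."""
    return array.array("d", values)

s1 = as_double_array([0, 0, 1, 2, 1, 0, 1, 0, 0])
# The type code 'd' confirms that the buffer holds C doubles.
print(s1.typecode, len(s1))
```

Integers are converted to floats during the copy, so mixed int/float input is fine.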

Check the __doc__ for information about the available arguments:

print(dtw.distance.__doc__)

Several options are available to stop exploring some paths of the dynamic programming algorithm early, or to tune the distance measure computation:

  • window: Only allow for shifts up to this amount away from the two diagonals.
  • max_dist: Stop if the returned distance measure will be larger than this value.
  • max_step: Do not allow steps larger than this value.
  • max_length_diff: Return infinity if difference in length of two series is larger.
  • penalty: Penalty to add if compression or expansion is applied (on top of the distance).
  • psi: Psi relaxation to ignore begin and/or end of sequences (for cyclical sequences) [5].
  • use_pruning: Prune computations based on the Euclidean upper bound.
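
As an illustration of how a window constraint limits the search, the following pure-Python sketch restricts DTW to a band around the diagonal. It is not the library's code: the function name dtw_banded and the parameter band are hypothetical, and the library's window argument may count the band width differently. Here a cell (i, j) is allowed only if |i - j| <= band.

```python
import math

def dtw_banded(s1, s2, band=None):
    """DTW restricted to cells with |i - j| <= band (illustrative only)."""
    inf = float("inf")
    n, m = len(s1), len(s2)
    if band is None:
        band = max(n, m)  # wide enough to impose no restriction
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        # Only visit columns that lie inside the band around the diagonal.
        for j in range(max(1, i - band), min(m, i + band) + 1):
            cost = (s1[i - 1] - s2[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return math.sqrt(D[n][m])

s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]
s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]
print(dtw_banded(s1, s2, band=0))  # diagonal only: the Euclidean distance
print(dtw_banded(s1, s2))          # no restriction: plain DTW
```

A narrower band visits fewer cells and can only increase the resulting distance. In this sketch, a band narrower than the length difference of the two series yields infinity.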

DTW Distance Measure and All Warping Paths

If, next to the distance, you also want the full matrix to see all possible warping paths:

from dtaidistance import dtw
s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]
s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]
distance, paths = dtw.warping_paths(s1, s2)
print(distance)
print(paths)
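
The idea behind such a matrix, and how a best path can be read from it, can be sketched in pure Python. This is an illustration under assumptions and not the library's code: the names warping_paths_sketch and best_path_sketch are hypothetical, the matrix here stores accumulated squared costs with an extra first row and column, and both series are assumed to be non-empty.

```python
import math

def warping_paths_sketch(s1, s2):
    """Return the DTW distance and the full accumulated-cost matrix."""
    if not s1 or not s2:
        raise ValueError("both series must be non-empty")
    inf = float("inf")
    n, m = len(s1), len(s2)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (s1[i - 1] - s2[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    return math.sqrt(D[n][m]), D

def best_path_sketch(D):
    """Backtrack from the last cell to the first, following cheapest predecessors."""
    i, j = len(D) - 1, len(D[0]) - 1
    path = [(i - 1, j - 1)]  # report 0-based indices into the series
    while (i, j) != (1, 1):
        candidates = [
            (D[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (D[i - 1][j], i - 1, j),          # advance in s1 only
            (D[i][j - 1], i, j - 1),          # advance in s2 only
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return path

s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]
s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]
distance, D = warping_paths_sketch(s1, s2)
path = best_path_sketch(D)
print(distance)
print(path)
```

Each pair in the path links an index in the first series to an index in the second, and the squared costs along the path sum to the last cell of the matrix.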

The matrix with all warping paths can be visualised as follows:

from dtaidistance import dtw
from dtaidistance import dtw_visualisation as dtwvis
import random
import numpy as np
x = np.arange(0, 20, .5)
s1 = np.sin(x)
s2 = np.sin(x - 1)
random.seed(1)
for idx in range(len(s2)):
    if random.random() < 0.05:
        s2[idx] += (random.random() - 0.5) / 2
d, paths = dtw.warping_paths(s1, s2, window=25, psi=2)
best_path = dtw.best_path(paths)
dtwvis.plot_warpingpaths(s1, s2, paths, best_path)

Notice the psi parameter that relaxes the matching at the beginning and end. In this example this results in a perfect match even though the sine waves are slightly shifted.
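
The effect of such a relaxation can be sketched in pure Python as follows. This is an illustration under assumptions, not the library's code: the function name dtw_psi is hypothetical, and the sketch lets up to psi elements be skipped for free at the start and at the end of either series. It assumes psi is small compared to the series lengths.

```python
import math

def dtw_psi(s1, s2, psi=0):
    """DTW with relaxed endpoints (illustrative only)."""
    inf = float("inf")
    n, m = len(s1), len(s2)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    # Relaxed start: skipping up to psi leading elements costs nothing.
    for i in range(min(psi, n) + 1):
        D[i][0] = 0.0
    for j in range(min(psi, m) + 1):
        D[0][j] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (s1[i - 1] - s2[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])
    # Relaxed end: the path may stop up to psi elements before either end.
    endings = [D[n][m - k] for k in range(min(psi, m) + 1)]
    endings += [D[n - k][m] for k in range(min(psi, n) + 1)]
    return math.sqrt(min(endings))

a = [1, 2, 1, 0, 1, 2]
b = [2, 1, 0, 1, 2, 1]  # the same pattern, shifted by one sample
print(dtw_psi(a, b, psi=0))  # endpoints are forced to match
print(dtw_psi(a, b, psi=1))  # one sample may be ignored at each end
```

With psi=0 the mismatched endpoints contribute to the distance; with psi=1 the shifted copies align exactly and the distance drops to zero.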

DTW Distance Measures Between Set of Series

To compute the DTW distance measures between all sequences in a list of sequences, use the method dtw.distance_matrix. You can set arguments to use more or less C code (use_c and use_nogil) and to choose parallel or serial execution (parallel).

The distance_matrix method expects a list of lists/arrays:

from dtaidistance import dtw
import numpy as np
series = [
    np.array([0, 0, 1, 2, 1, 0, 1, 0, 0], dtype=np.double),
    np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0]),
    np.array([0.0, 0, 1, 2, 1, 0, 0, 0])]
ds = dtw.distance_matrix_fast(series)

or a matrix (in case all series have the same length):

from dtaidistance import dtw
import numpy as np
series = np.array([
    [0.0, 0, 1, 2, 1, 0, 1, 0, 0],
    [0.0, 1, 2, 0, 0, 0, 0, 0, 0],
    [0.0, 0, 1, 2, 1, 0, 0, 0, 0]])
ds = dtw.distance_matrix_fast(series)

DTW Distance Measures Between Set of Series, limited to block

You can instruct the computation to fill only part of the distance measures matrix. This is useful, for example, to distribute the computations over multiple nodes, or to compare only source series to target series.

from dtaidistance import dtw
import numpy as np
series = np.array([
     [0., 0, 1, 2, 1, 0, 1, 0, 0],
     [0., 1, 2, 0, 0, 0, 0, 0, 0],
     [1., 2, 0, 0, 0, 0, 0, 1, 1],
     [0., 0, 1, 2, 1, 0, 1, 0, 0],
     [0., 1, 2, 0, 0, 0, 0, 0, 0],
     [1., 2, 0, 0, 0, 0, 0, 1, 1]])
ds = dtw.distance_matrix_fast(series, block=((1, 4), (3, 5)))

The output in this case will be:

#    0     1    2       3       4    5
[[ inf   inf  inf     inf     inf  inf]    # 0
 [ inf   inf  inf  1.4142  0.0000  inf]    # 1
 [ inf   inf  inf  2.2360  1.7320  inf]    # 2
 [ inf   inf  inf     inf  1.4142  inf]    # 3
 [ inf   inf  inf     inf     inf  inf]    # 4
 [ inf   inf  inf     inf     inf  inf]]   # 5
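
The pattern in this output can be reproduced with a small pure-Python sketch. It is not the library's code: the names dtw_sketch and distance_matrix_sketch are hypothetical, and the sketch assumes that block gives half-open (start, end) ranges for rows and columns and that only entries above the diagonal are computed.

```python
import math

def dtw_sketch(s1, s2):
    """Plain DTW with squared costs and a final square root."""
    inf = float("inf")
    m = len(s2)
    prev = [0.0] + [inf] * m
    for a in s1:
        curr = [inf] * (m + 1)
        for j, b in enumerate(s2, start=1):
            curr[j] = (a - b) ** 2 + min(prev[j - 1], prev[j], curr[j - 1])
        prev = curr
    return math.sqrt(prev[m])

def distance_matrix_sketch(series, block=None):
    """Fill the upper triangle of the pairwise matrix, optionally within a block."""
    n = len(series)
    if block is None:
        block = ((0, n), (0, n))
    (r0, r1), (c0, c1) = block
    dists = [[float("inf")] * n for _ in range(n)]
    for r in range(r0, r1):
        # Skip the diagonal and everything below it.
        for c in range(max(r + 1, c0), c1):
            dists[r][c] = dtw_sketch(series[r], series[c])
    return dists

series = [
    [0., 0, 1, 2, 1, 0, 1, 0, 0],
    [0., 1, 2, 0, 0, 0, 0, 0, 0],
    [1., 2, 0, 0, 0, 0, 0, 1, 1],
    [0., 0, 1, 2, 1, 0, 1, 0, 0],
    [0., 1, 2, 0, 0, 0, 0, 0, 0],
    [1., 2, 0, 0, 0, 0, 0, 1, 1]]
ds = distance_matrix_sketch(series, block=((1, 4), (3, 5)))
for row in ds:
    print(row)
```

Splitting the full matrix into several such blocks allows each block to be computed on a different node and merged afterwards.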

Clustering

A distance matrix can be used for time series clustering. You can use existing methods such as scipy.cluster.hierarchy.linkage or one of two included clustering methods (the latter is a wrapper for the SciPy linkage method).

from dtaidistance import dtw
from dtaidistance import clustering
# Custom Hierarchical clustering
model1 = clustering.Hierarchical(dtw.distance_matrix_fast, {})
cluster_idx = model1.fit(series)
# Augment Hierarchical object to keep track of the full tree
model2 = clustering.HierarchicalTree(model1)
cluster_idx = model2.fit(series)
# SciPy linkage clustering
model3 = clustering.LinkageTree(dtw.distance_matrix_fast, {})
cluster_idx = model3.fit(series)

For models that keep track of the full clustering tree (HierarchicalTree or LinkageTree), the tree can be visualised:

model2.plot("myplot.png")
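
To illustrate how a distance matrix drives hierarchical clustering, here is a minimal single-linkage sketch in pure Python. It is not how clustering.Hierarchical or clustering.LinkageTree are implemented: the function name single_linkage, the stopping rule n_clusters, and the example matrix are all made up for illustration. The matrix is assumed to be upper triangular, with infinity elsewhere.

```python
def single_linkage(dists, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    def d(a, b):
        # The matrix is upper triangular: look up with the smaller index first.
        return dists[min(a, b)][max(a, b)]

    clusters = [{i} for i in range(len(dists))]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                link = min(d(a, b) for a in clusters[x] for b in clusters[y])
                if best is None or link < best[0]:
                    best = (link, x, y)
        _, x, y = best
        clusters[x] |= clusters[y]
        del clusters[y]
    return clusters

inf = float("inf")
# Hypothetical distances between four series.
dists = [
    [inf, 0.1, 5.0, 6.0],
    [inf, inf, 5.5, 6.5],
    [inf, inf, inf, 0.2],
    [inf, inf, inf, inf]]
print(single_linkage(dists, n_clusters=2))
```

Recording which clusters are merged at each step, instead of only the final grouping, gives the full tree that the tree-based models can plot.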

References

  1. T. K. Vintsyuk, Speech discrimination by dynamic programming. Kibernetika, 4:81–88, 1968.
  2. H. Sakoe and S. Chiba, Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, 26(1):43–49, 1978.
  3. C. S. Myers and L. R. Rabiner, A comparative study of several dynamic time-warping algorithms for connected-word recognition. The Bell System Technical Journal, 60(7):1389–1409, Sept 1981.
  4. A. Mueen and E. Keogh, Extracting optimal performance from dynamic time warping. Tutorial, KDD 2016.
  5. D. F. Silva, G. E. A. P. A. Batista, and E. Keogh, On the effect of endpoints on dynamic time warping. In SIGKDD Workshop on Mining and Learning from Time Series, II. Association for Computing Machinery (ACM), 2016.
  6. Y. Chen, E. Keogh, B. Hu, N. Begum, A. Bagnall, A. Mueen and G. Batista, The UCR Time Series Classification Archive, 2015.
  7. D. F. Silva and G. E. Batista, Speeding up all-pairwise dynamic time warping matrix calculation. In Proceedings of the 2016 SIAM International Conference on Data Mining, pages 837–845. SIAM, 2016.

License

DTAI distance code.

Copyright 2016-2021 KU Leuven, DTAI Research Group

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
