Discovering Leitmotifs in Multidimensional Time Series
Project description
Discovering Leitmotifs in Multidimensional Time Series
This page was built in support of our paper "Discovering Leitmotifs in Multidimensional Time Series" by Patrick Schäfer and Ulf Leser.
A leitmotif is a recurring theme in literature, movies or music that carries symbolic significance for the piece it is contained in. When this piece can be represented as a multi-dimensional time series (MDTS), such as acoustic or visual observations, finding a leitmotif is equivalent to the pattern discovery problem, which is an unsupervised and complex problem in time series analytics. Compared to the univariate case, it carries additional complexity because patterns typically do not occur in all dimensions but only in a few - which are, however, unknown and must be detected by the method itself. In this paper, we present the novel, efficient and highly effective leitmotif discovery algorithm LAMA for MDTS. LAMA rests on two core principals: (a) a leitmotif manifests solely given a yet unknown number of sub-dimensions - neither too few, nor too many, and (b) the set of sub-dimensions are not independent from the best pattern found therein, necessitating both problems to be approached in a joint manner. In contrast to most previous methods, LAMA tackles both problems jointly - instead of first selecting dimensions (or leitmotifs) and then finding the best leitmotifs (or dimensions).
Supporting Material
tests
: Please see the python tests for use casesnotebooks
: Please see the Jupyter Notebooks for use casescsvs
: The results of the scalability experimentsleitmotifs
: Code implementing multidimensonal leitmotif discovery using LAMAdatasets
: Use cases in the paper
Mining Leitmotifs - Use Case
A leitmotif (leading motif) is a recurring theme or motif that carries symbolic significance in various forms of art, particularly literature, movies, and music. The distinct feature of any leitmotif is that humans associate them to meaning, which enhances narrative cohesion and establishes emotional connections with the audience. The use of (leit)motifs thus eases perception, interpretation, and identification with the underlying narrative. A genre that often uses leitmotifs are soundtracks, for instance in the compositions of Hans Zimmer or Howard Shore. The above figure shows a suite from The Shire with 14 channels arranged by Howard Shore for Lord of the Rings. The suite opens and ends with the Hobbits' leitmotif, which is played by a solo tin whistle, and manifests in a distinct pattern in several, but not all channels of the piece.
You may listen to the song on youtube - starting from second ~6, the same melody (leitmotif) is played by the tin whistle twice, and the this theme is repeated after 2 minutes into the song:
The result of leitmotif discovery is shown next:
Our LAMA (in brown) is the only method to correctly identify 4 occurrences within the leitmotif using a distinctive subset of channels. Other than EMD*, LAMA's occurrences show high pairwise similarity, too.
Installation
The easiest is to use pip to install leitmotif.
a) Install using pip
pip install leitmotif
You can also install the project from source.
b) Build from Source
First, download the repository.
git clone https://github.com/patrickzib/leitmotifs.git
Change into the directory and build the package from source.
pip install .
Usage of LAMA
The three hyper-parameters of LAMA are:
- n_dims : Number of subdimensions to use
- k_max : The largest expected number of repeats. LAMA will search from to for motif sets
- motif_length_range: The range of lengths to test
LAMA has a simple OO-API.
ml = LAMA(
ds_name, # Name of the dataset
series, # Multidimensional time series
distance, # Distance measure used, default: z-normed ED
n_dims, # Number of sub-dimensions to use
n_jobs, # number of parallel jobs
)
The result will look like this, indicating the found motif set to the right, its dimensions and the k locations to the bottom.
LAMA has a unique feature to automatically find suitable values for the motif length and set size so, that meaningful Leitmotifs of an input TS can be found without domain knowledge. The methods for determining values for $k$ and $l$ are based on an analysis of the extent function for different input value ranges.
Learning the Leitmotif length
To learn the motif length, we may simply call:
ml.fit_motif_length(
k_max, # expected number of repeats
motif_length_range, # motif length range
plot, # Plot the results
plot_elbows, # Create an elbow plot
plot_motifsets, # Plot the found motif sets
plot_best_only # Plot only the motif sets of the optimal length. Otherwise plot all local optima in lengths
)
To do variable length motif discovery simply set plot_best_only=False
The generated plots looks like this, with good window lengths at local minima:
Learning the number of repeats
To do an elbow plot, and learn the number of repeats of the motif, we may simply call:
ml.fit_k_elbow(
k_max, # expected number of repeats
motif_length, # motif length to use
plot_elbows, # Plot the elbow plot
plot_motifsets # Plot the found motif sets
)
The generated plots looks like this, with good number of repeats at local minima:
Use Cases
Data Sets: We collected and annotated 14 challenging real-life data sets to assess the quality and scalability of the LAMA algorithm.
Use Case | Category | Length | Dim. | GT |
---|---|---|---|---|
Charleston | Motion Capture | 506 | 93 | 3 |
Boxing | Motion Capture | 4840 | 93 | 10 |
Swordplay | Motion Capture | 2251 | 93 | 6 |
Basketball | Motion Capture | 721 | 93 | 5 |
LOTR - The Shire | Soundtrack | 6487 | 20 | 4 |
SW - The Imperial March | Soundtrack | 8015 | 20 | 5 |
RS - Paint it black | Pop Music | 9744 | 20 | 10 |
Linkin Park - Numb | Pop Music | 8018 | 20 | 5 |
Linkin P. - What I've Done | Pop Music | 8932 | 20 | 6 |
Queen - Under Pressure | Pop Music | 9305 | 20 | 16 |
Vanilla Ice - Ice Ice Baby | Pop Music | 11693 | 20 | 20 |
Starling | Wildlife Rec. | 2839 | 20 | 4 |
Physiodata (Physical Exercises) | Wearable Sensors | 5526 | 5 | 20 |
Bitcoin Halving | Crypto/Stocks | 3591 | 5 | 3 |
Aggregated Results
Method | Precision, mean | Precision, median | Recall, mean | Recall, median |
---|---|---|---|---|
EMD* | 59.3 | 65 | 75.9 | 80 |
K-Motifs (TOP-f) | 61.1 | 70 | 70.8 | 100 |
K-Motifs (all dims) | 76.8 | 83.3 | 82.6 | 100 |
SMM | 31.8 | 26.5 | 65.4 | 95 |
mSTAMP | 53.8 | 100 | 36.7 | 20 |
mSTAMP+MDL | 46.2 | 0 | 29.0 | 0 |
LAMA | 88.7 | 100 | 95.1 | 100 |
See all results in Results Notebook.
Notebooks
-
Jupyter-Notebooks for finding subdimensional Leitmotifs in a multidimensional time series Multivariate Use Case: highlights a use case used in the paper, and shows the unique ability to learn its parameters from the data and find interesting motif sets.
-
Jupter-Notebook showcasing SMM-Results (SMM ran using Matlab): SMM-Results.
-
Jupter-Notebook showcasing using Leitmotifs for Summarization: Summarization.
-
Jupter-Notebook showcasing BitCoint-Halving Events: Bitcoin-Halving.
-
All other use cases presented in the paper can be found in the test folder
Citation
If you use this work, please cite as:
TODO
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file leitmotif-0.2.2.tar.gz
.
File metadata
- Download URL: leitmotif-0.2.2.tar.gz
- Upload date:
- Size: 72.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.13
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | bc06db4e98a0e65a074279a6da4517229c95c1b83e683020a39967f94d0b1ce0 |
|
MD5 | 682280ae4cd3c47151d65e2df5d34a3d |
|
BLAKE2b-256 | efd14e3e4df9a8d621814c575c92c22669608e43a199ef83346e3fc87bae9323 |