Statistical Significance Criteria for Multivariate Time Series Motifs

MSig

Statistical-significance testing for multidimensional time-series motifs.

MSig evaluates whether discovered motifs occur more frequently than expected by chance: it estimates each motif's pattern probability under a null model and derives a p-value for the observed number of matches. It accompanies the paper by Silva, Madeira & Henriques (2026) in Pattern Recognition Letters (see Citation).

What's new in 0.2.x: a comprehensive revision focused on correctness, accessibility, and reproducibility. See CHANGELOG.md for details, REPRODUCING_EXPERIMENTS.md for the paper-vs-code reconciliation, and CONTRIBUTING.md to get involved.

Installation

Quick Install from PyPI

pip install msig                    # Core package only
pip install "msig[experiments]"     # With experiment dependencies

Development Install with uv (Recommended)

git clone https://github.com/MiguelGarcaoSilva/msig.git
cd msig
uv sync                             # Installs all dependencies

# Optional: MOMENTI (Linux x86_64/Windows only)
uv pip install git+https://github.com/aidaLabDEI/MOMENTI-motifs

Requirements:

  • Python 3.11-3.13 (3.14+ may have LAMA compatibility issues)
  • ffmpeg (for audio experiments): brew install ffmpeg (macOS) or sudo apt-get install ffmpeg (Linux)

Quick Start

from msig import Motif, NullModel
import numpy as np

# Create sample multivariate time series (3 sensors × 100 time points)
np.random.seed(42)
t = np.linspace(0, 10, 100)
sensor1 = 10 + 2 * np.sin(2 * np.pi * t) + np.random.randn(100) * 0.5
sensor2 = 5 + 1.5 * np.cos(2 * np.pi * t) + np.random.randn(100) * 0.3
sensor3 = 15 + 3 * np.sin(2 * np.pi * t + np.pi/4) + np.random.randn(100) * 0.7
data = np.stack([sensor1, sensor2, sensor3])

# Create null model (assumes Gaussian distributions)
model = NullModel(data, dtypes=[float, float, float], model="gaussian_theoretical")

# Define a motif: length 10, all 3 sensors, 8 occurrences
motif_length = 10
motif_vars = np.array([0, 1, 2])  # Use all sensors
motif_pattern = data[motif_vars, 5:15]  # Extract pattern from position 5 across selected variables
delta_thresholds = np.array([0.3, 0.3, 0.3])  # Tolerance for matching

# Create motif and test significance
motif = Motif(motif_pattern, motif_vars, delta_thresholds, n_matches=8)
prob = motif.set_pattern_probability(model, vars_indep=True)
pvalue = motif.set_significance(
    max_possible_matches=100 - motif_length + 1,
    data_n_variables=3,
    idd_correction=False
)

print(f"Pattern probability: {prob:.6e}")
print(f"P-value: {pvalue:.6e}")
print(f"Significant at α=0.01? {pvalue <= 0.01}")

See the examples/ folder for more examples (simple_example.py and example.ipynb).
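
To make the two quantities printed by the Quick Start more concrete, here is a minimal standalone sketch of one way such numbers can be computed. It is not MSig's implementation: it assumes independent Gaussian values per variable and time point for the pattern probability, and a plain binomial tail for the p-value, with no correction for the number of variables. The function names `gaussian_pattern_probability` and `binomial_tail` and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def gaussian_pattern_probability(pattern, means, stds, deltas):
    """Probability that a random window matches `pattern` within +/- delta,
    assuming independent Gaussian values per variable and time point."""
    prob = 1.0
    for row, mu, sigma, delta in zip(pattern, means, stds, deltas):
        dist = NormalDist(mu, sigma)
        for x in row:
            # Mass of the interval [x - delta, x + delta] under N(mu, sigma)
            prob *= dist.cdf(x + delta) - dist.cdf(x - delta)
    return prob


def binomial_tail(n_trials, n_matches, p):
    """P(X >= n_matches) for X ~ Binomial(n_trials, p)."""
    return sum(
        math.comb(n_trials, i) * p**i * (1.0 - p) ** (n_trials - i)
        for i in range(n_matches, n_trials + 1)
    )


# Illustrative values only: 2 variables, pattern length 3
pattern = [[10.0, 10.5, 11.0], [5.0, 5.2, 5.1]]
p = gaussian_pattern_probability(
    pattern, means=[10.0, 5.0], stds=[1.5, 1.0], deltas=[0.3, 0.3]
)
# 100 time points and length 3 give 100 - 3 + 1 = 98 candidate windows
pvalue = binomial_tail(n_trials=98, n_matches=8, p=p)
print(f"Pattern probability: {p:.6e}")
print(f"P-value: {pvalue:.6e}")
```

The sketch mirrors the roles of `set_pattern_probability` and `set_significance` only loosely; for real analyses, use the `Motif` and `NullModel` classes shown above.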

Running Experiments

The repository includes case studies on three datasets (audio, population density, washing machine) with three discovery methods (STUMPY, LAMA, MOMENTI).

Validate Setup

uv run python validate_reproducibility.py

Run All Experiments

uv run python run_experiments.py --all
uv run python scripts/compare_results.py  # Compare results

Note: MOMENTI requires uv pip install git+https://github.com/aidaLabDEI/MOMENTI-motifs (Linux x86_64/Windows only)

Individual Experiments

# Audio experiments
uv run python experiments/audio/run_stumpy.py
uv run python experiments/audio/run_lama.py
uv run python experiments/audio/run_momenti.py

# Population density
uv run python experiments/populationdensity/run_stumpy.py
uv run python experiments/populationdensity/run_lama.py
uv run python experiments/populationdensity/run_momenti.py

# Washing machine
uv run python experiments/washingmachine/run_stumpy.py
uv run python experiments/washingmachine/run_lama.py
uv run python experiments/washingmachine/run_momenti.py

Experiment Runner Options

# Run specific combinations
uv run python run_experiments.py --dataset audio           # All methods on audio
uv run python run_experiments.py --method stumpy           # STUMPY on all datasets
uv run python run_experiments.py --dataset audio --method lama

# Preview what would run
uv run python run_experiments.py --all --dry-run

Understanding Results

Results are saved to results/<dataset>/<method>/:

  • summary_motifs_{method}.csv (tracked in git)
    • Aggregated statistics per motif length
    • Columns: s (length), k (dimensionality), #Matches, P (probability), p-value, significant
  • metadata.json (tracked in git)
    • Experiment parameters and environment info
    • Python/library versions, timestamps
  • table_motifs_{method}.csv (not tracked - regenerate if needed)
    • Detailed per-motif information with indices
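
As a rough illustration of working with a summary file, the sketch below filters rows flagged as significant using only the standard library. It assumes the CSV header uses exactly the column names listed above and that the `significant` column is written as `True`/`False`; both are assumptions, and the rows are made up so the snippet runs without the repository.

```python
import csv
import io

# Made-up rows using the column names listed above; real files live under
# results/<dataset>/<method>/summary_motifs_{method}.csv.
SAMPLE = """s,k,#Matches,P,p-value,significant
10,3,8,1.2e-05,3.4e-09,True
20,2,4,2.5e-02,4.1e-01,False
"""


def significant_rows(csv_file):
    """Yield rows whose `significant` column reads as true."""
    for row in csv.DictReader(csv_file):
        if row["significant"].strip().lower() == "true":
            yield row


rows = list(significant_rows(io.StringIO(SAMPLE)))
for row in rows:
    print(row["s"], row["k"], row["#Matches"], row["p-value"])
```

To read a real file, replace `io.StringIO(SAMPLE)` with `open(path, newline="")`, where `path` points at the summary CSV.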

Key result interpretations:

  • Low p-value (< 0.05): Motif is statistically significant
  • Low P (pattern probability): Motif is rare under the null hypothesis
  • High #Matches: Motif occurs frequently
  • significant=True: Motif occurs more often than expected by chance (after FDR correction)
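
The list above mentions FDR correction without naming the procedure. As a sketch of what such a correction can look like, the block below implements the Benjamini-Hochberg step-up procedure in plain Python. Whether MSig's experiments use this exact procedure or this alpha level is an assumption; the function name `benjamini_hochberg` and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    under the Benjamini-Hochberg procedure at FDR level alpha."""
    m = len(pvalues)
    # Indices of the p-values in ascending order
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis ranked at or below max_rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Illustrative p-values only
flags = benjamini_hochberg([0.01, 0.04, 0.03, 0.5], alpha=0.05)
print(flags)
```

Because the procedure is step-up, a motif with a raw p-value below 0.05 can still end up with significant=False once the other tested motifs are taken into account.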

Comparing Results

uv run python scripts/compare_results.py                    # Full comparison
uv run python scripts/compare_results.py --dataset audio    # Methods on audio
uv run python scripts/compare_results.py --method stumpy    # STUMPY across datasets

Datasets

  1. Audio: 12 MFCC features from imblue.mp3 - musical patterns (beats, measures, phrases)
  2. Population Density: 3 variables (Terminals, Roaming, Calls) - daily urban mobility patterns
  3. Washing Machine: 7 sensor variables - operating mode patterns

Testing

uv run python validate_reproducibility.py  # Validate environment
uv run python -m pytest tests/ -v          # Run unit tests

Citation

@article{silva2026and,
  title={On Why and How Statistical Significance Criteria Can Guide Multivariate Time Series Motif Analysis},
  author={Silva, Miguel G and Madeira, Sara C and Henriques, Rui},
  journal={Pattern Recognition Letters},
  year={2026},
  publisher={Elsevier}
}

License

MIT License - see LICENSE file.
