Pure-numpy Matrix Profile (motif and anomaly discovery in time series). No Numba required.

fastmatrix-profile

Pure-numpy Matrix Profile for time-series motif and anomaly detection. No Numba. No C extensions. No GPU. Just numpy + scipy.

Status: v0.1.0 — 26/26 tests passing, output matches STUMPY, NYC Taxi tutorial recovers 4/5 documented anomalies.

What it does

Given a 1D time series T and a window length m, computes the Matrix Profile P — a 1D summary where:

  • argmin(P) points to a repeated pattern (motif)
  • argmax(P) points to the most unusual subsequence (anomaly / discord)

One call, no labels, and nothing to tune beyond m.
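To make the definition concrete, here is a brute-force reference sketch in plain numpy. It is not the library's implementation: naive_matrix_profile is a hypothetical helper, the exclusion zone of m // 2 around each window is an assumption, and windows with zero variance are not handled. It plants the same shape twice in white noise and checks that argmin(P) lands on one of the two copies.

```python
import numpy as np

def naive_matrix_profile(T, m, exclusion=None):
    """Brute-force O(L^2 m) matrix profile (hypothetical reference helper)."""
    T = np.asarray(T, dtype=np.float64)
    L = T.size - m + 1                       # number of length-m windows
    if exclusion is None:
        exclusion = max(1, m // 2)           # assumed trivial-match half-width
    W = np.lib.stride_tricks.sliding_window_view(T, m)   # shape (L, m)
    # z-normalise each window; assumes no window is constant (std > 0)
    Wz = (W - W.mean(axis=1, keepdims=True)) / W.std(axis=1, keepdims=True)
    P = np.full(L, np.inf)
    I = np.full(L, -1, dtype=np.int64)
    for i in range(L):
        d = np.linalg.norm(Wz - Wz[i], axis=1)           # distance to every window
        lo, hi = max(0, i - exclusion), min(L, i + exclusion + 1)
        d[lo:hi] = np.inf                                # ignore trivial matches
        j = int(np.argmin(d))
        P[i], I[i] = d[j], j
    return P, I

# Synthetic demo: the same sine shape planted at positions 50 and 200.
rng = np.random.default_rng(0)
T = rng.standard_normal(300)
shape = np.sin(np.linspace(0.0, 2.0 * np.pi, 20))
T[50:70] = shape
T[200:220] = shape

P, I = naive_matrix_profile(T, m=20)
motif_idx = int(P.argmin())      # one copy of the planted shape
partner_idx = int(I[motif_idx])  # the other copy
```

The loop is quadratic in the number of windows, so this is only useful for checking results on short series.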

Install

pip install fastmatrix-profile

30-second example

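import numpy as np
from fastmatrix_profile import matrix_profile

T = np.loadtxt("your_timeseries.csv")   # 1D array of floats
P, I = matrix_profile(T, m=100)         # P: matrix profile, I: matrix profile index

anomaly_idx = int(P.argmax())           # start of the most unusual window (discord)
motif_idx   = int(P.argmin())           # start of a repeated pattern (motif)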

That's it. See examples/nyc_taxi_anomalies.ipynb for a real walk-through on the Numenta NAB NYC taxi dataset. It recovers the NYC Marathon, Christmas, New Year, and the January 2015 snowstorm with no parameter tuning beyond m.
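A single argmax gives one anomaly. To list several, one common approach is to take the maximum repeatedly and blank out a neighbourhood around each pick so that overlapping windows are not reported twice. The sketch below assumes that approach; top_k_discords and its exclusion parameter are hypothetical names and are not part of the package.

```python
import numpy as np

def top_k_discords(P, k, exclusion):
    """Return up to k start indices with the largest profile values,
    keeping picks more than `exclusion` samples apart (hypothetical helper)."""
    P = np.array(P, dtype=np.float64)        # work on a copy
    P[~np.isfinite(P)] = -np.inf             # never pick inf/NaN entries
    picks = []
    for _ in range(k):
        i = int(np.argmax(P))
        if P[i] == -np.inf:                  # nothing left to pick
            break
        picks.append(i)
        lo, hi = max(0, i - exclusion), min(P.size, i + exclusion + 1)
        P[lo:hi] = -np.inf                   # suppress the neighbourhood
    return picks

demo_profile = [1.0, 5.0, 4.9, 1.0, 1.0, 3.0, 1.0, 1.0, 4.0, 1.0]
demo_picks = top_k_discords(demo_profile, k=3, exclusion=1)
```

With a real profile, a reasonable starting point for exclusion is the window length m, so that reported anomalies do not overlap.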

Benchmarks vs STUMPY

Measured on macOS arm64, Python 3.12, scipy + Accelerate BLAS. See the interactive demo for the raw numbers and methodology.

Warm wall-time (median of 3 reps after warmup)

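n      | m   | fastmatrix-profile (default) | + dtype="float32" | STUMPY  | Winner
-------+-----+------------------------------+-------------------+---------+--------------------------
1,000  | 50  | 1.4 ms                       | 1.0 ms            | 7.4 ms  | ours, 5–7×
2,000  | 50  | 6.4 ms                       | 4 ms              | 11.1 ms | ours, 1.7–2.5×
5,000  | 100 | 39 ms                        | 19 ms             | 27 ms   | f32 beats STUMPY (1.4×)
10,000 | 100 | 136 ms                       | 72 ms             | 62 ms   | STUMPY (2.2× / 1.15×)
20,000 | 100 | 590 ms                       | 263 ms            | 175 ms  | STUMPY (3.4× / 1.5×)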

dtype="float32" is opt-in: ~2× faster at the cost of ~3e-5 absolute error in P (argmin/argmax positions unchanged in practice — verified on the NYC taxi dataset, both modes flag identical anomalies).
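To get a feel for where single-precision error comes from, the sketch below computes a small profile twice with a dense Gram matrix, once with float64 inputs to the matrix product and once with float32, and measures the largest difference. It does not call the package; gram_profile is a hypothetical helper, the m // 2 exclusion zone is an assumption, and the error you observe depends on the data and on m.

```python
import numpy as np

def gram_profile(T, m, dtype=np.float64):
    """Dense (L, L) self-join for small n (hypothetical helper)."""
    T = np.asarray(T, dtype=np.float64)
    W = np.lib.stride_tricks.sliding_window_view(T, m)
    # z-normalise in float64, then cast the GEMM inputs to the requested dtype
    Wz = (W - W.mean(axis=1, keepdims=True)) / W.std(axis=1, keepdims=True)
    Wz = Wz.astype(dtype)
    # For z-normalised windows ||z||^2 = m, so d^2 = 2m - 2 * (z_i . z_j)
    D2 = 2.0 * m - 2.0 * (Wz @ Wz.T)
    idx = np.arange(Wz.shape[0])
    D2[np.abs(idx[:, None] - idx[None, :]) <= m // 2] = np.inf   # assumed exclusion zone
    return np.sqrt(np.maximum(D2.min(axis=1), 0.0)).astype(np.float64)

rng = np.random.default_rng(1)
T = rng.standard_normal(1000)
P64 = gram_profile(T, 50, np.float64)
P32 = gram_profile(T, 50, np.float32)
max_abs_err = float(np.max(np.abs(P64 - P32)))
print(f"max |P64 - P32| = {max_abs_err:.2e}")
```

The square root amplifies rounding error when a distance is close to zero, so series with near-exact repeats are the ones to check before switching to float32.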

Cold start (fresh Python process, n=2000, m=50)

Library            | Time
-------------------+---------------------------------------
fastmatrix-profile | 0.17 s
STUMPY             | 8.6 s (49× slower, Numba JIT compile)
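A cold-start number like the ones above can be reproduced by timing a fresh interpreter from the outside, so that imports and any JIT compilation are included. The sketch below is one way to do that with the standard library; cold_start_seconds is a hypothetical helper and is not necessarily how the figures in the table were measured.

```python
import subprocess
import sys
import time

def cold_start_seconds(snippet):
    """Wall time to run `snippet` in a brand-new Python process."""
    t0 = time.perf_counter()
    subprocess.run([sys.executable, "-c", snippet], check=True)
    return time.perf_counter() - t0

# Harmless stand-in so the sketch runs anywhere.
elapsed = cold_start_seconds("import json")
print(f"cold start: {elapsed:.2f} s")
```

To time the library itself, pass a snippet such as "import numpy as np; from fastmatrix_profile import matrix_profile; matrix_profile(np.random.rand(2000), m=50)" and repeat a few times, since the first run also pays for disk caching.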

When to use this vs STUMPY

Situation                                                              | Use
-----------------------------------------------------------------------+----------------------
Long series (n > ~10k), Numba installs fine, many calls per process    | STUMPY
Lambda / edge / Docker-slim / Pyodide / locked-down corporate Python   | fastmatrix-profile
One-shot calls or interactive notebooks where cold start matters       | fastmatrix-profile
GPU available, very large n                                            | STUMPY-CUDA or SCAMP

This is not a STUMPY replacement at large n. It's the option for when you can't or don't want to install a JIT toolchain.

Algorithm

Uses a row-tiled, BLAS-batched self-join. Form the z-normalised window matrix Wz of shape (L, m), where L = n − m + 1 is the number of length-m subsequences. Then sweep row blocks of Wz: multiply each block against Wz.T in a single gemm call, mask the trivial-match band, reduce to a per-row min, and discard the tile. Peak memory is tile × L floats (tens of MB) instead of the GBs a full (L, L) Gram matrix would need. Cost: O(L² m) flops at near-peak BLAS throughput.
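The tiling idea can be sketched in a few lines of numpy. This is an illustration under stated assumptions, not the package's source: tiled_self_join, the default tile height of 256 rows, and the m // 2 exclusion band are all hypothetical choices, and constant windows (zero standard deviation) are not handled.

```python
import numpy as np

def tiled_self_join(T, m, tile=256, exclusion=None):
    """Row-tiled GEMM self-join (hypothetical sketch of the idea above)."""
    T = np.asarray(T, dtype=np.float64)
    L = T.size - m + 1
    if exclusion is None:
        exclusion = max(1, m // 2)                    # assumed band half-width
    W = np.lib.stride_tricks.sliding_window_view(T, m)
    Wz = (W - W.mean(axis=1, keepdims=True)) / W.std(axis=1, keepdims=True)
    P = np.empty(L)
    I = np.empty(L, dtype=np.int64)
    cols = np.arange(L)
    for start in range(0, L, tile):
        stop = min(start + tile, L)
        G = Wz[start:stop] @ Wz.T                     # one GEMM: (rows, L) dot products
        D2 = 2.0 * m - 2.0 * G                        # ||z||^2 = m for each window
        rows = np.arange(start, stop)[:, None]
        D2[np.abs(rows - cols[None, :]) <= exclusion] = np.inf   # mask trivial matches
        j = D2.argmin(axis=1)                         # per-row nearest neighbour
        I[start:stop] = j
        P[start:stop] = np.sqrt(np.maximum(D2[np.arange(stop - start), j], 0.0))
        # G and D2 go out of scope here, so only one tile is alive at a time
    return P, I

rng = np.random.default_rng(0)
T = rng.standard_normal(300)
P_tiled, I_tiled = tiled_self_join(T, m=16, tile=50)
P_dense, I_dense = tiled_self_join(T, m=16, tile=10_000)   # a single tile covers all rows
```

Because every tile is reduced to a per-row minimum before the next one is formed, the result does not depend on the tile height; only peak memory does.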

Falls back to the STOMP recurrence (Zhu et al., ICDM 2016) when even a minimum-size tile would exceed max_mem_mb (default 512 MB) — typically n ≳ 1M. You can call matrix_profile(very_long_series, m) without blowing up RAM; it will just be slower past the dense regime.
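The memory arithmetic behind that fallback can be worked through directly. The sketch below is a guess at the shape of the decision, not the library's logic: plan_tile_rows is a hypothetical helper, the 64-row minimum tile is an assumption, MB is taken as 2^20 bytes, and only the tile itself is counted against the budget.

```python
def plan_tile_rows(L, itemsize=8, max_mem_mb=512, min_tile_rows=64):
    """Largest tile height whose (rows, L) buffer fits the budget,
    or None when even min_tile_rows rows would not fit (hypothetical helper)."""
    budget = max_mem_mb * 1024 * 1024          # bytes, assuming MB = 2**20 bytes
    rows = budget // (L * itemsize)
    if rows < min_tile_rows:
        return None                            # caller would switch to a recurrence
    return int(min(rows, L))

# n = 20,000 and m = 100 give L = 19,901 windows.
L = 20_000 - 100 + 1
full_gram_bytes = L * L * 8                    # float64 (L, L) matrix
rows_at_default = plan_tile_rows(L)
rows_at_2m = plan_tile_rows(2_000_000)
print(full_gram_bytes / 1e9, rows_at_default, rows_at_2m)
```

For L = 19,901 the full float64 Gram matrix would take about 3.2 GB, while a tile of a few hundred rows stays in the tens of MB. Under the assumed 64-row minimum, a two-million-window series no longer fits a single tile in 512 MB, which is the situation where a recurrence-based fallback takes over.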

dtype="float32" runs the inner GEMM in single precision: ~2× faster, ~half the peak memory, ~3e-5 absolute error in P (argmin/argmax positions unchanged in practice).

mass(Q, T) is exposed as a standalone primitive (Mueen's MASS distance profile via FFT).
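For readers who have not met MASS before, the idea is to get all sliding dot products between the query and the series from one FFT-based convolution, then convert them to z-normalised distances using rolling means and standard deviations. The sketch below follows that recipe in plain numpy. mass_sketch is a hypothetical name and is not the package's mass function, and windows of T with zero variance are not handled.

```python
import numpy as np

def mass_sketch(Q, T):
    """z-normalised distance from query Q to every length-m window of T."""
    Q = np.asarray(Q, dtype=np.float64)
    T = np.asarray(T, dtype=np.float64)
    m, n = Q.size, T.size
    # Sliding dot products via FFT: convolve T with the reversed query,
    # then keep the n - m + 1 fully overlapping positions.
    N = n + m - 1
    conv = np.fft.irfft(np.fft.rfft(T, N) * np.fft.rfft(Q[::-1], N), N)
    QT = conv[m - 1:n]
    # Rolling mean and standard deviation of T from cumulative sums.
    c1 = np.concatenate(([0.0], np.cumsum(T)))
    c2 = np.concatenate(([0.0], np.cumsum(T * T)))
    mu_T = (c1[m:] - c1[:-m]) / m
    var_T = (c2[m:] - c2[:-m]) / m - mu_T ** 2
    sd_T = np.sqrt(np.maximum(var_T, 0.0))
    mu_Q, sd_Q = Q.mean(), Q.std()
    # Pearson correlation of Q with each window, then d = sqrt(2m(1 - corr)).
    corr = (QT - m * mu_Q * mu_T) / (m * sd_Q * sd_T)
    return np.sqrt(np.maximum(2.0 * m * (1.0 - corr), 0.0))

rng = np.random.default_rng(2)
T = rng.standard_normal(500)
Q = T[40:60].copy()              # query cut out of the series itself
D = mass_sketch(Q, T)
best = int(D.argmin())           # position of the closest window
```

The cost is O(n log n) per query regardless of m, which is why a distance profile like this is a useful primitive on its own.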

Limitations

  • 1D series only. No multi-dimensional matrix profile.
  • No streaming / incremental updates.
  • No GPU.
  • NaN/Inf in the input are not handled — clean them first.

These are deliberate v0.1.0 scope decisions. Open an issue if you have a concrete use case.
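Since NaN/Inf must be cleaned before the call, here is one simple way to do it: replace non-finite samples by linear interpolation between their finite neighbours. fill_nonfinite is a hypothetical helper, not part of the package, and interpolation is only one possible policy; long filled gaps become straight lines, which can show up as artificial motifs or distort nearby distances.

```python
import numpy as np

def fill_nonfinite(T):
    """Return a copy of T with NaN/Inf replaced by linear interpolation."""
    T = np.array(T, dtype=np.float64)        # copy, so the input is untouched
    bad = ~np.isfinite(T)
    if bad.all():
        raise ValueError("series has no finite values")
    idx = np.arange(T.size)
    # np.interp holds the nearest finite value at the two ends of the series
    T[bad] = np.interp(idx[bad], idx[~bad], T[~bad])
    return T

cleaned = fill_nonfinite([1.0, np.nan, 3.0, np.inf, 5.0])
edge = fill_nonfinite([np.nan, 2.0, 3.0])
```

If gaps are long relative to m, dropping the affected windows from the result afterwards may be a safer choice than trusting the interpolated values.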

Citation

@inproceedings{yeh2016matrix,
  title={Matrix profile I: all pairs similarity joins for time series},
  author={Yeh, Chin-Chia Michael and others},
  booktitle={2016 IEEE 16th International Conference on Data Mining (ICDM)},
  year={2016}
}

License

MIT
