Pure-numpy Matrix Profile (motif and anomaly discovery in time series). No Numba required.

fastmatrix-profile

Pure-numpy Matrix Profile for time-series motif and anomaly detection. No Numba. No C extensions. No GPU. Just numpy + scipy.

Status: v0.1.0 — 29/29 tests passing, output bit-identical to brute-force reference up to n=10k, NYC Taxi tutorial recovers 4/5 documented anomalies.

What it does

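Given a 1D time series T and a window length m, computes the Matrix Profile P: a 1D array with one entry per length-m window, holding the z-normalised Euclidean distance from that window to its nearest non-trivial match elsewhere in T. In this summary: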

  • argmin(P) points to a repeated pattern (motif)
  • argmax(P) points to the most unusual subsequence (anomaly / discord)

One call, no tuning, no labels.
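
To make the definition concrete, here is a minimal brute-force sketch of what P contains. It is not the package's implementation: the function name brute_force_matrix_profile and the exclusion-zone width ceil(m / 4) are assumptions chosen for illustration, and the sketch assumes no window is constant (a constant window has zero standard deviation).

```python
import numpy as np


def brute_force_matrix_profile(T, m, excl=None):
    """O(L^2 * m) reference: nearest non-trivial neighbour of every window."""
    T = np.asarray(T, dtype=np.float64)
    L = len(T) - m + 1                      # number of length-m windows
    if excl is None:
        excl = int(np.ceil(m / 4))          # assumed trivial-match half-width

    # z-normalise every window (assumes no window is constant)
    W = np.array([T[i:i + m] for i in range(L)])
    W = (W - W.mean(axis=1, keepdims=True)) / W.std(axis=1, keepdims=True)

    P = np.full(L, np.inf)
    I = np.full(L, -1, dtype=np.int64)
    for i in range(L):
        d = np.sqrt(((W - W[i]) ** 2).sum(axis=1))
        # ignore the window itself and its near-overlapping neighbours
        d[max(0, i - excl):min(L, i + excl + 1)] = np.inf
        I[i] = int(d.argmin())
        P[i] = d[I[i]]
    return P, I
```

Windows that repeat elsewhere in the series get a small P value, while a window with no close match anywhere gets a large one, which is why argmin and argmax pick out motifs and discords.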

Install

pip install fastmatrix-profile

30-second example

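import numpy as np
from fastmatrix_profile import matrix_profile

# Load a 1D series (one value per line).
T = np.loadtxt("your_timeseries.csv")

# P is the matrix profile; I is returned alongside it (by convention,
# the position of each window's nearest neighbour).
P, I = matrix_profile(T, m=100)

anomaly_idx = int(P.argmax())  # start of the most unusual window (discord)
motif_idx   = int(P.argmin())  # start of a window with a very close match (motif)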

That's it. See examples/nyc_taxi_anomalies.ipynb for a real walk-through on the Numenta NAB NYC taxi dataset — recovers the NYC Marathon, Christmas, New Year and the Jan 2015 snowstorm without any parameter tuning beyond m.
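
Recovering several anomalies, as in the taxi walk-through, needs more than a single argmax, because neighbouring windows around one event all score highly. The sketch below shows one common way to pick the top k discords without overlap. top_k_discords is a hypothetical helper written for this README, not a function exported by the package, and masking ±m positions around each pick is an assumption.

```python
import numpy as np


def top_k_discords(P, k, m):
    """Return up to k start indices with the largest P, at least m apart."""
    P = np.asarray(P, dtype=np.float64).copy()
    picks = []
    for _ in range(k):
        i = int(np.argmax(P))
        if P[i] == -np.inf:                 # everything left is masked
            break
        picks.append(i)
        # mask the pick and its overlapping neighbours
        P[max(0, i - m):i + m + 1] = -np.inf
    return picks
```

With the 30-second example above, top_k_discords(P, k=5, m=100) would give five candidate window starts to inspect.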

Benchmarks vs STUMPY

Numbers below: macOS arm64 (Apple M4) + Accelerate BLAS. Also verified on Linux x86 + OpenBLAS (GitHub Actions CI, see bench.yml) and Linux arm64 + OpenBLAS (Docker container). The relative ranking holds across all three; on the 2-core x86 Linux runner, the f32 mode ties or beats STUMPY in 12 of 14 configurations. Run the interactive demo yourself to see numbers on your own hardware.

Warm wall-time (median of 3 reps after warmup)

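n      | m   | fastmatrix-profile (default) | + dtype="float32" | STUMPY  | winner
-------|-----|------------------------------|-------------------|---------|-------------------------
1,000  | 50  | 1.4 ms                       | 1.0 ms            | 7.4 ms  | ours, 5–7×
2,000  | 50  | 6.4 ms                       | 4 ms              | 11.1 ms | ours, 1.7–2.5×
5,000  | 100 | 39 ms                        | 19 ms             | 27 ms   | f32 beats STUMPY (1.4×)
10,000 | 100 | 136 ms                       | 72 ms             | 62 ms   | STUMPY (2.2× / 1.15×)
20,000 | 100 | 590 ms                       | 263 ms            | 175 ms  | STUMPY (3.4× / 1.5×)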

dtype="float32" is opt-in: ~2× faster at the cost of ~3e-5 absolute error in P (argmin/argmax positions unchanged in practice — verified on the NYC taxi dataset, both modes flag identical anomalies).
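
The warm wall-time protocol (median of 3 reps after warmup) can be sketched as follows if you want to reproduce the comparison on your own hardware. warm_median_seconds is a hypothetical helper, not part of the package or of bench.yml.

```python
import statistics
import time


def warm_median_seconds(fn, *args, reps=3, warmup=1, **kwargs):
    """Median wall time of fn(*args, **kwargs) over `reps` runs after warmup."""
    for _ in range(warmup):
        fn(*args, **kwargs)                 # untimed: fills caches, triggers any JIT
    samples = []
    for _ in range(reps):
        t0 = time.perf_counter()
        fn(*args, **kwargs)
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)


# Example (assumes the package is installed):
#   from fastmatrix_profile import matrix_profile
#   warm_median_seconds(matrix_profile, T, m=100)
```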

Cold start (fresh Python process, n=2000, m=50)

Library            | Time
-------------------|---------------------------------------
fastmatrix-profile | 0.17 s
STUMPY             | 8.6 s (49× slower, Numba JIT compile)
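
A cold-start figure of this kind has to be taken in a fresh interpreter, since imports and JIT caches persist within a process. One way to measure it, as a sketch: cold_start_seconds is a hypothetical helper, and the snippet string in the comment is only an example of what one might time.

```python
import subprocess
import sys
import time


def cold_start_seconds(snippet):
    """Wall time to run `snippet` in a brand-new Python process."""
    t0 = time.perf_counter()
    subprocess.run([sys.executable, "-c", snippet], check=True)
    return time.perf_counter() - t0


# Example snippet to time (assumes the package is installed):
#   "import numpy as np; from fastmatrix_profile import matrix_profile; "
#   "matrix_profile(np.random.rand(2000), m=50)"
```

The measurement includes interpreter startup and import time, which is the point: it is what a one-shot script or a serverless invocation actually pays.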

When to use this vs STUMPY

Situation                                                              | Use
-----------------------------------------------------------------------|---------------------
Long series (n > ~10k), Numba installs fine, many calls per process    | STUMPY
Lambda / edge / Docker-slim / Pyodide / locked-down corporate Python   | fastmatrix-profile
One-shot calls or interactive notebooks where cold start matters       | fastmatrix-profile
GPU available, very large n                                            | STUMPY-CUDA or SCAMP

This is not a STUMPY replacement at large n. It's the option for when you can't or don't want to install a JIT toolchain.

Algorithm

Uses a row-tiled, BLAS-batched self-join. With L = n − m + 1 windows, form the z-normalised window matrix Wz of shape (L, m), then sweep row blocks of Wz: multiply each block against Wz.T in a single gemm call, mask the trivial-match band, reduce to the per-row minimum, and discard the tile. Peak memory is tile × L floats (tens of MB) rather than the GBs a full (L, L) Gram matrix would need. Cost: O(L² m) flops at near-peak BLAS throughput.
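
The tiling idea can be sketched in a few lines of numpy. This is an illustration of the technique described above, not the package's source: the name tiled_self_join, the default tile size, and the band half-width ceil(m / 4) are assumptions, and constant windows (zero standard deviation) are not handled. Rows of Wz are scaled to unit norm so that each GEMM entry is a Pearson correlation ρ, and the z-normalised distance follows from d² = 2m(1 − ρ).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def tiled_self_join(T, m, tile=256, excl=None):
    """Matrix profile via row blocks of Wz @ Wz.T (one GEMM per block)."""
    T = np.asarray(T, dtype=np.float64)
    W = sliding_window_view(T, m)                       # (L, m) view, no copy
    mu = W.mean(axis=1, keepdims=True)
    sd = W.std(axis=1, keepdims=True)
    Wz = (W - mu) / (sd * np.sqrt(m))                   # unit-norm rows
    L = Wz.shape[0]
    if excl is None:
        excl = int(np.ceil(m / 4))                      # assumed band half-width

    P = np.empty(L)
    I = np.empty(L, dtype=np.int64)
    cols = np.arange(L)
    for start in range(0, L, tile):
        stop = min(start + tile, L)
        corr = Wz[start:stop] @ Wz.T                    # (tile, L) correlations
        rows = np.arange(start, stop)[:, None]
        corr[np.abs(rows - cols) <= excl] = -np.inf     # mask trivial matches
        best = corr.argmax(axis=1)                      # max corr == min distance
        c = corr[np.arange(stop - start), best]
        P[start:stop] = np.sqrt(np.maximum(2.0 * m * (1.0 - c), 0.0))
        I[start:stop] = best                            # tile is dropped here
    return P, I
```

Only one (tile, L) block is alive at a time, so peak memory scales with tile × L rather than L².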

Falls back to the STOMP recurrence (Zhu et al., ICDM 2016) when even a minimum-size tile would exceed max_mem_mb (default 512 MB) — typically n ≳ 1M. You can call matrix_profile(very_long_series, m) without blowing up RAM; it will just be slower past the dense regime.
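
For reference, the STOMP recurrence updates the sliding dot products of row i from those of row i − 1 in O(L) time, so only one row is ever held in memory. The sketch below follows the published recurrence QT[i, j] = QT[i−1, j−1] − T[i−1]·T[j−1] + T[i+m−1]·T[j+m−1]; it is not necessarily how the package's fallback is written, and stomp_sketch and the ceil(m / 4) exclusion zone are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def stomp_sketch(T, m, excl=None):
    """Matrix profile with O(L) working memory via the STOMP recurrence."""
    T = np.asarray(T, dtype=np.float64)
    L = len(T) - m + 1
    if excl is None:
        excl = int(np.ceil(m / 4))                      # assumed exclusion half-width
    W = sliding_window_view(T, m)
    mu = W.mean(axis=1)
    sd = W.std(axis=1)                                  # assumes no constant window

    first = W @ W[0]                                    # dot(window 0, window j)
    QT = first.copy()
    P = np.empty(L)
    I = np.empty(L, dtype=np.int64)
    for i in range(L):
        if i > 0:
            # shift the previous row diagonally and fix up both ends
            QT[1:] = QT[:-1] - T[i - 1] * T[:L - 1] + T[i + m - 1] * T[m:]
            QT[0] = first[i]                            # symmetry: QT[i, 0] == QT[0, i]
        corr = (QT - m * mu[i] * mu) / (m * sd[i] * sd)
        corr[max(0, i - excl):min(L, i + excl + 1)] = -np.inf
        j = int(corr.argmax())
        I[i] = j
        P[i] = np.sqrt(max(2.0 * m * (1.0 - corr[j]), 0.0))
    return P, I
```

The row-by-row Python loop is what makes this slower than the GEMM path: the arithmetic is cheaper per row, but it cannot be batched into large BLAS calls.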

dtype="float32" runs the inner GEMM in single precision: ~2× faster, ~half the peak memory, ~3e-5 absolute error in P (argmin/argmax positions unchanged in practice).
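
To get a feel for the size of the single-precision error, the following self-contained experiment runs the same correlation GEMM in float64 and float32 on random data and compares the resulting profiles. It is a sketch under assumptions (an untiled GEMM, a ceil(m / 4) exclusion band, the hypothetical name profile_with_dtype) and does not call the package, so the numbers it produces are indicative only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def profile_with_dtype(T, m, dtype):
    """Matrix profile where only the GEMM runs in the requested precision."""
    W = sliding_window_view(np.asarray(T, dtype=np.float64), m)
    Wz = (W - W.mean(axis=1, keepdims=True)) / (
        W.std(axis=1, keepdims=True) * np.sqrt(m)
    )
    Wz = Wz.astype(dtype)                               # cast before the GEMM
    corr = Wz @ Wz.T
    idx = np.arange(len(Wz))
    corr[np.abs(idx[:, None] - idx) <= int(np.ceil(m / 4))] = -np.inf
    c = corr.max(axis=1).astype(np.float64)
    return np.sqrt(np.maximum(2.0 * m * (1.0 - c), 0.0))


rng = np.random.default_rng(0)
T = rng.standard_normal(2000)
P64 = profile_with_dtype(T, 50, np.float64)
P32 = profile_with_dtype(T, 50, np.float32)
max_abs_err = float(np.abs(P64 - P32).max())
same_discord = int(P64.argmax()) == int(P32.argmax())
print(f"max |P64 - P32| = {max_abs_err:.2e}, same argmax: {same_discord}")
```

The error grows when a window has a near-exact match, because the square root amplifies small errors in 1 − ρ near zero, so it is worth checking on your own data before relying on float32 for motif distances.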

mass(Q, T) is exposed as a standalone primitive (Mueen's MASS distance profile via FFT).
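
The idea behind MASS is that all sliding dot products between Q and T can be obtained from one FFT-based convolution, after which the z-normalised distances follow from running means and standard deviations. A minimal numpy-only sketch of that idea is below; mass_sketch is a hypothetical name, and the package's own mass may differ in padding, precision, and edge-case handling.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def mass_sketch(Q, T):
    """z-normalised distance from query Q to every length-len(Q) window of T."""
    Q = np.asarray(Q, dtype=np.float64)
    T = np.asarray(T, dtype=np.float64)
    m, n = len(Q), len(T)
    nfft = n + m - 1                                    # full linear convolution

    # Convolving T with reversed Q gives the sliding dot products.
    conv = np.fft.irfft(np.fft.rfft(T, nfft) * np.fft.rfft(Q[::-1], nfft), nfft)
    QT = conv[m - 1:n]                                  # QT[s] = sum_k Q[k] * T[s + k]

    W = sliding_window_view(T, m)
    mu_T = W.mean(axis=1)
    sd_T = W.std(axis=1)                                # assumes no constant window
    corr = (QT - m * Q.mean() * mu_T) / (m * Q.std() * sd_T)
    return np.sqrt(np.maximum(2.0 * m * (1.0 - corr), 0.0))
```

The result has n − m + 1 entries, one per window of T, and costs O(n log n) instead of the O(n m) of a direct loop.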

Limitations

  • 1D series only. No multi-dimensional matrix profile.
  • No streaming / incremental updates.
  • No GPU.
  • NaN/Inf in the input are not handled — clean them first.

These are deliberate v0.1.0 scope decisions. Open an issue if you have a concrete use case.
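
Since non-finite values must be cleaned before calling the library, one simple option is linear interpolation across the gaps. The helper below is a sketch of that option, not something the package provides; fill_non_finite is a hypothetical name, and interpolation is only appropriate when gaps are short relative to m.

```python
import numpy as np


def fill_non_finite(T):
    """Replace NaN/Inf with values linearly interpolated from finite neighbours."""
    T = np.asarray(T, dtype=np.float64).copy()
    bad = ~np.isfinite(T)
    if bad.all():
        raise ValueError("series contains no finite values")
    idx = np.arange(len(T))
    # np.interp holds the edge value for gaps at the start or end
    T[bad] = np.interp(idx[bad], idx[~bad], T[~bad])
    return T
```

Long interpolated stretches form straight lines, whose windows z-normalise to near-identical shapes and can show up as spurious motifs, so it may be better to split the series at large gaps instead.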

Citation

@inproceedings{yeh2016matrix,
  title={Matrix profile I: all pairs similarity joins for time series},
  author={Yeh, Chin-Chia Michael and others},
  booktitle={2016 IEEE 16th International Conference on Data Mining (ICDM)},
  year={2016}
}

License

MIT
