
Evaluation measures for time series segmentation


tsseg-eval - Time Series Segmentation Evaluation

Toward Interpretable Evaluation Measures for Time Series Segmentation, NeurIPS'25


Time series segmentation is a fundamental task in analyzing temporal data across various domains, from human activity recognition to energy monitoring. While numerous state-of-the-art methods have been developed to tackle this problem, the evaluation of their performance remains critically limited. Existing measures predominantly focus on change point accuracy or rely on point-based metrics such as Adjusted Rand Index (ARI), which fail to capture the quality of the detected segments, ignore the nature of errors, and offer limited interpretability. In this paper, we address these shortcomings by introducing two novel evaluation measures: WARI (Weighted Adjusted Rand Index), a temporal extension of ARI that accounts for the position of segmentation errors, and SMS (State Matching Score), a fine-grained metric that identifies and scores four distinct and fundamental types of segmentation errors while allowing error-specific weighting. We empirically validate WARI and SMS on synthetic and real-world benchmarks, showing that they not only provide a more accurate assessment of segmentation quality but also uncover insights, such as error provenance and type, that are inaccessible with traditional measures.
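To make the limitation of point-based measures concrete, here is a minimal sketch of the standard Adjusted Rand Index in pure Python. It does not use tsseg-eval, the function name `adjusted_rand_index` is our own, and the toy label sequences are invented for illustration. Because ARI depends only on the contingency table of the two labelings, a boundary detected two steps late and two mislabeled points in the middle of a segment receive exactly the same score.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard ARI computed from the contingency table of two labelings."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings have a single state
    return (sum_cells - expected) / (max_index - expected)


# Toy data (illustrative only): two states of 10 time steps each.
truth = [0] * 10 + [1] * 10
late_boundary = [0] * 12 + [1] * 8                      # boundary found 2 steps late
mid_segment = [0] * 10 + [1] * 3 + [0] * 2 + [1] * 5    # 2 errors inside a segment

ari_boundary = adjusted_rand_index(truth, late_boundary)
ari_middle = adjusted_rand_index(truth, mid_segment)
print(ari_boundary == ari_middle)  # prints True
```

The two predictions differ in where the errors occur, which is the kind of positional information the abstract says WARI is designed to account for.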

References

If you use SMS or WARI in your project or research, please cite the following paper:

"Toward Interpretable Evaluation Measures for Time Series Segmentation"
Félix Chavelli, Paul Boniol, Michaël Thomazo
The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS'25)

@inproceedings{
chavelli2025toward,
title={Toward Interpretable Evaluation Measures for Time Series Segmentation},
author={F{\'e}lix Chavelli and Paul Boniol and Micha{\"e}l Thomazo},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://openreview.net/forum?id=Gz6dujD5j0}
}

How to use tsseg-eval

Option 1: Use the metrics (PyPI package)

To use the evaluation measures (SMS, WARI, etc.) in your own project, you can install the (beta) lightweight package via pip:

$ pip install tsseg-eval
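To confirm that the installation succeeded, one option is to query the installed distribution with the standard library. This is a small sketch and the helper name `installed_version` is our own; it only assumes that the distribution is registered under the name used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


# Prints the version string if tsseg-eval is installed, otherwise None.
print(installed_version("tsseg-eval"))
```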

Option 2: Reproduce the paper results

To reproduce the experiments and results presented in the paper, you need to clone this repository and install the full environment.

We recommend a virtual environment with Python>3.9. The commands below clone the repository and create a conda environment with the required dependencies from environment.yml.

git clone https://github.com/fchavelli/tsseg-eval.git
cd tsseg-eval
conda env create -f environment.yml
conda activate tsseg-eval

Usage

import time

import numpy as np
import pandas as pd
from claspy.segmentation import BinaryClaSPSegmentation
from tsseg_eval import f1, covering, nmi, ari, wari, sms

def run_clasp(time_series):
    """Segment a time series with ClaSP; return (change_points, elapsed_seconds)."""
    start_time = time.time()
    clasp = BinaryClaSPSegmentation()
    clasp.fit_predict(time_series)
    change_points = clasp.change_points.tolist()
    elapsed_time = time.time() - start_time
    return change_points, elapsed_time

# `data` (the time series) and `groundtruth` (its reference annotation)
# are assumed to be loaded beforehand.

# Run a segmentation method (ClaSP here)
prediction, elapsed_time = run_clasp(data)


# For Change Point Detection
# Compute F-score
f1_score = f1(groundtruth, prediction)

# Compute Covering
cov_score = covering(groundtruth, prediction)

# For State Detection
# Compute Normalized Mutual Information
nmi_score = nmi(groundtruth, prediction)

# Compute Adjusted Rand Index
ari_score = ari(groundtruth, prediction)

# Compute Weighted Adjusted Rand Index
wari_score = wari(groundtruth, prediction)

# Compute State Matching Score
sms_score = sms(groundtruth, prediction)
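The usage above mixes two views of a segmentation: a list of change points and a per-time-step state sequence. The following sketch converts between the two in pure Python. The helper names `change_points_to_labels` and `labels_to_change_points` are hypothetical and not part of tsseg-eval. It assumes that a change point is the index at which a new segment starts, and it numbers segments consecutively, so recurring states are not recognised. The input format each tsseg-eval function expects is not stated here, so check the package before relying on this conversion.

```python
def change_points_to_labels(change_points, length):
    """Expand change-point indices into one segment id per time step.

    Assumes each change point is the index at which a new segment starts.
    Change points outside the open interval (0, length) are ignored.
    """
    boundaries = sorted({cp for cp in change_points if 0 < cp < length})
    labels = []
    start = 0
    for segment_id, end in enumerate(boundaries + [length]):
        labels.extend([segment_id] * (end - start))
        start = end
    return labels


def labels_to_change_points(labels):
    """Return the indices where the label differs from the previous time step."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


labels = change_points_to_labels([3, 7], 10)
print(labels)                           # prints [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
print(labels_to_change_points(labels))  # prints [3, 7]
```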

Reproduce the Paper

Dataset Preparation

You can download the datasets used in the paper from the following links:

Dataset     Type         Download Link
MoCap       Real-world   download
ActRecTut   Real-world   download
PAMAP2      Real-world   download
UscHad      Real-world   download
UcrSeg      Real-world   download

After downloading the datasets, move them to the data/ directory so that the layout matches the following structure:

.
├── data
│   ├── ActRecTut
│   │   ├── subject1_walk
│   │   │   ├── S111.dat
│   │   │   ├── ...
│   │   ├── subject2_walk
│   │   │   ├── S111.dat
│   │   │   ├── ...
│   ├── MoCap
│   │   ├── 4d
│   │   │   ├── amc_86_01.4d
│   │   │   ├── ...
│   │   ├── raw
│   │   │   ├── amc_86_01.txt
│   │   │   ├── ...
│   ├── PAMAP2
│   │   ├── Protocol
│   │   │   ├── subject101.dat
│   │   │   ├── ...
│   ├── USC-HAD
│   │   ├── Subject1
│   │   ├── Subject2
│   │   ├── ...
│   ├── UCRSEG
│   │   ├── Cane_100_2345.txt
│   │   ├── DutchFactory_24_2184.txt
│   │   ├── ...
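Before launching the experiments, it can help to check that the layout above is in place. The sketch below is a convenience script of our own, not part of the repository. The constant `EXPECTED_DIRS` and the function `missing_dataset_dirs` are hypothetical names, and only the directories that appear in the tree above are checked, not individual files.

```python
from pathlib import Path

# Sub-directories taken from the directory tree above, relative to data/.
EXPECTED_DIRS = [
    "ActRecTut",
    "MoCap/4d",
    "MoCap/raw",
    "PAMAP2/Protocol",
    "USC-HAD",
    "UCRSEG",
]


def missing_dataset_dirs(data_root="data"):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("All expected dataset directories are present.")
```

Run it from the repository root so that the relative path data/ resolves correctly.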

Reproduce the experimental results

Run the main experiment (the multivariate flag also includes the univariate dataset):

python src/experiments.py --t multivariate

Evaluate the algorithms

python src/evaluation.py multivariate

Code for some additional experiments is available in the src folder.

Results are saved in a results/ directory.

Acknowledgements

This work leverages the E2USd implementation as its foundation.

Our gratitude extends to the authors of the studies behind the datasets listed above (MoCap, ActRecTut, PAMAP2, UscHad, and UcrSeg) for making them publicly available.
