Feature Reduction for Multivariate Time Series Data
MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications
This repository contains code and datasets used in the experiments described in our paper [1].
- [1]: Yuuki Tsubouchi, Hirofumi Tsuruta, "MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications", IEEE Access, Vol. 12, pp. 37398-37417, March 2024.
Introduction
MetricSifter is a feature reduction framework that identifies the anomalous metrics caused by a fault, so that fault localization runs on a smaller and more relevant set of metrics. The key insight is that, for failure-related metrics, the change points that occur during the failure are close to each other in time. MetricSifter detects change points per metric, localizes the time frame with the highest change point density, and excludes metrics with no change points in that time frame. Offline change point detection is implemented with ruptures, and the segmentation of the detected change points is based on kernel density estimation (KDE).
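The three steps described above can be sketched in plain Python. This is a simplified illustration, not the package's implementation: change points are passed in rather than detected with ruptures, the KDE is a hand-written Gaussian kernel, and the time frame is approximated by a fixed-width window around the density peak. The function names and the `bandwidth` and `half_width` parameters are hypothetical.

```python
import math


def kde_density(points, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `points` at each grid value."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - p) / bandwidth) ** 2) for p in points)
        for g in grid
    ]


def densest_window(all_points, n_samples, bandwidth, half_width):
    """Return (start, end) of a fixed-width window centered on the density peak."""
    density = kde_density(all_points, range(n_samples), bandwidth)
    peak = max(range(n_samples), key=lambda t: density[t])
    return max(0, peak - half_width), min(n_samples - 1, peak + half_width)


def sift(change_points, n_samples, bandwidth=2.0, half_width=3):
    """Keep only the metrics that have a change point inside the densest window.

    `change_points` maps a metric name to the time indices of its change points.
    """
    all_points = [cp for cps in change_points.values() for cp in cps]
    if not all_points:
        return []  # no change points at all, so nothing looks fault-related
    start, end = densest_window(all_points, n_samples, bandwidth, half_width)
    return [
        name
        for name, cps in change_points.items()
        if any(start <= cp <= end for cp in cps)
    ]


# Hypothetical change points: three metrics shift around t=55..58,
# one shifts much earlier, and one never changes.
change_points = {
    "cpu": [56],
    "latency": [55, 57],
    "errors": [58],
    "disk": [12],
    "gc": [],
}
kept = sift(change_points, n_samples=70)
print(kept)
```

In this toy input, the metrics whose change points cluster together are kept, while the metric with an isolated early change point and the metric with no change points are dropped.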
Installation
Prerequisites
If you want to use uv (recommended for faster installation), install it first:
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or using pip
pip install uv
From PyPI
You can install the metricsifter package from PyPI:
# Using pip
pip install metricsifter
# Using uv (recommended for faster installation)
uv pip install metricsifter
For Development
Note: The core package supports Python 3.10-3.14.
# Clone the repository
git clone https://github.com/ai4sre/metricsifter.git
cd metricsifter
# Using uv (recommended)
uv sync --all-extras
# Or using pip
pip install -e ".[dev]"
For running experiments (requires Python 3.10 or 3.11):
The experiments require sfr-pyrca, which must be installed separately as it's not available on PyPI:
# Install sfr-pyrca from GitHub (Python 3.10 or 3.11 only)
pip install git+https://github.com/salesforce/PyRCA@d85512b
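Because the core package and the experiments accept different Python versions, it can help to check the interpreter before installing. The following is a minimal sketch based only on the version ranges stated above; the constant and function names are hypothetical and are not part of metricsifter.

```python
import sys

# Version ranges taken from the notes above (inclusive, major.minor only).
CORE_RANGE = ((3, 10), (3, 14))        # core package: Python 3.10-3.14
EXPERIMENT_RANGE = ((3, 10), (3, 11))  # experiments: Python 3.10 or 3.11


def in_range(version, bounds):
    """Return True if the (major, minor) part of `version` lies within `bounds`."""
    low, high = bounds
    return low <= tuple(version[:2]) <= high


def describe(version=None):
    """Report which installation targets the given interpreter version supports."""
    version = sys.version_info if version is None else version
    return {
        "core": in_range(version, CORE_RANGE),
        "experiments": in_range(version, EXPERIMENT_RANGE),
    }


print(describe())  # result for the running interpreter
```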
Getting Started
import pandas as pd

from metricsifter.sifter import Sifter
from tests.sample_gen.generator import generate_synthetic_data

# Create time series data
normal_data, abnormal_data, _, _, anomalous_nodes = generate_synthetic_data(num_node=20, num_edge=20, num_normal_samples=55, num_abnormal_samples=15, anomaly_type=0)
data = pd.concat([normal_data, abnormal_data], axis=0, ignore_index=True)

# Remove the metrics that are unrelated to the fault
sifter = Sifter(penalty_adjust=2.0, n_jobs=1)
sifted_data = sifter.run(data=data)

# Compare the remaining metrics with the ground truth
print("(#removed metrics) / (#total metrics):", len(set(data.columns) - set(sifted_data.columns)), "/", len(data.columns))
print("difference between prediction and ground truth:", set(sifted_data.columns) - anomalous_nodes)
assert set(sifted_data.columns) - anomalous_nodes == set()
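The comparison against the ground truth can be wrapped in a small helper. This is a sketch using only set operations; `evaluate_sifting` and the metric names in the toy input are hypothetical and not part of the metricsifter API. Unlike the assertion in the example, it also reports anomalous metrics that were removed by mistake.

```python
def evaluate_sifting(all_metrics, kept_metrics, anomalous_metrics):
    """Summarize a sifting result against the ground-truth anomalous metrics."""
    all_set = set(all_metrics)
    kept = set(kept_metrics)
    truth = set(anomalous_metrics)
    return {
        "removed": len(all_set - kept),    # metrics dropped by sifting
        "total": len(all_set),             # metrics before sifting
        "false_positives": kept - truth,   # kept but not actually anomalous
        "false_negatives": truth - kept,   # anomalous but removed
    }


# Toy input with hypothetical metric names.
report = evaluate_sifting(
    all_metrics=["m0", "m1", "m2", "m3", "m4", "m5"],
    kept_metrics=["m1", "m2", "m4"],
    anomalous_metrics=["m1", "m2", "m3"],
)
print(report)
```

With the objects from the example, the call would be `evaluate_sifting(data.columns, sifted_data.columns, anomalous_nodes)`.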
The following figure shows an example of the original synthetic data (before) and its sifted data (after).
[Figure: synthetic time series before and after sifting]
For Developers
Setup Development Environment
# Using uv (recommended)
uv sync --all-extras
# Or using pip
pip install -e ".[dev]"
# For experiments (Python 3.10 or 3.11 only)
pip install git+https://github.com/salesforce/PyRCA@d85512b
Run Tests
pytest -s -v tests
Code Quality
# Format code
black .
# Lint code
ruff check .
Publishing to PyPI
This package uses GitHub Actions to automatically publish to PyPI when a new tag is pushed.
Publishing Process
- Update version in pyproject.toml
  # Edit the version field
  version = "0.0.2"  # Increment as needed
- Commit and tag the release
  git add pyproject.toml
  git commit -m "Bump version to 0.0.2"
  git tag v0.0.2
  git push origin main
  git push origin v0.0.2
- Automatic Publication
  - The GitHub Actions workflow will automatically:
    - Build the package using uv build
    - Publish to TestPyPI (for testing)
    - Publish to PyPI (production)
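Since the release is driven by the pushed tag, a quick check that the tag matches the version in pyproject.toml can catch mismatches before pushing. The sketch below assumes the `v<version>` tag convention used in the commands above; the helper names are hypothetical, and the regular expression is a deliberately simple stand-in for a real TOML parser.

```python
import re


def read_version(pyproject_text):
    """Return the value of the first line of the form `version = "..."`, or None."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def tag_matches_version(tag, pyproject_text):
    """Return True if `tag` equals "v" followed by the version in pyproject.toml."""
    version = read_version(pyproject_text)
    return version is not None and tag == f"v{version}"


# Hypothetical pyproject.toml content for illustration.
sample = '[project]\nname = "metricsifter"\nversion = "0.0.2"\n'
print(tag_matches_version("v0.0.2", sample))
```

In practice, `pyproject_text` would come from reading the repository's pyproject.toml file.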
Setup Requirements
For the workflow to work, you need to configure Trusted Publishing in PyPI:
- Go to PyPI and TestPyPI
- Create/login to your account
- Go to your account settings → Publishing
- Add a new Trusted Publisher with:
  - PyPI project name: metricsifter
  - Owner: ai4sre
  - Repository name: metricsifter
  - Workflow name: publish.yaml
  - Environment name: pypi (for PyPI) or testpypi (for TestPyPI)
Note: Trusted Publishing uses OpenID Connect (OIDC) and doesn't require manual API tokens.
Local Build Testing
To test the build locally before publishing:
# Build the package
uv build
# The built files will be in the dist/ directory:
# - metricsifter-X.Y.Z.tar.gz (source distribution)
# - metricsifter-X.Y.Z-py3-none-any.whl (wheel)
Manual Publishing (Alternative)
If you prefer to publish manually:
# Build the package
uv build
# Publish to TestPyPI (for testing)
uv publish --publish-url https://test.pypi.org/legacy/
# Publish to PyPI (production)
uv publish
License