Prometheus time series anomaly detection with LSTM Autoencoder

These details have not been verified by PyPI

Project links

Homepage

Project description

Prometheus Time Series Anomaly Detection with LSTM Autoencoder

This project implements a system for detecting anomalies in time series data collected from Prometheus. It uses an LSTM (Long Short-Term Memory) autoencoder model built with TensorFlow/Keras to learn normal patterns from your metrics and identify deviations. The system includes scripts for data collection, preprocessing, model training, data filtering, and real-time anomaly detection, exposing results via a Prometheus exporter.

Features

Data Collection: Fetches time series data from a Prometheus instance for specified PromQL queries.
Preprocessing: Handles missing values and normalizes/scales data for optimal model training.
LSTM Autoencoder Training: Trains an LSTM autoencoder on the full preprocessed dataset.
Data Filtering: A script to apply the trained Model A to filter out anomalous sequences from a dataset.
Real-time Anomaly Detection: Continuously monitors new data and processes it with the trained model to detect anomalies.
Prometheus Exporter Integration: Exposes key anomaly detection metrics (e.g., reconstruction error, anomaly flag, per-feature errors) that can be scraped by Prometheus and monitored with tools like Grafana.
Configurable: All stages are highly configurable via a central config.yaml file.

example graphics

Project Structure

.
├── config.yaml                 # Central configuration file for all scripts
├── data_collector.py           # Script to collect historical data from Prometheus
├── preprocess_data.py          # Script to preprocess the collected data
├── train_autoencoder.py        # Script to train the LSTM autoencoder
├── filter_anomalous_data.py    # Script to filter data using the trained model to separate normal/anomalous sequences
├── realtime_detector.py        # Script for real-time anomaly detection and Prometheus exporter
├── Pipfile                     # Dependency declarations
├── Pipfile.lock                # Locked versions of dependencies
└── README.md                   # This file

Prerequisites

Python 3.12
Pipenv for managing dependencies
A running Prometheus instance (v2.x or later) that is scraping the metrics you want to analyze.
(Optional) Exporters configured for your Prometheus to collect the desired metrics (e.g., node_exporter, windows_exporter).

Setup & Installation

Clone the Repository:

git clone <your-repository-url>
cd <repository-name>

Install Dependencies with Pipenv:
```
pipenv install --dev
```
After installation you can enter the environment using pipenv shell or run scripts with pipenv run.
Prometheus Setup: Ensure your Prometheus server is running and accessible. The scripts will query this server based on the URL and PromQL queries defined in config.yaml. The example queries in config.yaml might use metrics from windows_exporter; adapt these to your own available metrics.

Configuration (`config.yaml`)

The config.yaml file is central to running this project. Key sections include:

prometheus_url: URL of your Prometheus server.
queries: Dictionary of PromQL queries with friendly aliases.
data_settings: Parameters for data_collector.py (e.g., collection_period_hours, step, output_filename).
preprocessing_settings: Parameters for preprocess_data.py (e.g., nan_fill_strategy, scaler_type, processed_output_filename, scaler_output_filename).
training_settings: Parameters for train_autoencoder.py.
- model_output_filename: Filename for Model A (trained on all data).
- sequence_length, train_split_ratio, epochs, batch_size, learning_rate, early_stopping_patience: Standard training hyperparameters.
- lstm_units_encoder1, etc.: LSTM autoencoder architecture definition.
data_filtering_settings: Parameters for filter_anomalous_data.py.
- normal_sequences_output_filename: Output file for sequences classified as normal by Model A.
- anomalous_sequences_output_filename: Output file for sequences classified as anomalous by Model A.
real_time_anomaly_detection: Parameters for realtime_detector.py.
- query_interval_seconds: How often to fetch new data.
- anomaly_threshold_mse: Crucial! MSE threshold for declaring an anomaly. Tune this based on validation error histograms.
- exporter_port: Port for the Prometheus exporter.
- metrics_prefix: Prefix for exposed Prometheus metrics.

Before running any script, review and customize config.yaml thoroughly.

Usage / Workflow

The project follows a sequential workflow. Each stage can also be launched via the cli.py utility:

python cli.py collect       # сбор данных
python cli.py preprocess    # предобработка
python cli.py train         # обучение модели
python cli.py detect        # запуск realtime детектора

The sequential workflow remains as follows:

Step 1: Data Collection (data_collector.py) Collect historical data from your Prometheus instance.

python data_collector.py

Output: Raw data Parquet file (e.g., prometheus_metrics_data.parquet).

Step 2: Data Preprocessing (preprocess_data.py) Preprocess the collected data (handles NaNs, scales features).

python preprocess_data.py

Outputs: Processed data Parquet file (e.g., processed_metrics_data.parquet) and a saved scaler (e.g., fitted_scaler.joblib).

Step 3: Train Initial Model - Model A (train_autoencoder.py) Train the first LSTM autoencoder (Model A) on the full preprocessed dataset.

In config.yaml (training_settings):
- Set train_on_filtered_sequences: false.
- Configure model_output_filename (e.g., lstm_autoencoder_model_A.keras).

python train_autoencoder.py

Outputs: Trained Model A (e.g., lstm_autoencoder_model_A.keras), training history plots. Use the reconstruction_error_histogram_...A.png to help determine anomaly_threshold_mse in config.yaml.

Step 4: Filter Data (Optional) (filter_anomalous_data.py) Use the trained Model A to classify sequences in your preprocessed dataset as "normal" or "anomalous".

Ensure anomaly_threshold_mse (from real_time_anomaly_detection section, used by this script as the threshold for Model A) is appropriately set in config.yaml.
Configure output filenames in data_filtering_settings.

python filter_anomalous_data.py

Outputs: .npy files for normal sequences (e.g., normal_sequences.npy) and anomalous sequences.

Step 5: Real-time Anomaly Detection (realtime_detector.py) Run the real-time detector using the trained model.

Ensure model_output_filename in training_settings points to your trained model.
Ensure anomaly_threshold_mse in real_time_anomaly_detection is correctly set.

python realtime_detector.py

The detector starts a Prometheus exporter (e.g., on http://localhost:8001/metrics).

Step 6: Monitoring (Prometheus & Grafana) Configure Prometheus to scrape the metrics endpoint from realtime_detector.py. Visualize metrics like:

anomaly_detector_latest_reconstruction_error_mse
anomaly_detector_is_anomaly_detected
anomaly_detector_total_anomalies_count_total
anomaly_detector_feature_reconstruction_error_mse{feature_name="your_alias"}

Interpreting Results

Monitoring Metrics: Observe the is_anomaly_detected and latest_reconstruction_error_mse metrics in real time to evaluate detection behavior.
Per-Feature Errors: When an anomaly is flagged by either model, check the corresponding feature_reconstruction_error_mse metrics (and logs of realtime_detector.py) to see which specific time series (features) are contributing most to the anomaly.

Customization & Extending

Monitoring New Metrics: Add PromQL queries to config.yaml. Retrain models (all relevant steps) to include these.
Tuning Anomaly Thresholds: The anomaly_threshold_mse value is critical. Adjust it based on model performance and desired sensitivity.
Model Architecture: Modify LSTM parameters in training_settings of config.yaml.
Experimentation: Use the filter_anomalous_data.py script with different thresholds to generate various "cleaned" datasets if needed.

Troubleshooting

Python Dependencies: Ensure Pipfile/Pipfile.lock are in sync and run pipenv install if packages change.
Prometheus Connection: Verify prometheus_url and query validity.
Data Issues: Check for "No data found" errors; inspect PromQL queries and Prometheus scrape targets. Review nan_fill_strategy if NaNs persist.
Model Training: If loss doesn't decrease, adjust learning rate, batch size, or architecture. For overfitting, utilize EarlyStopping or consider more data/regularization.
File Not Found: Double-check filenames in config.yaml against actual generated files (models, scalers, datasets).
Port in Use: If realtime_detector.py fails, the exporter_port might be occupied.

Contributing

Contributions are welcome! Please feel free to open an issue or submit a pull request.

License

This project is licensed under the MIT License.

Publishing to PyPI

This project includes a GitHub Actions workflow that builds and uploads the package to PyPI when a tag starting with v is pushed. Set the PYPI_API_TOKEN secret in your repository settings.

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

0.2.6

Jun 7, 2025

0.2.5

Jun 7, 2025

0.2.4

Jun 7, 2025

0.2.3

Jun 7, 2025

0.2.1

Jun 7, 2025

0.2.0

Jun 7, 2025

0.1.9

Jun 7, 2025

0.1.8

Jun 6, 2025

0.1.7

Jun 6, 2025

0.1.6

Jun 6, 2025

This version

0.1.4

Jun 6, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

prometheus_anomaly_detection_lstm-0.1.4.tar.gz (25.7 kB view details)

Uploaded Jun 6, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

prometheus_anomaly_detection_lstm-0.1.4-py3-none-any.whl (26.2 kB view details)

Uploaded Jun 6, 2025 Python 3

File details

Details for the file prometheus_anomaly_detection_lstm-0.1.4.tar.gz.

File metadata

Download URL: prometheus_anomaly_detection_lstm-0.1.4.tar.gz
Upload date: Jun 6, 2025
Size: 25.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for prometheus_anomaly_detection_lstm-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`d1dbbd131fc5978fc17031358296241dba538f60e942dea352e6c7d696105ba4`
MD5	`06d990b31d383fb14dc93e6673f07a98`
BLAKE2b-256	`db3ced242337efaefee01c41691b10801a5bbe1ab91b9e39281eb692bc7be20f`

See more details on using hashes here.

File details

Details for the file prometheus_anomaly_detection_lstm-0.1.4-py3-none-any.whl.

File metadata

Download URL: prometheus_anomaly_detection_lstm-0.1.4-py3-none-any.whl
Upload date: Jun 6, 2025
Size: 26.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for prometheus_anomaly_detection_lstm-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0519b80bafcea79e8ccf8b49762c02558f25a4968e93bcc2a8cf336cfe24f647`
MD5	`d7c6bf061b7493ca82c3c3e26c17ccea`
BLAKE2b-256	`516c2113bc46093d2e49f6edd81c66213f9d851824d56bfbfb2cdeaeb28432e0`

See more details on using hashes here.

prometheus-anomaly-detection-lstm 0.1.4

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

Prometheus Time Series Anomaly Detection with LSTM Autoencoder

Features

Project Structure

Prerequisites

Setup & Installation

Configuration (`config.yaml`)

Usage / Workflow

Interpreting Results

Customization & Extending

Troubleshooting

Contributing

License

Publishing to PyPI

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

prometheus-anomaly-detection-lstm 0.1.4

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

Prometheus Time Series Anomaly Detection with LSTM Autoencoder

Features

Project Structure

Prerequisites

Setup & Installation

Configuration (config.yaml)

Usage / Workflow

Interpreting Results

Customization & Extending

Troubleshooting

Contributing

License

Publishing to PyPI

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

Configuration (`config.yaml`)