Tool for connecting Prometheus to Zabbix monitoring

Promabbix

Connecting Prometheus to Zabbix.


TL;DR

This tooling allows you to keep your favourite monitoring tool, Zabbix, while having a production-ready integration with Prometheus in a way similar to AlertManager. It consists of two parts:

  • A script to generate a Zabbix Template from AlertManager-like definitions of Prometheus metrics
  • GitOps tooling allowing you to manage the vast majority of your Zabbix installation in an Infrastructure-as-Code (IaC) pattern (to be done)

Usage

CLI Commands

Promabbix provides a command-line interface with multiple subcommands:

# Show available commands
promabbix --help

# Generate Zabbix template from alert configuration
promabbix generateTemplate <config-file> [OPTIONS]

# Available options for generateTemplate:
#   -o, --output PATH          Output file path (use "-" for STDOUT, default: -)
#   -t, --templates PATH       Custom template directory path
#   -tn, --template-name NAME  Template file name (default: prometheus_alert_rules_to_zbx_template.j2)
#   --validate-only           Only validate configuration without generating template

Local Development Usage

# Install dependencies
pip install -r requirements.txt -r requirements-dev.txt

# Validate configuration only
python -m promabbix.promabbix generateTemplate examples/minikube-alert-config.yaml --validate-only

# Generate template to file
python -m promabbix.promabbix generateTemplate examples/minikube-alert-config.yaml -o output.json

# Generate template to STDOUT
python -m promabbix.promabbix generateTemplate examples/minikube-alert-config.yaml

# Read from STDIN and output to STDOUT
cat examples/minikube-alert-config.yaml | python -m promabbix.promabbix generateTemplate -

Docker Usage

# Build the image
docker buildx build -t promabbix:local .
# or use image from DockerHub:
# docker pull promabbix/promabbix

# Get help
docker run promabbix:local --help
docker run promabbix:local generateTemplate --help

# Read data from STDIN and output to STDOUT
cat examples/minikube-alert-config.yaml | docker run -i promabbix:local generateTemplate - -o -

# Run with mounted directory and retrieve result file
docker run --mount type=bind,src=$(pwd)/examples/,dst=/mnt promabbix:local generateTemplate /mnt/minikube-alert-config.yaml -o /mnt/output.json

# Validate configuration only
docker run --mount type=bind,src=$(pwd)/examples/,dst=/mnt promabbix:local generateTemplate /mnt/minikube-alert-config.yaml --validate-only

Prometheus (VictoriaMetrics) to Zabbix Integration (a simplified version of the docs to convey the gist)

Our monitoring stack combines Prometheus metrics with Zabbix in the following way:

  • zabbix-agents run on VMs and provide OS-level monitoring (mostly)
  • Prometheus scrapes Kubernetes clusters and other dynamic environments
    • (The end result is that all metrics are sent to a single large VictoriaMetrics instance)
  • we want all alerting and evaluation to be done in a single point that is Zabbix
  • alerts are pushed to Slack and PagerDuty

Given that these technologies have incompatible data models, we've come up with the following process for combining them:

  • You define PromQL queries for your alerts (could be mostly taken from alertmanager templates without change)
    • The main difference is that we just take the alert calculation and leave out the threshold comparison - this way, we have a history for the value in Zabbix (and not just when the alert is triggered)
  • Query results are filtered with LLD to create a (static number of) pseudo-hosts
    • Typically we use one pseudo-host per environment group (e.g. all Prod envs, all QA envs)
  • Each pseudo-host will have items and triggers generated based on the definition
    • So that each time-series resulting from the query can be tracked separately
  • Ansible-based tooling transfers the YAML definitions to a live Zabbix instance (so you can track all the definitions in Git following IaC principles)
    • Zabbix Templates are used behind the scenes, so the definitions can be maintained for a long time and kept in sync
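The discovery step can be sketched in Python: each time-series returned by a PromQL query becomes one LLD row whose labels are exposed as {#LABEL} macros, so Zabbix creates one Item per label set. This is an illustrative sketch only; the function name and macro casing are assumptions, not Promabbix's actual code.

```python
def series_to_lld_rows(payload: dict) -> dict:
    """Turn a Prometheus instant-query response into Zabbix LLD rows.

    Each time-series becomes one discovery row; its labels are exposed
    as {#LABEL} macros, so Zabbix can create one Item per label set.
    """
    rows = []
    for series in payload["data"]["result"]:
        rows.append({"{#%s}" % k.upper(): v
                     for k, v in series["metric"].items()})
    return {"data": rows}

# A response with two clusters yields two rows, i.e. two Items per pseudo-host:
payload = {"data": {"result": [
    {"metric": {"cluster": "prod-us"}, "value": [0, "0"]},
    {"metric": {"cluster": "prod-eu"}, "value": [0, "1"]},
]}}
print(series_to_lld_rows(payload))
# -> {'data': [{'{#CLUSTER}': 'prod-us'}, {'{#CLUSTER}': 'prod-eu'}]}
```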

Configuration Formats

Promabbix supports two configuration formats:

Unified Configuration Format (Recommended)

The modern approach uses a single YAML file containing all configuration sections:

# Single unified configuration file (e.g., service-alerts.yaml)
groups:
  - name: recording_rules
    rules:
      - record: elasticsearch_cluster_health_red
        expr: sum(elasticsearch_cluster_health_status{color="red"}) by (cluster)
  - name: alerting_rules  
    rules:
      - alert: elasticsearch_cluster_health_red_simple
        expr: elasticsearch_cluster_health_red > 0
        annotations:
          description: "Elasticsearch cluster {{$labels.cluster}} is in RED state"
        labels:
          __zbx_priority: "AVERAGE"

prometheus:
  api:
    url: "http://victoria-metrics:8428/api/v1/query"
  labels_to_zabbix_macros:
    - pattern: '\{\{(?:\s*)\$value(?:\s*)\}\}'
      value: "{ITEM.VALUE1}"
    - pattern: '\{\{(?:\s*)\$labels\.(?P<label>[a-zA-Z0-9\_\-]*)(?:\s*)\}\}'
      value: "{#\\g<label>}"

zabbix:
  template: service_elasticsearch_cluster
  name: "Template Module Prometheus Elasticsearch Cluster"
  hosts:
    - host_name: elasticsearch-cluster-prod
      visible_name: "Service Elasticsearch Cluster PROD"
      host_groups: ["Prometheus pseudo hosts", "Production hosts"]
      link_templates: ["templ_module_promt_service_elasticsearch_cluster"]
      macros:
        - macro: "{$ES.CLUSTER.LLD.MATCHES}"
          value: "^prod-(us|eu)$"
  macros:
    - macro: "{$ES.THRESHOLD}"
      value: "1"
      description: "Elasticsearch cluster health threshold"

# Optional: Documentation for alerts
wiki:
  knowledgebase:
    alerts:
      alertings:
        "elasticsearch_cluster_health_red_simple":
          title: "Elasticsearch cluster in RED state" 
          content: "See runbook for troubleshooting steps"
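The labels_to_zabbix_macros patterns above rewrite Prometheus template variables into Zabbix macro syntax. A minimal Python sketch of that substitution, assuming plain re.sub semantics (not necessarily Promabbix's actual implementation):

```python
import re

# The two patterns from the config above, as (regex, replacement) pairs.
PATTERNS = [
    (r'\{\{(?:\s*)\$value(?:\s*)\}\}', r'{ITEM.VALUE1}'),
    (r'\{\{(?:\s*)\$labels\.(?P<label>[a-zA-Z0-9\_\-]*)(?:\s*)\}\}',
     r'{#\g<label>}'),
]

def to_zabbix_macros(text: str) -> str:
    """Rewrite {{$value}} / {{$labels.x}} into Zabbix macro syntax."""
    for pattern, replacement in PATTERNS:
        text = re.sub(pattern, replacement, text)
    return text

print(to_zabbix_macros("Elasticsearch cluster {{$labels.cluster}} is in RED state"))
# -> Elasticsearch cluster {#cluster} is in RED state
```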

Legacy Multi-File Format (Backward Compatible)

The legacy approach splits configuration across multiple files:

Configuration definitions

Typically each directory is related to a single service and contains 3 types of files.

  • MYSERVICE_alerts.yaml - contains two parts:
    • recording_rules - these are sent to the Prometheus/VictoriaMetrics backend for evaluation
    • alerting_rules - how the values from the previous point are evaluated inside Zabbix
groups:
  - name: recording_rules
    rules:
      - record: elasticsearch_cluster_health_red
        # this expression is evaluated inside Prometheus/VictoriaMetrics
        expr: sum(elasticsearch_cluster_health_status{color="red"}) by (cluster)
  - name: alerting_rules
    rules:
      - alert: elasticsearch_cluster_health_red_simple
        # this expression is a trigger rule inside of Zabbix
        expr: elasticsearch_cluster_health_red > 0

      # multiple alerts can be used for the same metric
      - alert: elasticsearch_cluster_health_red_advanced
        # Zabbix macros can be also used
        expr: elasticsearch_cluster_health_red > {$ES.THRESHOLD}
        annotations:
          description: At least one member of Elasticsearch {{$labels.cluster}} is reporting RED state.
          summary: Elasticsearch {{$labels.cluster}} (at least one node) is reporting RED state.
        # Zabbix evaluation period
        for: 5m
        labels: # each alert can override general attributes
          __zbx_priority: "AVERAGE"
          grafana_dashboard: "p-abs/elasticsearch-clusters"
  • zabbix_vars.yaml - the LLD variables and pseudo-host definitions; these are generated in Zabbix as Hosts
zabbix:
  template: service_elasticsearch_cluster
  name: "Template Module Prometheus service Elasticsearch Cluster"
  lld_filters:
    filter:
      conditions:
        - formulaid: "A"
          macro: "{#CLUSTER}" # the Elasticsearch cluster name
          value: "{$ES.CLUSTER.LLD.MATCHES}"
      evaltype: "AND"
  macros:
    - macro: "{$ES.CLUSTER.LLD.MATCHES}"
      description: "This macro is used in metrics discovery. Can be overridden on the host or linked template level."
      value: ".*"
    - macro: "{$SLACK.ALARM.CHANNEL.NAME}"
      description: "Macro defining an additional endpoint for alerts"
      value: "myteam_alerts"
  tags:
    - tag: service_kind
      value: elasticsearch
  hosts:
    - host_name: elasticsearch-cluster-prod
      visible_name: Service Elasticsearch Cluster PROD
      host_groups:
        - Prometheus pseudo hosts
        - Production hosts
        - Elasticsearch
      link_templates:
        - templ_module_promt_service_elasticsearch_cluster
      status: enabled
      state: present
      proxy: zbx-pr02
      macros:
        - macro: "{$ELASTICSEARCH.CLUSTER.LLD.MATCHES}"
          value: "^prod-(us|eu)$"
    - host_name: elasticsearch-cluster-qa
      visible_name: Service Elasticsearch Cluster QA
      host_groups:
        - Prometheus pseudo hosts
        - QA hosts
        - Elasticsearch
      link_templates:
        - templ_module_promt_service_elasticsearch_cluster
      status: enabled
      state: present
      proxy: zbx-pr02
      macros:
        - macro: "{$ELASTICSEARCH.CLUSTER.LLD.MATCHES}"
          value: "^(qa-.*)$"

  # you can customize alert messages
  labels:
    slack_alarm_channel_name: '{$SLACK.ALARM.CHANNEL.NAME:"{{$labels.job}}"}'
    grafana_dashboard: 'L-abc/elasticsearch-clusters'
  • wiki_vars.yaml - here you can define docs for when the alert is triggered. These are passed to Slack/PagerDuty, so the on-duty person has links to all diagnostic tools and typically a hint (runbook) on how to handle the particular alert
wiki:
  templates:
    wrike_alert_config:
      templates:
        - name: 'elasticsearch cluster'
          title: 'Alerts regarding Elasticsearch clusters'
  knowledgebase:
    alerts:
      alertings:
        "elasticsearch_cluster_health_red_simple":
          title: "Elasticsearch cluster in RED state"
          content: |
            {{toc/}}
            # Overview
            This documentation is generated from a YAML file.
            # How to investigate the case
            * Go to Grafana dashboard, [link](https://grafana.company.net/d/L-abc/elasticsearch-clusters).
            ...
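Conceptually, the unified format is just these three legacy documents merged into one; a rough Python sketch under the assumption that each legacy file owns a distinct top-level key (the helper name is hypothetical - see the core/migration.py module for the real logic):

```python
def merge_legacy(alerts: dict, zabbix_vars: dict, wiki_vars: dict) -> dict:
    """Combine the three per-service legacy files into one unified config.

    Each legacy file owns a distinct top-level key (groups / zabbix / wiki),
    so a shallow merge is enough in the simple case.
    """
    unified: dict = {}
    unified.update(alerts)       # groups: recording + alerting rules
    unified.update(zabbix_vars)  # zabbix: template, hosts, macros, ...
    unified.update(wiki_vars)    # wiki: knowledgebase content
    return unified
```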

Examples

  • Generated pseudo-hosts in Zabbix

Hosts

Host - details

Host - macros

  • LLD rules for the discovery on a template

Template LLD

  • Zabbix Items collected on a host
    • There is one Item for executing a query to Prometheus/VictoriaMetrics
    • For each time-series in the result, there is a separate Item tracking that particular combination of labels (and an alert is connected to this Item)

Items

Items - collection

Items - regular

  • Item history is visible in Zabbix

Items - graph

  • When you receive an Alert in Slack, it contains links that help you respond faster

Alert - slack

  • If you receive an Alert via PagerDuty (mostly SysOps) - there is a clear connection to Zabbix Trigger (and Item triggering it)

Alert - slack

Alert - zabbix event

  • It is also possible to set up Zabbix Maintenance based on Tags - so you won't receive alerts (in either Slack or PagerDuty)

Maintenance

Limitations

  • You need to use aggregate functions (max, sum, etc.), otherwise the metrics are typically flaky or might lead to an explosion of items in Zabbix (this comes out of a lot of experience)
  • There are edge cases where the evaluation or logic is simply different between Zabbix and Prometheus (typically the for: clause)
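For example, prefer an aggregated expression (at most one stable series per cluster) over a raw per-pod one, which would create one Zabbix item per scraped pod and churn as pods come and go (the raw rule below is a hypothetical counter-example):

```yaml
groups:
  - name: recording_rules
    rules:
      # Good: at most one series per cluster, stable across pod restarts
      - record: elasticsearch_cluster_health_red
        expr: sum(elasticsearch_cluster_health_status{color="red"}) by (cluster)
      # Risky: one series (and one Zabbix item) per scraped pod
      # - record: elasticsearch_cluster_health_red_raw
      #   expr: elasticsearch_cluster_health_status{color="red"}
```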

Compatibility: Zabbix 6.0+

Development

Project Structure

The project follows a modular architecture:

src/promabbix/
├── cli/                    # CLI command implementations
│   ├── generate_template.py   # generateTemplate command
│   └── __init__.py
├── core/                   # Core business logic modules
│   ├── fs_utils.py            # File system operations (DataLoader, DataSaver)
│   ├── template.py            # Jinja2 template rendering
│   ├── validation.py          # Configuration validation
│   ├── migration.py           # Legacy format migration
│   └── data_utils.py          # Data validation utilities
├── schemas/               # JSON/YAML schemas
│   └── unified.yaml          # Unified configuration schema
├── templates/             # Jinja2 templates
│   └── prometheus_alert_rules_to_zbx_template.j2
└── promabbix.py          # Main CLI entry point

Local Development Setup

# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt -r requirements-dev.txt

# Install additional type checking dependencies
pip install mypy types-PyYAML

# Run the CLI locally
PYTHONPATH=src python -m promabbix.promabbix generateTemplate --help

Running Tests

The project includes comprehensive unit tests covering all functionality with 92% code coverage.

# Run all tests
python -m pytest tests/ -v

# Run tests with coverage
python -m pytest tests/ -v --cov=src/promabbix --cov-report=term-missing

# Run tests with coverage requirement (80% minimum)
python -m pytest tests/ -v --cov-fail-under=80 --cov=src/promabbix --cov-report=term-missing --cov-report=xml

# Run using the test runner script (recommended)
python run_tests.py

Code Quality Checks

The project maintains high code quality standards:

# Run linting
flake8 src/ --count --max-complexity=10 --max-line-length=127 --statistics

# Run type checking
mypy src/ --ignore-missing-imports

# All quality checks together (CI pipeline)
python run_tests.py && flake8 src/ --count --max-complexity=10 --max-line-length=127 --statistics && mypy src/ --ignore-missing-imports

Architecture Overview

  • CLI Layer: Click-based commands with proper argument parsing and help
  • Business Logic: Core modules for validation, template rendering, and file operations
  • Dependency Injection: Testable architecture with mockable dependencies
  • Error Handling: Rich console output with user-friendly error messages
  • Type Safety: Full type annotations and mypy compliance
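The dependency-injection point can be illustrated with a small sketch; the class and method names here are hypothetical, not Promabbix's actual API:

```python
class TemplateGenerator:
    """Pipeline whose collaborators are injected, so tests can pass fakes."""

    def __init__(self, loader, renderer, saver):
        self.loader = loader      # e.g. a wrapper over file loading
        self.renderer = renderer  # e.g. a wrapper over Jinja2 rendering
        self.saver = saver        # e.g. a wrapper over file saving

    def run(self, config_path: str, output_path: str) -> None:
        config = self.loader.load(config_path)
        self.saver.save(self.renderer.render(config), output_path)

# In a unit test, in-memory fakes replace the file system and templates:
class FakeLoader:
    def load(self, path):
        return {"zabbix": {"template": "demo"}}

class FakeRenderer:
    def render(self, config):
        return '{"template": "%s"}' % config["zabbix"]["template"]

class FakeSaver:
    def __init__(self):
        self.saved = {}
    def save(self, content, path):
        self.saved[path] = content

saver = FakeSaver()
TemplateGenerator(FakeLoader(), FakeRenderer(), saver).run("in.yaml", "out.json")
print(saver.saved["out.json"])
# -> {"template": "demo"}
```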

See tests/README.md for detailed information about the test suite.

Project details

Download files

promabbix 0.1.0 was published with two distributions, both uploaded via Trusted Publishing (twine/6.1.0, CPython/3.13.7) with attestation bundles from the pypi-publish.yml workflow on wrike/promabbix:

  • Source: promabbix-0.1.0.tar.gz (4.4 MB)
    • SHA256: a71acadb2bd035f7c36d781ebea5d98b098c23a56ab728fb0ec256ef982644e9
  • Built (wheel, Python 3): promabbix-0.1.0-py3-none-any.whl (26.6 kB)
    • SHA256: 192b5cbd700e275c06988d877a4c336a03343b0bc487ca4cfd707e9055e688cd