Read-only Prometheus MCP server for metrics querying, alerting, and monitoring inspection

Prometheus MCP Server

A read-only Model Context Protocol (MCP) server for Prometheus, enabling AI agents like Claude to query metrics, investigate alerts, and analyze monitoring data safely.

Python 3.12+ | MCP | License: MIT

Features

  • Read-Only Safety: All operations are read-only; the server never modifies Prometheus configuration or data
  • Comprehensive Query Support: Execute PromQL instant and range queries
  • Metadata Discovery: List metrics, labels, and series
  • Target Monitoring: View scrape targets and their health
  • Alert Investigation: List active alerts and alert rules
  • Configuration Inspection: View Prometheus configuration and runtime info
  • Multiple Auth Methods: Support for Bearer tokens and Basic authentication
  • Type-Safe: Full type hints for Python 3.12+

Installation

Using uv (Recommended)

# Clone the repository
git clone <repository-url>
cd prometheus-mcp-server

# Install with uv
uv pip install -e .

# Or install from PyPI (when published)
uv pip install prometheus-mcp-server

Using pip

pip install prometheus-mcp-server

Using pipx (for CLI usage)

pipx install prometheus-mcp-server

Quick Start

1. Set Environment Variables

# Prometheus server URL (defaults to http://localhost:9090 if unset)
export PROM_URL="http://localhost:9090"

# Optional: Bearer token authentication
export PROM_TOKEN="your_bearer_token"

# Optional: Basic authentication
export PROM_USERNAME="admin"
export PROM_PASSWORD="secret"

# Optional: Timeout and SSL settings
export PROM_TIMEOUT="30"
export PROM_VERIFY_SSL="true"

2. Run the Server

# Using stdio transport (default, for Claude Desktop)
prometheus-mcp-server

# Using HTTP transport
prometheus-mcp-server --transport http --port 8000

# With custom Prometheus URL
prometheus-mcp-server --url https://prometheus.example.com

3. Configure with Claude Desktop

Add to your Claude Desktop configuration (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "prometheus": {
      "command": "prometheus-mcp-server",
      "env": {
        "PROM_URL": "http://localhost:9090",
        "PROM_TOKEN": "optional_bearer_token"
      }
    }
  }
}

Or with uvx:

{
  "mcpServers": {
    "prometheus": {
      "command": "uvx",
      "args": ["prometheus-mcp-server"],
      "env": {
        "PROM_URL": "http://localhost:9090"
      }
    }
  }
}

Available Tools

Query Tools

query_instant

Execute a PromQL instant query at a single point in time.

# Check which targets are up
query_instant(query="up")

# Get API server status
query_instant(query='up{job="api-server"}', time="now")

# Get current request rate
query_instant(query="rate(http_requests_total[5m])")

query_range

Execute a PromQL query over a time range.

# Get request rate over last hour
query_range(
    query="rate(http_requests_total[5m])",
    start="now-1h",
    end="now",
    step="30s"
)

# Get CPU usage by instance
query_range(
    query="avg(cpu_usage) by (instance)",
    start="2024-01-15T00:00:00Z",
    end="2024-01-15T12:00:00Z",
    step="1m"
)

query_exemplars

Query exemplars for trace correlation.

query_exemplars(
    query="http_request_duration_seconds_bucket",
    start="now-1h",
    end="now"
)

Metadata Discovery Tools

list_metrics

List all available metrics.

# List all metrics
list_metrics()

# List HTTP-related metrics
list_metrics(match="http_.*")

# List metrics ending in _total (by convention, counters)
list_metrics(match=".*_total")
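
The `match` argument above looks like a regular expression over metric names. Assuming it behaves like a PromQL regex matcher, which must match the whole name, the filtering could be sketched as follows. `filter_metric_names` is a hypothetical helper, and Python's `re` differs in small ways from the RE2 syntax Prometheus uses.

```python
import re


def filter_metric_names(names: list[str], pattern: str) -> list[str]:
    """Keep names that match the whole pattern, mirroring PromQL's anchored regexes."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


# Illustrative metric names, not output from a real server.
names = [
    "http_requests_total",
    "http_request_duration_seconds_bucket",
    "up",
    "node_cpu_seconds_total",
]
print(filter_metric_names(names, "http_.*"))
print(filter_metric_names(names, ".*_total"))
```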

get_metric_metadata

Get metadata (type, help text) for metrics.

# Get metadata for a specific metric
get_metric_metadata(metric="http_requests_total")

# Get metadata for all metrics (limited to 100)
get_metric_metadata(limit=100)

list_labels

List all label names.

# List all labels
list_labels()

# List labels for specific job
list_labels(match=['{job="api"}'])

get_label_values

Get all values for a specific label.

# Get all job names
get_label_values(label="job")

# Get namespaces in prod cluster
get_label_values(
    label="namespace",
    match=['{cluster="prod"}']
)

find_series

Find time series matching label selectors.

# Find all series for a job
find_series(match=['{job="api"}'])

# Find all HTTP metrics in production
find_series(
    match=['{__name__=~"http_.*",env="production"}'],
    start="now-1h",
    end="now"
)

Target & Scrape Tools

list_targets

List all scrape targets and their status.

# List all targets
list_targets()

# List only active targets
list_targets(state="active")

# List only dropped targets
list_targets(state="dropped")

get_targets_metadata

Get metadata about metrics from targets.

# Get metadata for specific target
get_targets_metadata(match_target='{job="api-server"}')

# Get metadata for specific metric
get_targets_metadata(metric="http_requests_total")

Alert Tools

list_alerts

List all active alerts (firing and pending).

list_alerts()

list_rules

List all recording and alerting rules.

# List all rules
list_rules()

# List only alerting rules
list_rules(type="alert")

# List only recording rules
list_rules(type="record")

Configuration & Status Tools

get_config

Get the current Prometheus configuration.

get_config()

get_flags

Get Prometheus runtime flags.

get_flags()

get_runtime_info

Get Prometheus runtime information.

get_runtime_info()

get_tsdb_stats

Get TSDB statistics and cardinality.

get_tsdb_stats()

check_health

Check Prometheus health status.

check_health()

check_readiness

Check if Prometheus is ready to serve queries.

check_readiness()

PromQL Query Examples

Basic Queries

# Check target health
up

# Filter by job
up{job="api-server"}

# Get metric value
http_requests_total

# Multiple label filters
http_requests_total{job="api",status="200"}

Rate Queries

# Request rate over 5 minutes
rate(http_requests_total[5m])

# Sum rate by status code
sum(rate(http_requests_total[5m])) by (status)

# Instant rate (more sensitive to spikes)
irate(cpu_seconds_total[1m])

Aggregation

# Average CPU by instance
avg(cpu_usage) by (instance)

# Total memory by namespace
sum(memory_usage) by (namespace)

# Maximum response time by endpoint
max(response_time) by (endpoint)

# Count of targets by job
count(up) by (job)

Advanced Queries

# 95th percentile response time
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))

# Predict disk usage in 1 hour
predict_linear(disk_usage[1h], 3600)

# Temperature change over 5 minutes
delta(cpu_temp[5m])

# Per-second rate of change of a gauge
deriv(cpu_temp[5m])
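
To make the `histogram_quantile` example more concrete, the sketch below reproduces the basic idea in Python: find the bucket containing the requested rank, then interpolate linearly inside it. It is a simplified illustration, not Prometheus's implementation. It assumes sorted cumulative buckets that end with a +Inf bucket, positive bucket bounds, and 0 < q <= 1. The bucket values are made up.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate a quantile from cumulative (upper_bound, count) buckets."""
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    if index == len(buckets) - 1:
        # The rank falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    upper, count = buckets[index]
    lower, below = (0.0, 0.0) if index == 0 else buckets[index - 1]
    # Interpolate linearly within the bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


# Made-up cumulative counts per `le` bound, in seconds.
buckets = [(0.1, 50.0), (0.5, 90.0), (1.0, 100.0), (math.inf, 100.0)]
print(histogram_quantile(0.95, buckets))
```

In the PromQL query the inputs are per-second bucket rates rather than raw counts, but the interpolation step works the same way because only the ratios between buckets matter.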

Time Range Queries

# CPU usage increase over last hour vs previous hour
(avg_over_time(cpu_usage[1h]) - avg_over_time(cpu_usage[1h] offset 1h))

# Compare to last week
http_requests_total - http_requests_total offset 1w

Authentication

Bearer Token Authentication

export PROM_URL="https://prometheus.example.com"
export PROM_TOKEN="your_bearer_token"
prometheus-mcp-server

Basic Authentication

export PROM_URL="https://prometheus.example.com"
export PROM_USERNAME="admin"
export PROM_PASSWORD="secret"
prometheus-mcp-server

Kubernetes Service Account Token

# Get token from Kubernetes
TOKEN=$(kubectl get secret -n monitoring prometheus-token -o jsonpath='{.data.token}' | base64 -d)

export PROM_URL="https://prometheus.monitoring.svc.cluster.local:9090"
export PROM_TOKEN="$TOKEN"
prometheus-mcp-server

Configuration

Environment Variables

Variable          Description                  Default                 Required
PROM_URL          Prometheus server URL        http://localhost:9090   No
PROMETHEUS_URL    Alternative to PROM_URL      -                       No
PROM_TOKEN        Bearer token for auth        -                       No
PROMETHEUS_TOKEN  Alternative to PROM_TOKEN    -                       No
PROM_USERNAME     Username for basic auth      -                       No
PROM_PASSWORD     Password for basic auth      -                       No
PROM_TIMEOUT      Request timeout (seconds)    30                      No
PROM_VERIFY_SSL   Verify SSL certificates      true                    No

Command-Line Arguments

prometheus-mcp-server \
  --url https://prometheus.example.com \
  --token your_bearer_token \
  --timeout 60 \
  --transport http \
  --port 8000

Full options:

--url URL                  Prometheus server URL
--token TOKEN              Bearer token for authentication
--username USERNAME        Username for basic auth
--password PASSWORD        Password for basic auth
--timeout SECONDS          Request timeout in seconds (default: 30)
--no-verify-ssl            Disable SSL verification (not recommended)
--transport {stdio,http,sse}  Transport mechanism (default: stdio)
--host HOST                Host for HTTP/SSE transport (default: 127.0.0.1)
--port PORT                Port for HTTP/SSE transport (default: 8000)
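
For readers who want to wrap or test the CLI, the option list above maps naturally onto an `argparse` parser. The sketch below mirrors the documented flags and defaults. It is an illustration only, and the real entry point may be built with a different library or behave differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command-line options."""
    parser = argparse.ArgumentParser(prog="prometheus-mcp-server")
    parser.add_argument("--url", help="Prometheus server URL")
    parser.add_argument("--token", help="Bearer token for authentication")
    parser.add_argument("--username", help="Username for basic auth")
    parser.add_argument("--password", help="Password for basic auth")
    parser.add_argument("--timeout", type=int, default=30, help="Request timeout in seconds")
    parser.add_argument("--no-verify-ssl", action="store_true", help="Disable SSL verification")
    parser.add_argument("--transport", choices=["stdio", "http", "sse"], default="stdio")
    parser.add_argument("--host", default="127.0.0.1", help="Host for HTTP/SSE transport")
    parser.add_argument("--port", type=int, default=8000, help="Port for HTTP/SSE transport")
    return parser


args = build_parser().parse_args(["--transport", "http", "--port", "8000", "--timeout", "60"])
print(args.transport, args.port, args.timeout, args.no_verify_ssl)
```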

Use Cases

Incident Investigation

Ask questions like:

  • "What's the current CPU usage across all pods?"
  • "Show me the error rate for the API service in the last hour"
  • "Which targets are down right now?"
  • "What alerts are currently firing?"

Performance Analysis

  • "Compare request latency between now and 24 hours ago"
  • "Show me the top 10 endpoints by request volume"
  • "What's the memory usage trend for the worker pods?"

Capacity Planning

  • "What's the 95th percentile response time over the last week?"
  • "Show me the disk usage growth rate"
  • "Which services have the highest cardinality?"

Alert Analysis

  • "Why is the HighMemoryUsage alert firing?"
  • "Show me the history of the DiskSpaceLow alert"
  • "What's the current state of all alerting rules?"

Architecture

Project Structure

prometheus-mcp-server/
├── src/
│   └── prometheus_mcp_server/
│       ├── __init__.py          # Package initialization
│       ├── __main__.py          # CLI entry point
│       ├── server.py            # FastMCP server setup
│       ├── tools/
│       │   ├── __init__.py
│       │   └── registry.py      # All tool implementations
│       └── utils/
│           ├── __init__.py
│           ├── client.py        # Prometheus HTTP client
│           └── helpers.py       # Time parsing, formatting
├── pyproject.toml               # Project configuration
├── README.md                    # This file
└── USER_STORIES.md              # Detailed requirements

Technology Stack

  • MCP Framework: FastMCP 2.0
  • HTTP Client: httpx (async support)
  • Python: 3.12+ with full type hints
  • Authentication: Bearer token and Basic auth support
  • Read-Only: No write operations allowed

Development

Setup Development Environment

# Clone repository
git clone <repository-url>
cd prometheus-mcp-server

# Install with dev dependencies using uv
uv pip install -e ".[dev]"

# Or with pip
pip install -e ".[dev]"

Run Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=prometheus_mcp_server

# Run specific test file
pytest tests/test_client.py

Code Quality

# Format code
ruff format .

# Lint code
ruff check .

# Fix linting issues
ruff check --fix .

# Type checking (if using mypy)
mypy src/

Testing with Local Prometheus

# Run Prometheus locally with Docker
docker run -d -p 9090:9090 prom/prometheus

# Test the MCP server
export PROM_URL="http://localhost:9090"
prometheus-mcp-server

Security

Read-Only Guarantee

This MCP server is designed to be strictly read-only:

  • Only GET requests and specific read-only POST endpoints are allowed
  • All write operations (PUT, DELETE, PATCH) are blocked
  • No configuration modifications possible
  • No data deletion or manipulation

Blocked Operations

The following operations are blocked by design:

  • Creating or deleting targets
  • Modifying alert rules
  • Changing Prometheus configuration
  • Deleting time series data
  • Administrative operations

Safe Endpoints Only

Only these endpoint patterns are allowed:

  • GET /api/v1/* - All read operations
  • POST /api/v1/query* - Query operations only
  • POST /api/v1/series - Series discovery only
  • POST /api/v1/labels - Label queries only
  • GET /-/healthy - Health checks
  • GET /-/ready - Readiness checks
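
An allow-list like the one above can be enforced with a small check before any request is sent. The following is a minimal sketch of that idea, with the patterns transcribed from the list. `ALLOWED` and `is_allowed` are hypothetical names, and the server's actual enforcement may be implemented differently.

```python
import re

# (method, path pattern) pairs transcribed from the documented allow-list.
ALLOWED = [
    ("GET", re.compile(r"/api/v1/.*")),
    ("POST", re.compile(r"/api/v1/query.*")),
    ("POST", re.compile(r"/api/v1/series")),
    ("POST", re.compile(r"/api/v1/labels")),
    ("GET", re.compile(r"/-/healthy")),
    ("GET", re.compile(r"/-/ready")),
]


def is_allowed(method: str, path: str) -> bool:
    """Return True only if the method/path pair matches an allow-listed pattern."""
    method = method.upper()
    return any(m == method and pattern.fullmatch(path) for m, pattern in ALLOWED)


print(is_allowed("POST", "/api/v1/query_range"))
print(is_allowed("POST", "/api/v1/admin/tsdb/delete_series"))
```

Denying by default means a new Prometheus endpoint stays blocked until it is added to the list explicitly.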

Troubleshooting

Connection Issues

# Test Prometheus connectivity (quote the URL so the shell does not interpret "?")
curl -s "http://localhost:9090/api/v1/query?query=up"

# Check with authentication
curl -H "Authorization: Bearer $PROM_TOKEN" \
  "https://prometheus.example.com/api/v1/query?query=up"

SSL Certificate Issues

# Disable SSL verification (not recommended for production)
export PROM_VERIFY_SSL="false"
prometheus-mcp-server

# Or use command-line flag
prometheus-mcp-server --no-verify-ssl

Timeout Issues

# Increase timeout for slow queries
export PROM_TIMEOUT="60"
prometheus-mcp-server

# Or use command-line flag
prometheus-mcp-server --timeout 60

Debug Mode

Enable debug logging:

# Set log level
export LOG_LEVEL="DEBUG"
prometheus-mcp-server

Examples

Example Configuration Files

Claude Desktop Config (macOS)

~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "prometheus-local": {
      "command": "prometheus-mcp-server",
      "env": {
        "PROM_URL": "http://localhost:9090"
      }
    },
    "prometheus-prod": {
      "command": "prometheus-mcp-server",
      "env": {
        "PROM_URL": "https://prometheus.prod.example.com",
        "PROM_TOKEN": "prod_bearer_token",
        "PROM_TIMEOUT": "60"
      }
    }
  }
}

Linux Config

~/.config/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "prometheus": {
      "command": "/home/user/.local/bin/prometheus-mcp-server",
      "env": {
        "PROM_URL": "http://localhost:9090"
      }
    }
  }
}

Example Queries

Investigate High Memory Alert

User: "The HighMemoryUsage alert is firing. Help me investigate."

AI uses:
1. list_alerts() - See all active alerts
2. query_instant(query='container_memory_usage_bytes{job="api"}') - Check current usage
3. query_range(query='container_memory_usage_bytes{job="api"}', start="now-6h", end="now") - See trend
4. list_rules(type="alert") - Check alert threshold

Find Top CPU Consumers

User: "Which pods are using the most CPU?"

AI uses:
1. query_instant(query='topk(10, rate(container_cpu_usage_seconds_total[5m]))') - Top 10 CPU users
2. get_label_values(label="pod") - List all pods
3. query_range(...) - Check trend over time

Check Service Health

User: "Is the API service healthy?"

AI uses:
1. query_instant(query='up{job="api-server"}') - Check if targets are up
2. list_targets(state="active") - See all active targets
3. query_instant(query='rate(http_requests_total{job="api",status=~"5.."}[5m])') - Check error rate
4. list_alerts() - Check for any alerts
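
The first step of the health check above comes down to reading the `up` samples in an instant-query result. The sketch below shows how down targets could be picked out. It assumes the tool returns the standard Prometheus `/api/v1/query` JSON shape, which is not confirmed for this server. `find_down_targets` is a hypothetical helper and the sample response is hand-written.

```python
def find_down_targets(response: dict) -> list[dict[str, str]]:
    """Return the label sets of series whose value is 0 in an instant-query response."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    down = []
    for sample in response["data"]["result"]:
        _timestamp, value = sample["value"]  # the sample value arrives as a string
        if float(value) == 0.0:
            down.append(sample["metric"])
    return down


# Hand-written sample in the shape of a Prometheus instant-query response; not real data.
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "up", "job": "api-server", "instance": "10.0.0.1:8080"},
                "value": [1705320000, "1"],
            },
            {
                "metric": {"__name__": "up", "job": "api-server", "instance": "10.0.0.2:8080"},
                "value": [1705320000, "0"],
            },
        ],
    },
}
print(find_down_targets(sample_response))
```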

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure code passes ruff checks
  5. Submit a pull request

License

MIT License - see LICENSE file for details
