A Model Context Protocol (MCP) server for Prometheus Alertmanager alert management and incident response.
Alertmanager MCP Server
An MCP server that gives AI assistants the power to triage alerts, manage silences, inspect routing, and govern Alertmanager operations — from on-call summarization to notification pipeline testing.
Why Alertmanager MCP Server?
The problem: Alertmanager is the notification brain of the Prometheus ecosystem, but operating it effectively requires deep knowledge. Understanding the routing tree to know who gets paged, creating silences with the right matchers and durations, auditing why an alert didn't reach the right receiver, and managing maintenance windows — each of these requires familiarity with Alertmanager's configuration model. If you ask an AI assistant to help, it typically guesses at matcher syntax, creates overly broad silences, or can't explain the routing logic.
The solution: The Alertmanager MCP Server gives AI assistants (like Claude, Cline, or Cursor) structured, safe tools to operate Alertmanager natively. Instead of guessing at matchers or writing silence payloads from memory, your AI can now confidently manage the entire alert lifecycle:
- On-Call Triage: The AI summarizes active alerts grouped by severity and service, explains routing paths, and identifies alerts falling into the default route — all in one guided workflow.
- Safe Silence Management: Mandatory preview dry-runs before creating silences, duplicate detection, 24-hour duration caps, blast-radius warnings, and policy validation — preventing overly broad silences that could mask real incidents.
- Routing Introspection: Simulate routing for any label set (`amtool config routes test`-equivalent), inspect the full routing tree, list receivers with integration types, and audit which alerts hit the default route.
- Governance & Compliance: Export effective configuration for Git storage, audit recent silence changes with author tracking, and validate proposed silences against organizational policy.
- Multi-Backend Support: Manage multiple Alertmanager backends with an explicit `backend_id` on every call, with no hidden defaults.
Key Features
Backend Discovery & Multi-Backend
- Discover and inspect multiple Alertmanager backends
- Health checks, version info, cluster peer status
- Supports standalone and clustered Alertmanager deployments
Alert Triage & On-Call
- List and filter alerts by label, severity, state, and receiver
- Alert group inspection (Alertmanager's native grouping)
- Human-readable on-call summaries with severity/service breakdowns
- Push test alerts to verify notification integrations
Silence Lifecycle Management
- Full CRUD: create, update (extend), expire silences
- Mandatory preview dry-run before broad silences
- Duplicate silence detection — blocks creating equivalent active silences
- 24-hour duration cap (configurable)
- LLM-friendly `silence_alert` helper with scope control (instance/service/env)
- Policy validation for compliance checks
Routing & Notification Introspection
- Full nested routing tree inspection
- Receiver enumeration with integration type detection (Slack, PagerDuty, email, webhook)
- Route simulation for any label set with human-readable explanations
- Default route audit — identifies misconfigured alerts
Governance & Audit
- Export effective configuration as YAML or JSON
- Track recent silence lifecycle changes with author attribution
- In-memory audit log for all MCP-initiated operations
- Silence policy validation (duration caps, comment requirements, blast radius)
Production-Ready
- Structured logging (JSON/text)
- Environment-based configuration with multi-backend JSON support
Architecture
                       ┌─────────────────────────┐
                       │       MCP Client        │
                       │ (Claude, Cline, Cursor) │
                       └──────────┬──────────────┘
                                  │
                       ┌──────────▼──────────────┐
                       │   FastMCP Server Core   │
                       │  (HTTP / SSE / stdio)   │
                       └──────────┬──────────────┘
                                  │
         ┌────────────┬───────────┼───────────┬────────────┐
         │            │           │           │            │
    ┌────▼────┐  ┌────▼────┐ ┌────▼────┐ ┌────▼────┐  ┌────▼────┐
    │  Tools  │  │Resources│ │ Prompts │ │  Utils  │  │ Models  │
    │ (6 grp) │  │  (11)   │ │   (3)   │ │         │  │         │
    └────┬────┘  └────┬────┘ └─────────┘ └─────────┘  └─────────┘
         │            │
         └──────┬─────┘
                │
    ┌───────────▼───────────┐
    │     Service Layer     │
    │                       │
    │ alertmanager_service  │
    └───────────┬───────────┘
                │
    ┌───────────▼───────────┐
    │ Alertmanager HTTP API │
    │       (v2 API)        │
    └───────────────────────┘
How it works:
- An AI assistant connects via HTTP, SSE, or stdio.
- The AI loads the `am://system/backends` resource to discover available backends.
- Every subsequent tool call requires an explicit `backend_id`, so there is no hidden state.
- The service layer interacts with Alertmanager's v2 HTTP API.
- Safety guardrails enforce silence duration caps and blast-radius warnings.
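The "explicit `backend_id` on every call" rule above can be pictured with a small sketch. This is not the server's code: `Backend`, `BACKENDS`, `resolve_backend`, and `alerts_url` are hypothetical names, and the registry contents are made up for illustration. Only the `/api/v2/alerts` path comes from Alertmanager's v2 API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """Hypothetical stand-in for one entry of am://system/backends."""
    id: str
    base_url: str


# Assumed registry; a real client would fill this from the discovery resource.
BACKENDS = {
    "prod": Backend("prod", "https://alertmanager-prod.example.com"),
    "staging": Backend("staging", "https://alertmanager-staging.example.com"),
}


def resolve_backend(backend_id):
    """Return the backend for an explicit id; never fall back to a default."""
    if not backend_id:
        raise ValueError("backend_id is required on every call")
    if backend_id not in BACKENDS:
        known = ", ".join(sorted(BACKENDS))
        raise ValueError(f"unknown backend_id {backend_id!r}; known: {known}")
    return BACKENDS[backend_id]


def alerts_url(backend_id):
    """Build the v2 alerts endpoint for the chosen backend."""
    return resolve_backend(backend_id).base_url.rstrip("/") + "/api/v2/alerts"
```

Rejecting a missing id instead of silently picking a backend is the design point: a silence meant for staging cannot land on prod by accident.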
Table of Contents
- Why Alertmanager MCP Server?
- Key Features
- Architecture
- Tech Stack
- Getting Started
- Configuration
- Available Tools
- Available Resources
- Available Prompts
- Usage
- Project Structure
- Roadmap
- Contributing
- FAQ
- Troubleshooting
- Security Considerations
- License
- Contact
- Acknowledgments
Tech Stack
| Category | Technologies |
|---|---|
| Language | Python 3.12+ |
| MCP Framework | FastMCP ≥2.13.3 |
| Protocol | Model Context Protocol (MCP) |
| Alertmanager | HTTP API v2 · Silence API · Route Simulation |
| Transport | HTTP · SSE · Streamable-HTTP · stdio |
| Infrastructure | Docker · uv |
Getting Started
Prerequisites
- Docker (recommended) or Python 3.12+ (for local dev)
- Access to an Alertmanager instance (standalone or clustered)
Quick Start with Docker (recommended)
    docker run --rm -it \
      -p 8768:8768 \
      -e ALERTMANAGER_BASE_URL=http://host.docker.internal:9093 \
      -e MCP_TRANSPORT=http \
      talkopsai/alertmanager-mcp-server:latest
The server is now listening on http://localhost:8768/mcp.
Point your MCP client at it:
    {
      "mcpServers": {
        "alertmanager": {
          "url": "http://localhost:8768/mcp",
          "description": "MCP Server for Alertmanager alert triage, silence management, and routing"
        }
      }
    }
From Source (Python)
- Install uv for dependency management.
- Clone and set up:

      git clone https://github.com/talkops-ai/talkops-mcp.git
      cd talkops-mcp/src/alertmanager-mcp-server
      uv venv
      source .venv/bin/activate
      uv pip install -e ".[dev]"

- Configure your `.env`:

      ALERTMANAGER_BASE_URL=http://localhost:9093
      MCP_TRANSPORT=http
      MCP_LOG_LEVEL=INFO

- Run the server:

      uv run alertmanager-mcp-server

  Or, with the venv activated: `alertmanager-mcp-server`.
- Run tests:

      source .venv/bin/activate
      pytest tests/
Configuration
All configuration is via environment variables (loaded from .env via python-dotenv).
Server Configuration
| Variable | Default | Description |
|---|---|---|
| `MCP_SERVER_NAME` | `alertmanager-mcp-server` | Server name identifier |
| `MCP_SERVER_VERSION` | `0.1.0` | Server version string |
| `MCP_TRANSPORT` | `stdio` | Transport mode: `http`, `sse`, `streamable-http`, or `stdio` |
| `MCP_HOST` | `0.0.0.0` | Host address for HTTP server |
| `MCP_PORT` | `8768` | Port for HTTP server |
| `MCP_PATH` | `/mcp` | MCP endpoint path |
| `MCP_LOG_LEVEL` | `INFO` | Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR` |
| `MCP_LOG_FORMAT` | `json` | Log format: `json` or `text` |
Alertmanager Backend (Single)
| Variable | Default | Description |
|---|---|---|
| `ALERTMANAGER_BASE_URL` | `http://localhost:9093` | Alertmanager HTTP API base URL |
| `ALERTMANAGER_BACKEND_ID` | `default` | Backend identifier used in all tool calls |
| `ALERTMANAGER_DISPLAY_NAME` | (empty) | Human-readable backend name |
| `ALERTMANAGER_AUTH_HEADER` | (empty) | Authorization header value (e.g. `Bearer <token>`) |
| `ALERTMANAGER_VERIFY_SSL` | `true` | Verify SSL certificates |
| `ALERTMANAGER_TIMEOUT` | `30` | HTTP timeout for Alertmanager API calls (seconds) |
Alertmanager Backends (Multi)
For multiple backends, set ALERTMANAGER_BACKENDS as a JSON array:
ALERTMANAGER_BACKENDS='[
{"id": "prod", "base_url": "https://alertmanager-prod.example.com", "labels": {"env": "prod"}},
{"id": "staging", "base_url": "https://alertmanager-staging.example.com", "labels": {"env": "staging"}}
]'
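As a rough illustration of how the single-backend and multi-backend variables could be reconciled, here is a minimal sketch. `load_backends` is a hypothetical helper, not the project's `config.py`. It assumes `ALERTMANAGER_BACKENDS` takes precedence when set (the README does not state the precedence) and that each entry needs at least `id` and `base_url`.

```python
import json

DEFAULT_URL = "http://localhost:9093"


def load_backends(env):
    """Return a list of backend descriptors from an environment mapping."""
    raw = env.get("ALERTMANAGER_BACKENDS", "").strip()
    if not raw:
        # Fall back to the single-backend variables and their documented defaults.
        return [{
            "id": env.get("ALERTMANAGER_BACKEND_ID", "default"),
            "base_url": env.get("ALERTMANAGER_BASE_URL", DEFAULT_URL),
            "labels": {},
        }]

    backends = json.loads(raw)
    if not isinstance(backends, list):
        raise ValueError("ALERTMANAGER_BACKENDS must be a JSON array")

    seen = set()
    for entry in backends:
        if not isinstance(entry, dict) or "id" not in entry or "base_url" not in entry:
            raise ValueError("each backend needs 'id' and 'base_url'")
        if entry["id"] in seen:
            raise ValueError(f"duplicate backend id: {entry['id']}")
        seen.add(entry["id"])
        entry.setdefault("labels", {})  # labels are treated as optional here
    return backends
```

Calling `load_backends(dict(os.environ))` at startup would surface a malformed JSON array or a duplicate `id` before any tool call is served.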
Silence Safety
| Variable | Default | Description |
|---|---|---|
| `AM_MAX_SILENCE_MINUTES` | `1440` | Maximum silence duration in minutes (24h) |
| `AM_SILENCE_WARNING_THRESHOLD` | `50` | Warn if a silence would affect ≥ N alerts |
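The two limits above lend themselves to a short sketch. `check_silence_window` is a hypothetical function, and treating the cap as a hard error while the threshold only yields a warning is an assumption based on the descriptions in the table.

```python
from datetime import datetime, timedelta, timezone


def check_silence_window(starts_at, ends_at, affected_alerts,
                         max_minutes=1440, warning_threshold=50):
    """Return a list of warnings; raise ValueError when the cap is exceeded."""
    if ends_at <= starts_at:
        raise ValueError("silence must end after it starts")

    minutes = (ends_at - starts_at).total_seconds() / 60
    if minutes > max_minutes:
        raise ValueError(
            f"silence lasts {minutes:.0f} min, cap is {max_minutes} min"
        )

    warnings = []
    if affected_alerts >= warning_threshold:
        # Blast-radius warning: the silence is allowed but flagged.
        warnings.append(
            f"silence would affect {affected_alerts} alerts "
            f"(threshold {warning_threshold})"
        )
    return warnings
```

A two-hour window covering three alerts passes cleanly; the same window covering 50 alerts returns one warning, and a 25-hour window is rejected outright under the default cap.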
Available Tools
Alert Triage
| Tool | Description |
|---|---|
| `am_list_alerts` | List alerts with label/state filters and pagination. |
| `am_list_alert_groups` | List alert groups as computed by Alertmanager for high-level triage. |
| `am_push_test_alert` | Fire a synthetic test alert to verify notification integrations. |
Silence Lifecycle
| Tool | Description |
|---|---|
| `am_list_silences` | List silences with optional state filter and pagination. |
| `am_create_silence` | Create a silence to suppress matching alerts (with duplicate detection). |
| `am_update_silence` | Update an existing silence (extend duration or modify end time). |
| `am_expire_silence` | Expire a silence to reactivate alert notifications. |
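Duplicate detection can be sketched as a comparison of matcher sets. The README does not define "equivalent", so this sketch assumes it means an active silence with exactly the same matchers, regardless of order. `matcher_key` and `find_duplicate` are hypothetical names; the `matchers`, `isRegex`, `isEqual`, and `status.state` fields follow the shape of silences returned by Alertmanager's v2 API.

```python
def matcher_key(matchers):
    """Order-independent fingerprint of a matcher list."""
    return frozenset(
        (
            m["name"],
            m["value"],
            bool(m.get("isRegex", False)),
            bool(m.get("isEqual", True)),
        )
        for m in matchers
    )


def find_duplicate(proposed_matchers, silences):
    """Return the id of an active silence with identical matchers, else None."""
    wanted = matcher_key(proposed_matchers)
    for silence in silences:
        if silence.get("status", {}).get("state") != "active":
            continue  # expired and pending silences do not block creation here
        if matcher_key(silence["matchers"]) == wanted:
            return silence["id"]
    return None
```

Comparing sets instead of lists means `service=payments, env=prod` and `env=prod, service=payments` count as the same silence.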
Silence Helpers
| Tool | Description |
|---|---|
| `am_preview_silence` | Preview the blast radius of a silence before creating it. |
| `am_silence_alert` | Create a narrowly-scoped silence for a specific alert (fingerprint or labels). |
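To make "blast radius" concrete, the sketch below counts which alerts a set of matchers would cover. It follows Alertmanager's matcher semantics as documented: regex matchers are fully anchored, a missing label is compared as an empty string, and all matchers must hold. `matcher_matches` and `preview_silence` are hypothetical names, and Python's `re` stands in for Go's RE2, so unusual patterns may behave differently.

```python
import re


def matcher_matches(matcher, labels):
    """Evaluate one matcher (=, !=, =~, !~) against an alert's labels."""
    value = labels.get(matcher["name"], "")  # absent label acts as ""
    if matcher.get("isRegex", False):
        hit = re.fullmatch(matcher["value"], value) is not None
    else:
        hit = value == matcher["value"]
    # isEqual=False flips the result, giving != and !~.
    return hit if matcher.get("isEqual", True) else not hit


def preview_silence(matchers, alerts):
    """Return the alerts that a silence with these matchers would suppress."""
    if not matchers:
        raise ValueError("at least one matcher is required")
    return [
        alert for alert in alerts
        if all(matcher_matches(m, alert["labels"]) for m in matchers)
    ]
```

Running the preview before creation shows whether a matcher such as `env=~"prod|staging"` covers far more alerts than intended.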
Routing & Notifications
| Tool | Description |
|---|---|
| `am_explain_routing` | Simulate routing and inhibition for a given label set with explanation. |
| `am_audit_default_route` | Show alerts falling into the default route, highlighting misconfigurations. |
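Route simulation can be illustrated with a simplified walk of the routing tree. The traversal mirrors Alertmanager's documented behaviour: children are tried in order, the walk stops at the first matching child unless it sets `continue`, and a node with no matching children handles the alert itself. The `Route` class and `route_alert` are hypothetical, support equality matchers only, ignore inhibition, and are not the server's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    """Simplified routing node (equality matchers only)."""
    receiver: str
    match: dict = field(default_factory=dict)
    routes: list = field(default_factory=list)
    continue_: bool = False  # trailing underscore: `continue` is a keyword


def route_alert(route, labels):
    """Return the receivers that would be notified for this label set."""
    if any(labels.get(name, "") != value for name, value in route.match.items()):
        return []

    receivers = []
    for child in route.routes:
        matched = route_alert(child, labels)
        receivers.extend(matched)
        if matched and not child.continue_:
            break  # first match wins unless the child asks to continue

    # No child matched: the current node handles the alert.
    return receivers or [route.receiver]


def hits_default_route(root, labels):
    """True when only the root receiver would be notified."""
    return route_alert(root, labels) == [root.receiver]
```

An alert for which `hits_default_route` is true is the kind of case a default route audit would flag: it matched no specific child route.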
Governance & Audit
| Tool | Description |
|---|---|
| `am_list_recent_changes` | List recent silence changes (created, expired, updated) within a time window. |
| `am_validate_silence_policy` | Validate a proposed silence against organizational policy before creation. |
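A time-window filter over silences might look like the following sketch. `recent_changes` is a hypothetical helper; it assumes each silence carries the `updatedAt`, `createdBy`, and `status.state` fields that Alertmanager's v2 API returns, and it treats `updatedAt` as the time of the latest change.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value):
    """Parse an RFC 3339 timestamp; 'Z' is rewritten for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def recent_changes(silences, now, window=timedelta(hours=24)):
    """Return silences changed within the window, newest first."""
    cutoff = now - window
    changes = []
    for silence in silences:
        updated = parse_ts(silence["updatedAt"])
        if updated >= cutoff:
            changes.append({
                "id": silence["id"],
                "state": silence["status"]["state"],
                "author": silence["createdBy"],
                "updated_at": updated,
            })
    return sorted(changes, key=lambda c: c["updated_at"], reverse=True)
```

Passing `now` explicitly keeps the function deterministic, which makes the window logic easy to test.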
On-Call Triage
| Tool | Description |
|---|---|
| `am_summarize_oncall` | Generate a human-readable on-call summary of active alerts. |
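A severity and service breakdown of this kind can be sketched with two counters. `summarize_oncall` is a hypothetical function, and the `severity` and `service` label names are assumptions about how alerts are labelled; the tool's real output format is not documented here.

```python
from collections import Counter


def summarize_oncall(alerts):
    """Build a plain-text summary grouped by severity and by service."""
    by_severity = Counter()
    by_service = Counter()
    for alert in alerts:
        labels = alert.get("labels", {})
        by_severity[labels.get("severity", "unknown")] += 1
        by_service[labels.get("service", "unknown")] += 1

    lines = [f"{len(alerts)} active alert(s)", "By severity:"]
    lines += [f"  {name}: {count}" for name, count in by_severity.most_common()]
    lines.append("By service:")
    lines += [f"  {name}: {count}" for name, count in by_service.most_common()]
    return "\n".join(lines)
```

Alerts without one of the two labels are counted under `unknown`, so gaps in labelling stay visible in the summary.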
Available Resources
| Resource URI | Description |
|---|---|
| `am://system/backends` | All known backends with health status; use this as the first step in any workflow |
| `am://system/backends/{backend_id}` | Detailed status, version, cluster info, and health for a specific backend |
| `am://system/status` | Alertmanager version, uptime, cluster info, and config summary |
| `am://system/receivers` | Configured receivers (Slack, PagerDuty, email, webhook) with redacted config |
| `am://system/config` | Routing tree and inhibition rules (secrets redacted) |
| `am://system/audit-log` | Recent MCP-initiated operations (create/expire/extend silence, push test alert) |
| `am://alerts/active` | Bounded snapshot of active alerts for default backend |
| `am://alerts/groups` | Snapshot of alert groups as computed by Alertmanager |
| `am://silences/active` | Snapshot of active silences for default backend |
| `am://best-practices` | Alerting best practices |
| `am://onboarding-guide` | Alert onboarding guide |
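The in-memory audit log behind `am://system/audit-log` could be approximated by a bounded buffer. The sketch below is an assumption about the general shape, not the project's `utils/audit.py`: `AuditLog`, its field names, and the 1000-entry bound are all hypothetical.

```python
from collections import deque
from datetime import datetime, timezone


class AuditLog:
    """Bounded in-memory log; the oldest entries are dropped first."""

    def __init__(self, max_entries=1000):
        self._entries = deque(maxlen=max_entries)

    def record(self, operation, backend_id, detail):
        """Append one entry and return it."""
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "operation": operation,
            "backend_id": backend_id,
            "detail": detail,
        }
        self._entries.append(entry)
        return entry

    def recent(self, limit=50):
        """Return up to `limit` entries, newest first."""
        if limit <= 0:
            return []
        return list(self._entries)[-limit:][::-1]
```

Because the log lives in process memory, entries in a design like this are lost on restart; anything that must persist would need to be exported elsewhere.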
Available Prompts
Guided workflow prompts that orchestrate multiple tools into step-by-step journeys:
| Prompt Name | Description | Parameters |
|---|---|---|
| `am-alert-triage-guided` | Guided workflow for triaging active alerts | `backend_id`, `service`, `env` |
| `am-maintenance-silence-guided` | Guided workflow for creating a maintenance silence | `backend_id`, `service`, `env`, `duration` |
| `am-integration-test-guided` | Guided workflow for testing notification integrations (Slack, PagerDuty) | `backend_id`, `team`, `receiver` |
Usage
Supported workflows with prompt examples and links to detailed guides:
| Workflow | Prompt Example | Documentation |
|---|---|---|
| On-Call Triage | "Summarize what's firing right now for the checkout service in prod." | AM_TRIAGE_TEST_GUIDE.md |
| Maintenance Silence | "Silence alerts for the payments service in prod for 2 hours during deployment." | AM_SILENCE_TEST_GUIDE.md |
| Routing Audit | "Who gets paged when a critical alert fires for the api-server?" | AM_GOVERNANCE_TEST_GUIDE.md |
| Integration Testing | "Push a test alert to verify that the slack-sre receiver is working." | AM_GOVERNANCE_TEST_GUIDE.md |
| Governance Review | "Show me all silence changes in the last 24 hours and who created them." | AM_GOVERNANCE_TEST_GUIDE.md |
See WORKFLOW_JOURNEYS.md for the full workflow reference and PROMPT_REFERENCE.md for natural-language prompts.
Project Structure
alertmanager-mcp-server/
├── alertmanager_mcp_server/ # Main package
│ ├── tools/ # MCP Tools (6 active tool groups, 14 tools)
│ │ ├── alert_tools.py # Alert listing, grouping, test alerts
│ │ ├── silence_tools.py # Silence CRUD lifecycle
│ │ ├── helper_tools.py # Preview & quick silence helpers
│ │ ├── routing_tools.py # Routing simulation, default route audit
│ │ ├── governance_tools.py # Audit, policy validation
│ │ └── triage_tools.py # On-call summarization
│ ├── resources/ # MCP Resources (11 URIs)
│ │ ├── backend_resources.py # Backend health & capabilities
│ │ ├── alert_resources.py # Active alerts & groups
│ │ ├── silence_resources.py # Active silences
│ │ ├── config_resources.py # Receivers & routing config
│ │ ├── status_resources.py # Version, uptime, cluster info
│ │ ├── audit_resources.py # MCP operation audit log
│ │ └── static_resources.py # Best practices & onboarding guide
│ ├── prompts/ # MCP Prompts (3 guided workflows)
│ │ ├── triage_prompts.py # Alert triage workflow
│ │ ├── silence_prompts.py # Maintenance silence workflow
│ │ └── onboarding_prompts.py # Integration test workflow
│ ├── services/ # Business logic
│ │ └── alertmanager_service.py # Alertmanager HTTP API wrapper
│ ├── server/ # FastMCP server setup
│ │ ├── core.py # Server creation
│ │ └── bootstrap.py # Component initialization
│ ├── models/ # Pydantic data models
│ │ ├── alert.py # Alert & AlertMatcher
│ │ ├── silence.py # Silence & PostableSilence
│ │ ├── backend.py # BackendDescriptor
│ │ ├── config.py # ConfigSnapshot, RouteNode, Receiver
│ │ └── audit.py # AuditEntry
│ ├── utils/ # Helpers
│ │ ├── __init__.py # Matcher logic, silence window calc
│ │ └── audit.py # In-memory audit log
│ ├── static/ # Static documentation
│ │ ├── ALERTMANAGER_BEST_PRACTICES.md
│ │ ├── ALERTMANAGER_ONBOARDING_GUIDE.md
│ │ └── ALERTMANAGER_MCP_INSTRUCTIONS.md
│ ├── exceptions/ # Custom exception hierarchy
│ ├── config.py # Environment parsing
│ └── main.py # Entry point
├── tests/ # Test suites
├── docs/ # Documentation
├── pyproject.toml # Package definitions (Python 3.12)
└── README.md # This documentation
Roadmap
Shipped:
- Multi-backend discovery with health checks
- Alert listing with label/state filtering and pagination
- Alert group inspection (Alertmanager native grouping)
- Full silence lifecycle (create, update, expire) with safety guardrails
- Silence preview dry-run with blast-radius analysis
- Duplicate silence detection
- LLM-friendly silence_alert helper with scope control
- Full routing tree introspection
- Route simulation with human-readable explanations
- Receiver enumeration with integration type detection
- Default route audit for misconfiguration detection
- On-call alert summarization grouped by severity/service
- Silence policy validation (duration caps, comment requirements)
- Config export (YAML/JSON) for Git storage
- Silence change audit with author tracking
- Test alert injection for integration verification
- In-memory audit log for all MCP operations
- 3 guided workflow prompts (triage, silence, integration test)
Coming next:
- Prometheus MCP cross-integration for metric-level diagnostics
- AlertmanagerConfig CRD management for Prometheus Operator
- Silence templates for recurring maintenance windows
- Webhook receiver testing with response validation
- Multi-tenant silence policies with team-scoped permissions
See open issues for the full list of proposed features.
Contributing
Contributions are welcome. The process is straightforward:
- Fork the repo
- Create a branch (`git checkout -b feature/SilenceTemplates`)
- Make your changes and commit
- Push and open a PR
If you're considering something bigger, open an issue first so we can align on the approach.
FAQ
Which MCP clients work with this?
Any MCP-compatible client, including Claude Desktop, Cline, Cursor, and custom clients. Connect via `http://localhost:8768/mcp` for HTTP transport, or configure stdio for direct process communication.
Does this modify my Alertmanager configuration?
Most tools are read-only. The exceptions are `am_create_silence`, `am_update_silence`, `am_expire_silence`, and `am_silence_alert` (which create, update, or expire silences), and `am_push_test_alert` (which fires a real alert into Alertmanager). Governance and routing tools are strictly read-only: they inspect but never modify configuration.
Why does the server enforce silence duration caps?
An unbounded silence can keep masking real incidents long after the maintenance that justified it has ended. The default 24-hour cap keeps silences time-boxed. If a maintenance window needs to be extended, use `am_update_silence` to extend it incrementally. Override the cap via `AM_MAX_SILENCE_MINUTES`.
Can I use this with a clustered Alertmanager?
Yes. Point `ALERTMANAGER_BASE_URL` at any cluster member or a load balancer. The server uses the standard Alertmanager v2 API, and Alertmanager itself replicates silences between cluster peers.
How does it relate to the Prometheus MCP Server?
They are complementary. The Prometheus MCP Server handles metric querying, exporter deployment, and TSDB management. The Alertmanager MCP Server handles alert triage, silences, routing, and notification management. Use both together for full observability coverage.
Troubleshooting
Backend Connection Issues
- Verify `ALERTMANAGER_BASE_URL` points to a reachable Alertmanager instance.
- Load the `am://system/backends` resource to check health status.
- If using auth, verify `ALERTMANAGER_AUTH_HEADER` is set correctly.
- For SSL issues, try `ALERTMANAGER_VERIFY_SSL=false` (development only).
Silence Creation Failures
- Duration cap exceeded: The default cap is 24 hours (1440 minutes). Increase `AM_MAX_SILENCE_MINUTES` or use shorter durations.
- Duplicate silence: An equivalent active silence already exists. Use `am_list_silences` to find it.
- Missing matchers: At least one matcher is required. Use `am_preview_silence` first to validate.
Routing Simulation Issues
- Empty routing tree: The Alertmanager instance may not have a configuration loaded. Check `am://system/config`.
- No receivers found: Verify Alertmanager has receivers configured in its `alertmanager.yml`.
- Unexpected routing: Use `am_explain_routing` with the specific alert labels to trace the routing path.
Security Considerations
- Never expose the MCP server to the public internet without proper authentication.
- Silences affect real alert notifications — always preview before creating silences in production.
- Test alerts fire real notifications: `am_push_test_alert` will trigger downstream integrations (Slack, PagerDuty, email).
- Configuration export may contain sensitive routing rules, so treat exported configs as confidential.
License
Apache 2.0 — see LICENSE.
Contact
TalkOps AI — github.com/talkops-ai
Project: github.com/talkops-ai/talkops-mcp
Discord: Join the community
Acknowledgments
- Model Context Protocol for enabling AI-native tool interfaces.
- FastMCP for the Python MCP server framework.
- Alertmanager for the alert notification engine.
- Prometheus for the foundational monitoring ecosystem.