mcp-deploy-intel
Kubernetes deployment intelligence — typed query tools + AI-synthesised risk briefs. CLI + MCP server.
Record every rollout, score outcomes automatically, and get LLM-authored risk briefs before you promote to production. Works as a standalone CLI, a Model Context Protocol (MCP) server for Claude Desktop / Claude Code / any MCP client, a Docker image, a Helm chart (in-cluster watcher + evaluator Deployment), and a GitHub Action (pre-promote gate).
- Status: v1.0.0 — production-ready.
- PyPI: mcp-deploy-intel
- Image: ghcr.io/vellankikoti/mcp-deploy-intel:v1.0.0 (multi-arch, cosign keyless-signed)
- Source: this repo
- License: Apache-2.0
Table of contents
- What it does — the 9 tools
- Install
- Quickstart — one-liners
- CLI reference
- Watcher + evaluator lifecycle
- MCP server — Claude Desktop / Claude Code setup
- LLM configuration
- GitHub Action — pre-promote gate
- Helm chart — in-cluster Deployment
- Verify the container signature (cosign)
- Exit codes
- Troubleshooting
- Design & plans
- Development
What it does — the 9 tools
All tools are available via CLI, MCP, and the composite GitHub Action.
| # | Tool | Category | Needs | What it returns |
|---|---|---|---|---|
| 1 | list_workloads_in_namespace | K8s query | k8s | Inventory table of all Deployments + StatefulSets in a namespace. |
| 2 | get_workload | K8s query | k8s | Full snapshot: container specs, resource requests, probes, events. |
| 3 | get_metric_trend | Prometheus | k8s + prom | TimeSeries for any PromQL expression over a sliding window. |
| 4 | get_inbound_traffic | Prometheus | k8s + prom | Upstream callers inferred from Prometheus topology. |
| 5 | get_outbound_traffic | Prometheus | k8s + prom | Downstream callees inferred from Prometheus topology. |
| 6 | record_deploy | Deploy history | k8s + sqlite | Captures a K8s pre-snapshot and inserts a status=in_progress deploy record. |
| 7 | get_deploy_history | Deploy history | sqlite | Most-recent-first deploy records for a workload. |
| 8 | get_rollback_history | Deploy history | sqlite | Deploy records that ended in ROLLED_BACK status. |
| 9 | generate_risk_brief | AI synthesis | k8s + sqlite + llm | LLM-authored structured RiskBrief with overall_risk, reasons, dependency_risks, historical_signals, recommendations; deterministic offline fallback. |
Run deploy-intel --help to see all commands. All 9 tools are also exposed via the MCP server.
Install
Pick the one that matches how you want to use it. No setup step is required beyond installing the tool itself.
A. Run ephemerally via uvx (recommended for one-off queries)
# Install uv if you don't have it (macOS/Linux):
curl -LsSf https://astral.sh/uv/install.sh | sh
# Then run the tool without installing:
uvx mcp-deploy-intel list-workloads-in-namespace default
uvx downloads the package and runs it in an isolated env. Nothing sticks to your global Python.
B. Install with pip / uv pip (if you want it on PATH)
pip install mcp-deploy-intel # system/user pip
# or
uv pip install mcp-deploy-intel # uv's pip (faster)
Then: deploy-intel --help.
C. Docker image (cosign-signed, multi-arch)
docker pull ghcr.io/vellankikoti/mcp-deploy-intel:v1.0.0
docker run --rm \
-v ~/.kube/config:/home/di/.kube/config:ro \
-e KUBECONFIG=/home/di/.kube/config \
ghcr.io/vellankikoti/mcp-deploy-intel:v1.0.0 \
list-workloads-in-namespace default
macOS + kind note: the container's 127.0.0.1 doesn't point to the kind API server on the host. Use a remote cluster, or run the CLI directly (uvx/pip) against local kind.
D. Helm chart (in-cluster watcher + evaluator Deployment)
See Helm chart — in-cluster Deployment below.
E. GitHub Action (pre-promote gate)
See GitHub Action — pre-promote gate below.
Quickstart — one-liners
Point kubectl at your cluster first (any cluster: kind, EKS, GKE, AKS, bare-metal). deploy-intel uses your current kubeconfig context unless you pass --context or --kubeconfig.
# 1. List all workloads in a namespace
uvx mcp-deploy-intel list-workloads-in-namespace default
# 2. Snapshot a single workload in Markdown
uvx mcp-deploy-intel get-workload default/Deployment/my-api --format md
# 3. Start the watcher — records every new rollout to SQLite
deploy-intel watch --namespace prod --db-path /tmp/deploy.db
# 4. Run the evaluator — scores in_progress deploys every 60 s
deploy-intel evaluate --db-path /tmp/deploy.db --no-llm
# 5. Get the deploy history for a workload
deploy-intel get-deploy-history default Deployment my-api --db-path /tmp/deploy.db
# 6. Generate an AI risk brief before promoting
deploy-intel risk-brief default/Deployment/my-api \
--target-image my-api:v2.3.1 \
--db-path /tmp/deploy.db \
--format md
CLI reference
All commands share these global behaviours:
- Default kubeconfig: the $KUBECONFIG env var, falling back to ~/.kube/config.
- Default context: whatever kubectl config current-context would return.
- Exit codes: see Exit codes.
list-workloads-in-namespace — namespace inventory
deploy-intel list-workloads-in-namespace <namespace> [--context TEXT] [--kubeconfig PATH]
get-workload — single workload snapshot
deploy-intel get-workload <namespace>/<kind>/<name> [--context TEXT] [-f md|json]
watch — K8s informer loop
deploy-intel watch [--namespace NS]... [--context TEXT] [--kubeconfig PATH] [--db-path PATH]
Streams Deployment and StatefulSet events. Deduplicates by (uid, generation). Inserts one DeployRecord per new rollout. Uses exponential backoff (1–60 s) on reconnect; honours SIGINT/SIGTERM.
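The dedup and backoff behaviour described above can be sketched like this (an illustrative model, not the watcher's actual implementation):

```python
class RolloutDeduper:
    """One record per (uid, generation) pair, as the watcher deduplicates."""
    def __init__(self) -> None:
        self._seen: set[tuple[str, int]] = set()

    def is_new_rollout(self, uid: str, generation: int) -> bool:
        key = (uid, generation)
        if key in self._seen:
            return False        # repeated informer event for the same rollout
        self._seen.add(key)
        return True

def backoff_delays(base: float = 1.0, cap: float = 60.0):
    """Exponential reconnect backoff clamped to the documented 1-60 s range."""
    delay = base
    while True:
        yield delay
        delay = min(delay * 2, cap)

d = RolloutDeduper()
print(d.is_new_rollout("abc", 3))  # True: first event for this generation
print(d.is_new_rollout("abc", 3))  # False: duplicate, skipped
print(d.is_new_rollout("abc", 4))  # True: a new rollout bumps the generation
```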
evaluate — scoring loop
deploy-intel evaluate [--window-s N] [--interval-s N] [--prom-url URL] [--db-path PATH] [--no-llm]
Picks up in_progress deploys older than window-s seconds, fetches a post-snapshot, optionally queries Prometheus error rates, and writes the final status back to SQLite:
| Status | Condition |
|---|---|
| ROLLED_BACK | current generation < recorded revision, or workload is gone. |
| DEGRADED | Prom error rate doubled above threshold, or replicas_ready < replicas_desired. |
| SUCCESS | replicas stable, no error-rate regression. |
| UNKNOWN | no Prom and no replica signal available. |
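The decision table above can be read as an ordered rule chain. Here is a sketch of those documented conditions (the evaluator's real code may differ in detail):

```python
def score_deploy(current_generation, recorded_revision, workload_exists,
                 replicas_ready, replicas_desired, error_rate_ratio=None,
                 doubling_threshold=2.0):
    """Apply the documented conditions in priority order."""
    if not workload_exists or current_generation < recorded_revision:
        return "ROLLED_BACK"
    if error_rate_ratio is not None and error_rate_ratio >= doubling_threshold:
        return "DEGRADED"
    if replicas_ready is not None and replicas_desired is not None:
        if replicas_ready < replicas_desired:
            return "DEGRADED"
        return "SUCCESS"
    if error_rate_ratio is not None:
        return "SUCCESS"  # error rate available and below threshold
    return "UNKNOWN"      # no Prometheus and no replica signal

print(score_deploy(5, 7, True, 3, 3))        # ROLLED_BACK: generation regressed
print(score_deploy(7, 7, True, 2, 3))        # DEGRADED: not all replicas ready
print(score_deploy(7, 7, True, 3, 3, 1.1))   # SUCCESS
print(score_deploy(7, 7, True, None, None))  # UNKNOWN
```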
record-deploy — manual deploy capture
deploy-intel record-deploy <namespace> <kind> <name> <revision> <image> [--actor TEXT] [--commit-sha TEXT] [--db-path PATH]
get-deploy-history — query history
deploy-intel get-deploy-history <namespace> <kind> <name> [--limit N] [--db-path PATH]
get-rollback-history — query rollbacks
deploy-intel get-rollback-history <namespace> <kind> <name> [--limit N] [--db-path PATH]
risk-brief — AI risk assessment
deploy-intel risk-brief <namespace>/<kind>/<name> --target-image <image>
[--prom-url URL] [--llm-provider TEXT] [--llm-base-url TEXT] [--llm-api-key TEXT]
[--no-llm] [--db-path PATH] [-f md|json]
Fans out in parallel to get_workload, get_metric_trend, get_inbound_traffic, get_deploy_history, and get_rollback_history; assembles evidence; calls the LLM for a structured RiskBrief; falls back to rule-based scoring when offline.
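The fan-out step can be sketched with a thread pool. The five functions below are hypothetical stand-ins for the tool calls, not the project's actual signatures:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the five evidence-gathering tools.
def get_workload(ref):         return {"workload": ref}
def get_metric_trend(ref):     return {"trend": "flat"}
def get_inbound_traffic(ref):  return {"callers": []}
def get_deploy_history(ref):   return {"deploys": 12}
def get_rollback_history(ref): return {"rollbacks": 1}

def gather_evidence(ref: str) -> dict:
    """Run all five queries concurrently and merge their results."""
    tasks = [get_workload, get_metric_trend, get_inbound_traffic,
             get_deploy_history, get_rollback_history]
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        results = pool.map(lambda fn: fn(ref), tasks)
    evidence: dict = {}
    for r in results:
        evidence.update(r)
    return evidence

print(gather_evidence("default/Deployment/my-api"))
```

The merged evidence dict is what gets handed to the LLM (or to the rule-based fallback) to produce the RiskBrief.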
serve-mcp — run the MCP stdio server
deploy-intel serve-mcp
Intended to be wired into an MCP client (Claude Desktop, Claude Code, any MCP-compatible tool). Exposes all 9 tools.
Watcher + evaluator lifecycle
The watcher and evaluator are designed to run as a pair — either locally in two terminals, or as a two-container Deployment in cluster.
┌──────────────┐
K8s events ──► │ watcher │ ──► SQLite (history.db)
└──────────────┘ │
│ (shared PVC / shared path)
┌──────────────┐ │
every 60 s ──► │ evaluator │ ◄────────┘
└──────────────┘
│
▼
ROLLED_BACK / DEGRADED / SUCCESS / UNKNOWN
Both containers run as UID 10001, read-only root filesystem, with no extra Linux capabilities.
MCP server — Claude Desktop / Claude Code setup
Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
"mcpServers": {
"deploy-intel": {
"command": "uvx",
"args": ["mcp-deploy-intel", "serve-mcp"],
"env": {
"DEPLOY_INTEL_LLM_PROVIDER": "anthropic/claude-sonnet-4-6"
}
}
}
}
Restart Claude Desktop. In a conversation, these tools become available:
- list_workloads_in_namespace(namespace) — namespace inventory.
- get_workload(namespace, kind, name) — single workload snapshot.
- get_metric_trend(promql, window) — TimeSeries for any PromQL query.
- get_inbound_traffic(namespace, kind, name) — upstream callers.
- get_outbound_traffic(namespace, kind, name) — downstream callees.
- record_deploy(namespace, kind, name, revision, image, ...) — manual capture.
- get_deploy_history(namespace, kind, name) — full history.
- get_rollback_history(namespace, kind, name) — rollback subset.
- generate_risk_brief(namespace, kind, name, target_image) — AI risk brief.
Example prompt: "Give me a risk brief for the checkout Deployment in the prod namespace before I promote to v2.3.1."
Claude Code
Add to the MCP servers section of your Claude Code config (~/.config/claude-code/mcp.json or equivalent):
{
"mcpServers": {
"deploy-intel": {
"command": "uvx",
"args": ["mcp-deploy-intel", "serve-mcp"]
}
}
}
Any other MCP client
The server speaks MCP over stdio. Invoke deploy-intel serve-mcp and send JSON-RPC on stdin. All 9 tools are exposed.
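As a sketch, an MCP initialize request is plain JSON-RPC 2.0 written to the server's stdin, one JSON object per line. The exact protocol version string below is an assumption; check your client's MCP revision:

```python
import json

# A minimal JSON-RPC 2.0 initialize request, as an MCP client would send it.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # assumption: an MCP spec revision
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1"},
    },
}
line = json.dumps(request)
print(line)  # this line, newline-terminated, goes to the server's stdin
```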
LLM configuration
mcp-deploy-intel is LLM-agnostic via litellm + instructor. Works with any provider, any URL, or fully offline.
Precedence (highest wins): CLI flag → environment variable → default.
| What | CLI flag | Env var |
|---|---|---|
| Provider | --llm-provider | DEPLOY_INTEL_LLM_PROVIDER |
| Base URL | --llm-base-url | DEPLOY_INTEL_LLM_BASE_URL |
| API key | --llm-api-key | DEPLOY_INTEL_LLM_API_KEY |
| Offline | --no-llm | DEPLOY_INTEL_OFFLINE=1 |
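The precedence rule amounts to a tiny resolver like this (a hypothetical helper mirroring flag → env var → default):

```python
import os

def resolve(flag_value, env_var, default=None):
    """Highest-precedence non-empty source wins: CLI flag, then env var, then default."""
    if flag_value is not None:
        return flag_value
    return os.environ.get(env_var) or default

os.environ["DEPLOY_INTEL_LLM_PROVIDER"] = "ollama/qwen2.5:7b"
print(resolve("openai/gpt-4o", "DEPLOY_INTEL_LLM_PROVIDER"))  # flag wins
print(resolve(None, "DEPLOY_INTEL_LLM_PROVIDER"))             # env var wins
```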
Anthropic (Claude)
export DEPLOY_INTEL_LLM_PROVIDER=anthropic/claude-sonnet-4-6
export DEPLOY_INTEL_LLM_API_KEY=sk-ant-...
deploy-intel risk-brief prod/Deployment/api --target-image api:v2
OpenAI
export DEPLOY_INTEL_LLM_PROVIDER=openai/gpt-4o
export DEPLOY_INTEL_LLM_API_KEY=sk-...
Local Ollama
ollama serve &
ollama pull qwen2.5:7b
export DEPLOY_INTEL_LLM_PROVIDER=ollama/qwen2.5:7b
export DEPLOY_INTEL_LLM_BASE_URL=http://localhost:11434
Fully offline (no LLM calls)
deploy-intel risk-brief ... --no-llm
# or
export DEPLOY_INTEL_OFFLINE=1
Offline mode uses deterministic rule-based scoring derived from deploy history, replica counts, and Prometheus error rates.
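In the spirit of that fallback, a deterministic scorer might escalate risk as negative signals accumulate. This is an illustrative rule set, not the project's actual thresholds:

```python
def offline_risk(rollbacks_recent: int, replicas_ready: int,
                 replicas_desired: int, error_rate_doubled: bool) -> str:
    """Hypothetical rule-based scorer: each negative signal bumps the risk level."""
    signals = 0
    if rollbacks_recent > 0:
        signals += 1      # this workload has rolled back recently
    if replicas_ready < replicas_desired:
        signals += 1      # replica set is not fully ready
    if error_rate_doubled:
        signals += 1      # Prometheus shows an error-rate regression
    return ["low", "medium", "high", "blocker"][signals]

print(offline_risk(0, 3, 3, False))  # low: no negative signals
print(offline_risk(2, 2, 3, True))   # blocker: every signal fired
```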
GitHub Action — pre-promote gate
The repo ships a composite GitHub Action. Use it as a pre-promote check in your release workflow.
# .github/workflows/pre-promote.yml
name: pre-promote
on:
workflow_dispatch:
inputs:
image:
required: true
target-ref:
required: true
jobs:
risk-brief:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Configure kubectl
run: |
mkdir -p ~/.kube
echo "${{ secrets.KUBECONFIG }}" > ~/.kube/config
chmod 600 ~/.kube/config
- uses: vellankikoti/mcp-deploy-intel@v1.0.0
with:
namespace: prod
workload: Deployment/api
target-image: ${{ github.event.inputs.image }}
fail-on: blocker
offline: "true"
- name: Upload risk brief
if: always()
uses: actions/upload-artifact@v4
with:
name: risk-brief
path: risk-brief.md
Inputs (see action.yml):
namespace, workload, target-image, kubeconfig, context, prom-url, fail-on, llm-provider, llm-api-key, offline, db-path, format, image.
Outputs:
- report-path: absolute path to the risk brief file.
- overall-risk: low / medium / high / blocker / unknown.
Helm chart — in-cluster Deployment
Install a two-container Deployment (watcher + evaluator) that records every rollout and scores outcomes continuously.
git clone https://github.com/vellankikoti/mcp-deploy-intel.git
cd mcp-deploy-intel
helm install deploy-intel ./charts/deploy-intel \
--namespace deploy-intel --create-namespace \
--set image.tag=v1.0.0
Key values (see charts/deploy-intel/values.yaml for full list):
| Key | Default | Purpose |
|---|---|---|
| image.tag | "v1.0.0" | Image tag to deploy. |
| watcher.namespaces | [] | Empty = watch all namespaces. |
| evaluator.windowSeconds | 600 | Seconds to wait before scoring a deploy. |
| evaluator.intervalSeconds | 60 | Scoring loop cadence. |
| evaluator.promUrl | "" | Prometheus URL for error-rate scoring. |
| persistence.enabled | true | Mount a PVC for the SQLite database. |
| persistence.size | 1Gi | PVC size. |
| llm.offline | true | Safe default; flip to enable LLM risk briefs. |
| llm.provider | "" | e.g. anthropic/claude-sonnet-4-6. |
| llm.secretName | "" | Name of a Secret holding the LLM API key. |
| resources | 100m/128Mi requests, 256Mi memory limit | Applied to both containers. |
| rbac.create | true | Creates cluster-scoped read-only RBAC. |
Query the in-cluster history:
# Exec into the watcher container:
kubectl exec -n deploy-intel deploy/deploy-intel -c watcher -- \
deploy-intel get-deploy-history <namespace> <kind> <name> \
--db-path /var/lib/deploy-intel/history.db
Enable LLM risk briefs in-cluster:
kubectl -n deploy-intel create secret generic llm-api \
--from-literal=api-key=sk-ant-...
helm upgrade deploy-intel ./charts/deploy-intel \
--namespace deploy-intel \
--set llm.offline=false \
--set llm.provider=anthropic/claude-sonnet-4-6 \
--set llm.secretName=llm-api
Verify the container signature (cosign)
The image is signed with keyless OIDC via Sigstore. Anyone can verify:
# Install cosign: https://docs.sigstore.dev/cosign/system_config/installation/
brew install cosign # or the equivalent
cosign verify ghcr.io/vellankikoti/mcp-deploy-intel:v1.0.0 \
--certificate-identity-regexp="https://github.com/vellankikoti/mcp-deploy-intel/.github/workflows/release.yml@refs/tags/v1.0.0" \
--certificate-oidc-issuer="https://token.actions.githubusercontent.com"
Expected: Verification for ... — The following checks were performed: ....
A CycloneDX SBOM is attached to every GitHub Release as an asset (sbom-vX.Y.Z.cdx.json).
Exit codes
| Code | Meaning |
|---|---|
| 0 | Clean: no error; query or watch completed normally. |
| 1 | Risk brief returned blocker severity (with --fail-on blocker). |
| 2 | Tool error (bad flag, unreachable API, parse failure). |
Troubleshooting
command not found: deploy-intel after pip install — the binary is installed into your venv's bin/. Activate the venv, or use uvx instead.
kubeconfig not found — pass --kubeconfig /path/to/config explicitly, or export KUBECONFIG=/path/to/config. Clouds (EKS/GKE/AKS) need their auth-helper binary on $PATH (aws-cli, gcloud, kubelogin respectively). The Docker image bundles these.
Prometheus query failed: connection refused — --prom-url points somewhere unreachable. Without --prom-url, Prometheus-requiring tools return status="skip" with a clear reason.
generate_risk_brief returns overall_risk=unknown with no LLM — pass --no-llm to get the deterministic fallback. With --no-llm, the brief uses rule-based scoring from deploy history and replica counts.
LLM error: model not found — check DEPLOY_INTEL_LLM_PROVIDER is set to a valid litellm provider string (e.g. anthropic/claude-sonnet-4-6). For Ollama, confirm the model is pulled (ollama list).
Container can't reach my local kind cluster — kind binds its API server to 127.0.0.1:<port> on your host; inside Docker, 127.0.0.1 is the container, not the host. Use the CLI directly (uvx / pip) for local kind, or point at a remote cluster.
watcher pod stuck in Pending — check PVC binding: kubectl -n deploy-intel get pvc. If your cluster has no default StorageClass, set persistence.storageClass in values or persistence.enabled=false (uses emptyDir instead).
Design & plans
- Design spec: docs/superpowers/specs/2026-04-21-mcp-deploy-intel-design.md
- Implementation plans: docs/superpowers/plans/ — seven plans, v0.1.0 walking skeleton → v1.0.0 production release.
- Workshop path (1–3 hours, hands-on): docs/workshop.md
- Changelog: CHANGELOG.md
Development
git clone https://github.com/vellankikoti/mcp-deploy-intel.git
cd mcp-deploy-intel
# Install uv (https://docs.astral.sh/uv/)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Set up env
uv venv
uv pip install -e ".[dev]"
# Run quality gates
uv run ruff check .
uv run ruff format --check .
uv run mypy src
uv run pytest -m "not integration and not golden and not mcp_contract" -v # unit (fast)
# Helm chart tests (needs helm on PATH)
uv run pytest tests/helm/ -v
# Action metadata tests
uv run pytest tests/action/ -v
# Workflow tests
uv run pytest tests/workflows/ -v
# Golden + integration tests against an ephemeral kind cluster (needs Docker)
uv run pytest -m golden -v
uv run pytest -m integration -v
# Helm lint
helm lint charts/deploy-intel
Adding a tool:
- Implement in src/deploy_intel/tools/<tool_name>.py.
- Register via @mcp.tool() in src/deploy_intel/server.py.
- Add a CLI subcommand in src/deploy_intel/cli.py.
- Add unit tests under tests/unit/test_tool_<name>.py.
- Update this README's tool table.
Built for a talk/workshop on safely giving AI agents real Kubernetes capabilities. The full story is in docs/superpowers/specs/.
File details
Details for the file mcp_deploy_intel-1.0.0.tar.gz.
File metadata
- Download URL: mcp_deploy_intel-1.0.0.tar.gz
- Size: 313.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e09a9193bfc8888861946eb8a3b00c76e327c03ca9fd7cc2a4ea128715f582f8 |
| MD5 | 82177b7467d02127104f16491562491c |
| BLAKE2b-256 | e875cd4bf802d2460cfebd047530b6059a889e1fc935edc5619c604556bdea8c |
Provenance
The following attestation bundles were made for mcp_deploy_intel-1.0.0.tar.gz:
Publisher: release.yml on vellankikoti/mcp-deploy-intel
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mcp_deploy_intel-1.0.0.tar.gz
- Subject digest: e09a9193bfc8888861946eb8a3b00c76e327c03ca9fd7cc2a4ea128715f582f8
- Sigstore transparency entry: 1349687027
- Permalink: vellankikoti/mcp-deploy-intel@f3b722dc6a26d09604c6ddfa527833ecd5a0fdb6
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/vellankikoti
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f3b722dc6a26d09604c6ddfa527833ecd5a0fdb6
- Trigger Event: push
File details
Details for the file mcp_deploy_intel-1.0.0-py3-none-any.whl.
File metadata
- Download URL: mcp_deploy_intel-1.0.0-py3-none-any.whl
- Size: 37.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e131652eaf9a0e6319ea16627db874d3877b406c8a00e2e62326429493921d27 |
| MD5 | b36d5d6637bacac082c0ef647d83047c |
| BLAKE2b-256 | da6cb19e920f9e37645fae172ab1555a5475b5fcb8adc3954fd8005be470e74e |
Provenance
The following attestation bundles were made for mcp_deploy_intel-1.0.0-py3-none-any.whl:
Publisher: release.yml on vellankikoti/mcp-deploy-intel
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mcp_deploy_intel-1.0.0-py3-none-any.whl
- Subject digest: e131652eaf9a0e6319ea16627db874d3877b406c8a00e2e62326429493921d27
- Sigstore transparency entry: 1349687148
- Permalink: vellankikoti/mcp-deploy-intel@f3b722dc6a26d09604c6ddfa527833ecd5a0fdb6
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/vellankikoti
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@f3b722dc6a26d09604c6ddfa527833ecd5a0fdb6
- Trigger Event: push