Intelligent infrastructure monitoring agent for EKS/K8s

Octantis

CI Build mcp-grafana GHCR Python 3.12+ License: AGPL-3.0 SDK License: Apache-2.0

Intelligent infrastructure monitoring agent for Kubernetes, Docker, and AWS. Receives metrics and logs via OTLP, uses an LLM to autonomously investigate and classify incidents, and notifies Slack/Discord with a concrete remediation plan.

How it works

OTel Collector ──OTLP──► Octantis ──MCP──► Grafana / K8s / Docker / AWS
                              │
                              ├── LLM (Anthropic / OpenRouter / Bedrock)
                              │
                              └──► Slack / Discord (remediation plan)
  1. Receive — OTLP metrics/logs from OpenTelemetry Collector (gRPC :4317, HTTP :4318)
  2. Filter — Drop health checks, benign patterns, and deduplicate via fingerprint cooldown
  3. Detect — Auto-detect source platform (K8s, Docker, AWS) from OTLP resource attributes
  4. Investigate — LLM autonomously queries Prometheus (PromQL), Loki (LogQL), and platform tools via MCP
  5. Analyze — Classify severity (CRITICAL / MODERATE / LOW / NOT_A_PROBLEM) with confidence score
  6. Plan — Generate actionable remediation steps
  7. Notify — Send to Slack and/or Discord (only if the classified severity is at or above MIN_SEVERITY_TO_NOTIFY)
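
The fingerprint cooldown in step 2 can be illustrated with a minimal sketch. This is not Octantis's actual implementation: the class name `CooldownDeduplicator`, the fields that go into the fingerprint, and the 300-second default are all assumptions made for illustration.

```python
import hashlib
import time
from collections.abc import Callable, Mapping


class CooldownDeduplicator:
    """Suppress repeats of the same event fingerprint inside a cooldown window.

    Hypothetical helper: names and defaults are illustrative, not taken from Octantis.
    """

    def __init__(
        self,
        cooldown_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._last_seen: dict[str, float] = {}

    @staticmethod
    def fingerprint(resource: Mapping[str, str], signal_name: str) -> str:
        # Hash a sorted view of the identifying fields so that attribute
        # ordering does not change the fingerprint.
        parts = [signal_name] + [f"{k}={v}" for k, v in sorted(resource.items())]
        return hashlib.sha256("|".join(parts).encode()).hexdigest()

    def should_process(self, resource: Mapping[str, str], signal_name: str) -> bool:
        key = self.fingerprint(resource, signal_name)
        now = self._clock()
        last = self._last_seen.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # same fingerprint seen recently: drop as a duplicate
        self._last_seen[key] = now
        return True
```

Injecting the clock keeps the cooldown logic testable without sleeping. A long-running process would also need to evict old fingerprints, which this sketch omits.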

Container Image

ghcr.io/vinny1892/octantis:latest

Published automatically by CI on every push to master. Pin to a specific commit SHA for production (e.g., ghcr.io/vinny1892/octantis:dba131d).

Running Octantis

Local Kind Cluster (quickstart)

The fastest way to try Octantis. The dev/ directory contains scripts that create a Kind cluster with a full observability stack (Prometheus, Grafana, Mimir, OTel Collector, MetalLB, MCP servers, and Octantis itself) — everything needed to run end-to-end locally.

# Prerequisites: Docker, Kind, kubectl, Helm

# 1. Configure secrets
export OPENROUTER_API_KEY="sk-or-..."
export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/..."
export DISCORD_WEBHOOK_URL="https://discord.com/api/webhooks/..."

# 2. Create the cluster
bash dev/setup.sh

See dev/README.md for full details (architecture, secrets, troubleshooting).

Existing Kubernetes Cluster

For deploying Octantis to a real cluster (EKS, GKE, AKS, etc.), use the example manifests:

# 1. Create secrets
kubectl create secret generic octantis-secrets \
  --namespace monitoring \
  --from-literal=ANTHROPIC_API_KEY=sk-ant-... \
  --from-literal=GRAFANA_MCP_API_KEY=glsa_...

# 2. Deploy MCP servers + Octantis
kubectl apply -f examples/kubernetes/

See examples/kubernetes/ for the manifests. Customize the image, model, and notification settings in the ConfigMap.

Image: ghcr.io/vinny1892/octantis:latest

From Source

To run Octantis outside a cluster (requires Python 3.12+ and uv):

uv sync
cp .env.example .env   # edit with your keys
uv run octantis

See Onboarding — Local Development for full setup details.

Configuration

All settings are read from environment variables. See .env.example for the full list.

Key settings:

| Variable | Default | Description |
|---|---|---|
| LLM_PROVIDER | anthropic | anthropic, openrouter, or bedrock |
| LLM_MODEL | claude-sonnet-4-6 | Model ID (e.g., anthropic/claude-sonnet-4-6 for OpenRouter, global.anthropic.claude-opus-4-6-v1 for Bedrock) |
| GRAFANA_MCP_URL | | Grafana MCP SSE endpoint (observability slot) |
| K8S_MCP_URL | | Kubernetes MCP SSE endpoint (platform slot) |
| DOCKER_MCP_URL | | Docker MCP SSE endpoint (platform slot) |
| AWS_MCP_URL | | AWS MCP SSE endpoint (platform slot) |
| OCTANTIS_PLATFORM | (auto) | Force platform: k8s, docker, or aws |
| MIN_SEVERITY_TO_NOTIFY | MODERATE | Minimum severity to send alerts |
| LANGUAGE | en | Output language (en, pt-br) |
| SLACK_WEBHOOK_URL | | Slack notifications (empty = disabled) |
| DISCORD_WEBHOOK_URL | | Discord notifications (empty = disabled) |

MCP Servers

Octantis connects to MCP servers via SSE using a slot model: at most one observability server and one platform server are active at a time.

| Server | Slot | Image | Purpose |
|---|---|---|---|
| Grafana MCP | observability | ghcr.io/vinny1892/mcp-grafana:latest | PromQL, LogQL, dashboard queries |
| Kubernetes MCP | platform | ghcr.io/containers/kubernetes-mcp-server:latest | Pod status, events, deployments, logs |
| Docker MCP | platform | (community/custom) | Container inspection, logs, resource stats |
| AWS MCP | platform | (community/custom) | EC2 status, CloudWatch metrics, ECS tasks |

Platform is auto-detected from OTLP resource attributes (K8s → Docker → AWS). Override with OCTANTIS_PLATFORM.

Severity Levels

| Level | Meaning | Action |
|---|---|---|
| CRITICAL | Service down / data loss risk | Notify + Action Plan |
| MODERATE | Degraded / trending bad | Notify + Action Plan |
| LOW | Minor anomaly | Log only |
| NOT_A_PROBLEM | Expected / false positive | Log only |
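
The notification threshold can be modelled as an ordered enum, as in the sketch below. The numeric values and the `should_notify` helper are assumptions made for illustration. Only the level names, their relative order, and the MODERATE default for MIN_SEVERITY_TO_NOTIFY come from this README.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Numeric values are illustrative; only the ordering matters.
    NOT_A_PROBLEM = 0
    LOW = 1
    MODERATE = 2
    CRITICAL = 3


def should_notify(severity: Severity, threshold: Severity = Severity.MODERATE) -> bool:
    """Notify only when the classified severity is at or above the threshold."""
    return severity >= threshold


# Parse a threshold from its configured name, e.g. MIN_SEVERITY_TO_NOTIFY=LOW.
threshold = Severity["LOW"]
notify_low = should_notify(Severity.LOW, threshold)
```

With the default threshold, CRITICAL and MODERATE incidents are sent to Slack or Discord, while LOW and NOT_A_PROBLEM are only logged, matching the table above.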

Contributing

See CONTRIBUTING.md.

