Intelligent infrastructure monitoring agent for EKS/K8s
Octantis
Intelligent infrastructure monitoring agent for Kubernetes, Docker, and AWS. Receives metrics and logs via OTLP, uses an LLM to autonomously investigate and classify incidents, and notifies Slack/Discord with a concrete remediation plan.
Table of Contents
- How it works
- Container Image
- Running Octantis
- Configuration
- MCP Servers
- Severity Levels
- Contributing
- Documentation
How it works
OTel Collector ──OTLP──► Octantis ──MCP──► Grafana / K8s / Docker / AWS
                            │
                            ├── LLM (Anthropic / OpenRouter / Bedrock)
                            │
                            └──► Slack / Discord (remediation plan)
- Receive — OTLP metrics/logs from OpenTelemetry Collector (gRPC :4317, HTTP :4318)
- Filter — Drop health checks, benign patterns, and deduplicate via fingerprint cooldown
- Detect — Auto-detect source platform (K8s, Docker, AWS) from OTLP resource attributes
- Investigate — LLM autonomously queries Prometheus (PromQL), Loki (LogQL), and platform tools via MCP
- Analyze — Classify severity (CRITICAL / MODERATE / LOW / NOT_A_PROBLEM) with confidence score
- Plan — Generate actionable remediation steps
- Notify — Send to Slack and/or Discord (only if severity >= threshold)
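The Filter step's "fingerprint cooldown" can be pictured with a minimal sketch. This is an illustration only, not Octantis's actual implementation: the class name `FingerprintCooldown`, the fingerprint inputs (`resource`, `signal`), and the 300-second default are all assumptions made for the example.

```python
import hashlib
import time
from collections.abc import Callable


class FingerprintCooldown:
    """Suppress repeat events that share a fingerprint within a cooldown window."""

    def __init__(
        self,
        cooldown_seconds: float = 300.0,  # hypothetical default, not from the project
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._last_seen: dict[str, float] = {}

    @staticmethod
    def fingerprint(resource: str, signal: str) -> str:
        # Hypothetical fingerprint: a hash of the resource identity and signal name.
        return hashlib.sha256(f"{resource}|{signal}".encode()).hexdigest()

    def should_process(self, resource: str, signal: str) -> bool:
        """Return True on first sight of a fingerprint, or once its cooldown has expired."""
        key = self.fingerprint(resource, signal)
        now = self._clock()
        last = self._last_seen.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # duplicate inside the cooldown window: drop it
        self._last_seen[key] = now
        return True
```

Injecting the clock keeps the sketch testable without sleeping. A real filter would also need to evict old fingerprints so the dictionary does not grow without bound.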
Container Image
ghcr.io/vinny1892/octantis:latest
Published automatically by CI on every push to master. Pin to a specific commit SHA for production (e.g., ghcr.io/vinny1892/octantis:dba131d).
Running Octantis
Local Kind Cluster (quickstart)
The fastest way to try Octantis is a local Kind cluster. The dev/ directory contains scripts that create a Kind cluster with a full observability stack (Prometheus, Grafana, Mimir, OTel Collector, MetalLB, MCP servers, and Octantis itself), which is everything needed to run end-to-end locally.
# Prerequisites: Docker, Kind, kubectl, Helm
# 1. Configure secrets
export OPENROUTER_API_KEY="sk-or-..."
export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/..."
export DISCORD_WEBHOOK_URL="https://discord.com/api/webhooks/..."
# 2. Create the cluster
bash dev/setup.sh
See dev/README.md for full details (architecture, secrets, troubleshooting).
Existing Kubernetes Cluster
For deploying Octantis to a real cluster (EKS, GKE, AKS, etc.), use the example manifests:
# 1. Create secrets
kubectl create secret generic octantis-secrets \
  --namespace monitoring \
  --from-literal=ANTHROPIC_API_KEY=sk-ant-... \
  --from-literal=GRAFANA_MCP_API_KEY=glsa_...
# 2. Deploy MCP servers + Octantis
kubectl apply -f examples/kubernetes/
See examples/kubernetes/ for the manifests. Customize the image, model, and notification settings in the ConfigMap.
Image: ghcr.io/vinny1892/octantis:latest
From Source
To run Octantis outside a cluster (requires Python 3.12+ and uv):
uv sync
cp .env.example .env # edit with your keys
uv run octantis
See Onboarding — Local Development for full setup details.
Configuration
All settings are configured via environment variables. See .env.example for the full list.
Key settings:
| Variable | Default | Description |
|---|---|---|
| `LLM_PROVIDER` | `anthropic` | `anthropic`, `openrouter`, or `bedrock` |
| `LLM_MODEL` | `claude-sonnet-4-6` | Model ID (e.g., `anthropic/claude-sonnet-4-6` for OpenRouter, `global.anthropic.claude-opus-4-6-v1` for Bedrock) |
| `GRAFANA_MCP_URL` | — | Grafana MCP SSE endpoint (observability slot) |
| `K8S_MCP_URL` | — | Kubernetes MCP SSE endpoint (platform slot) |
| `DOCKER_MCP_URL` | — | Docker MCP SSE endpoint (platform slot) |
| `AWS_MCP_URL` | — | AWS MCP SSE endpoint (platform slot) |
| `OCTANTIS_PLATFORM` | (auto) | Force platform: `k8s`, `docker`, or `aws` |
| `MIN_SEVERITY_TO_NOTIFY` | `MODERATE` | Minimum severity to send alerts |
| `LANGUAGE` | `en` | Output language (`en`, `pt-br`) |
| `SLACK_WEBHOOK_URL` | — | Slack notifications (empty = disabled) |
| `DISCORD_WEBHOOK_URL` | — | Discord notifications (empty = disabled) |
MCP Servers
Octantis connects to MCP servers over SSE using a slot model, with at most one observability server and one platform server:
| Server | Slot | Image | Purpose |
|---|---|---|---|
| Grafana MCP | observability | `ghcr.io/vinny1892/mcp-grafana:latest` | PromQL, LogQL, dashboard queries |
| Kubernetes MCP | platform | `ghcr.io/containers/kubernetes-mcp-server:latest` | Pod status, events, deployments, logs |
| Docker MCP | platform | (community/custom) | Container inspection, logs, resource stats |
| AWS MCP | platform | (community/custom) | EC2 status, CloudWatch metrics, ECS tasks |
Platform is auto-detected from OTLP resource attributes (K8s → Docker → AWS). Override with OCTANTIS_PLATFORM.
Severity Levels
| Level | Meaning | Action |
|---|---|---|
| `CRITICAL` | Service down / data loss risk | Notify + Action Plan |
| `MODERATE` | Degraded / trending bad | Notify + Action Plan |
| `LOW` | Minor anomaly | Log only |
| `NOT_A_PROBLEM` | Expected / false positive | Log only |
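The interaction between these levels and `MIN_SEVERITY_TO_NOTIFY` can be sketched as an ordered enum with a threshold comparison. The ordering `NOT_A_PROBLEM < LOW < MODERATE < CRITICAL` is inferred from the table and the default threshold of `MODERATE`. The `Severity` enum and `should_notify` function are illustrative names, not the project's actual code.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels in ascending order (ordering inferred from the table)."""

    NOT_A_PROBLEM = 0
    LOW = 1
    MODERATE = 2
    CRITICAL = 3


def should_notify(severity: Severity, threshold: Severity = Severity.MODERATE) -> bool:
    """Return True when the incident meets or exceeds the notification threshold."""
    return severity >= threshold


def parse_severity(name: str) -> Severity:
    """Convert a configured name such as 'MODERATE' into a Severity member."""
    return Severity[name.strip().upper()]
```

With the default threshold, `should_notify(Severity.LOW)` is false, which matches the "Log only" rows. Lowering the threshold with `parse_severity("low")` would let `LOW` incidents through as well.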
Contributing
See CONTRIBUTING.md.
Documentation
- Architecture Overview — data flow and design decisions
- Filter Pipeline — event ingestion and pre-filtering
- LangGraph Agent — investigation, analysis, planning, and notification
- Onboarding — setup guide and code map
- Licensing — dual-license model (AGPL-3.0 core, Apache-2.0 SDK), plan tiers, and AGPL FAQ