Project description

⚡ fastapi-alertengine

Production-ready FastAPI monitoring in one line.

No Prometheus. No Grafana. No dashboards required — but one is included.


  • 🔥 164/164 tests passing
  • 🏦 Derived from financial-grade infrastructure (AnchorFlow / Tofamba)
  • 🤖 AI-agent friendly (works with Claude / Copilot / Cursor)
  • Memory mode: runs without Redis at all


🚀 Quickstart (one line)

pip install fastapi-alertengine

from fastapi import FastAPI
from fastapi_alertengine import instrument

app = FastAPI()
instrument(app)   # set ALERTENGINE_REDIS_URL or run without Redis in memory mode

That’s it. Four endpoints are now live:

| Endpoint | Description |
| --- | --- |
| GET /health/alerts | Current SLO status: ok / warning / critical |
| POST /alerts/evaluate | Evaluate thresholds and enqueue alerts for Slack delivery |
| GET /metrics/history | Aggregated per-minute metrics from Redis |
| GET /metrics/ingestion | Ingestion counters (enqueued / dropped) |
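As an illustration of consuming the first endpoint, here is a small, hypothetical triage helper (the function and its logic are my own, not part of the package; the payload shape follows the example output later in this README):

```python
def triage(payload: dict) -> str:
    """Summarize a GET /health/alerts response (shape per this README's example)."""
    alerts = payload.get("alerts", [])
    if not alerts:
        return f"{payload.get('status', 'unknown')}: no active alerts"
    # Surface the most severe alert first.
    rank = {"warning": 1, "critical": 2}
    worst = max(alerts, key=lambda a: rank.get(a.get("severity"), 0))
    return f"{payload['status']}: {worst['message']}"

resp = {"status": "warning", "alerts": [
    {"type": "latency_spike", "severity": "warning", "message": "P95 high"}]}
print(triage(resp))  # warning: P95 high
```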

📊 Observability Dashboard

A full Streamlit dashboard is included in dashboard/. Point it at any running instance.

pip install -r dashboard/requirements.txt
ALERTENGINE_BASE_URL=http://localhost:8000 streamlit run dashboard/app.py

What you get:

  • System status card, P95 latency, error rate, req/min, health score (0–100)
  • Time-series charts: requests/min, error rate %, latency (avg + max)
  • Endpoint performance table sorted by impact score (requests × avg latency)
  • Live alert panel with threshold reference
  • Ingestion health: enqueued/dropped counters, queue pressure indicator
  • Auto-refresh every 10 seconds
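The impact-score ordering above is easy to reproduce; a sketch with illustrative field names (the dashboard's real schema may differ):

```python
def rank_by_impact(stats: list[dict]) -> list[dict]:
    # impact score = request count x average latency, highest first
    return sorted(stats, key=lambda s: s["requests"] * s["avg_latency_ms"], reverse=True)

endpoints = [
    {"path": "/pay",     "requests": 120,  "avg_latency_ms": 400.0},  # score 48000
    {"path": "/health",  "requests": 5000, "avg_latency_ms": 2.0},    # score 10000
    {"path": "/webhook", "requests": 300,  "avg_latency_ms": 250.0},  # score 75000
]
print([e["path"] for e in rank_by_impact(endpoints)])  # ['/webhook', '/pay', '/health']
```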

⚡ How it works

Request → middleware (enqueue only, microsecond-level overhead) → response returned immediately
              ↓
        deque (in-memory)
              ↓  every 50ms
        drain() → Redis Stream (batched pipeline)
              ↓
        evaluate() → GET /health/alerts → Dashboard
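The pipeline above can be approximated in a few lines; this is a simplified, synchronous sketch (the real engine runs the drain as an async task every 50ms and writes each batch to a Redis Stream via a pipeline):

```python
from collections import deque

buffer = deque(maxlen=10_000)  # backpressure: oldest events drop on overflow

def enqueue(event: dict) -> None:
    # Middleware hot path: O(1) append, no I/O, the response never blocks.
    buffer.append(event)

def drain(batch_size: int = 100) -> list[list[dict]]:
    # The real engine runs this periodically; here we drain everything at once.
    batches = []
    while buffer:
        batch = [buffer.popleft() for _ in range(min(batch_size, len(buffer)))]
        batches.append(batch)
    return batches

for i in range(250):
    enqueue({"path": "/demo", "latency_ms": i})
print([len(b) for b in drain()])  # [100, 100, 50]
```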

📊 Example output

{
  "status": "warning",
  "service_name": "payments-api",
  "instance_id": "pod-3",
  "metrics": {
    "overall_p95_ms": 1843.2,
    "webhook_p95_ms": 1910.4,
    "api_p95_ms":     720.1,
    "error_rate":     0.012,
    "anomaly_score":  0.84,
    "sample_size":    187
  },
  "alerts": [
    {
      "type": "latency_spike",
      "severity": "warning",
      "message": "P95 latency (1843ms) exceeds threshold (1000ms)"
    }
  ],
  "timestamp": 1712756301
}
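The status field follows from the configured thresholds; a sketch of the evaluation rule using the documented defaults (the package's actual logic may also factor in anomaly scoring):

```python
def evaluate_status(p95_ms: float, error_rate_pct: float,
                    p95_warn: float = 1000, p95_crit: float = 3000,
                    err_warn: float = 2.0, err_crit: float = 5.0) -> str:
    # Critical outranks warning; defaults match the configuration table.
    if p95_ms >= p95_crit or error_rate_pct >= err_crit:
        return "critical"
    if p95_ms >= p95_warn or error_rate_pct >= err_warn:
        return "warning"
    return "ok"

print(evaluate_status(1843.2, 1.2))  # warning
print(evaluate_status(320.0, 0.4))   # ok
print(evaluate_status(3200.0, 0.4))  # critical
```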

⚙️ Configuration

All settings via environment variables (prefix: ALERTENGINE_):

| Env Var | Default | Description |
| --- | --- | --- |
| ALERTENGINE_REDIS_URL | redis://localhost:6379/0 | Redis connection URL |
| ALERTENGINE_SERVICE_NAME | default | Service name attached to every metric |
| ALERTENGINE_INSTANCE_ID | default | Instance ID attached to every metric |
| ALERTENGINE_P95_WARNING_MS | 1000 | P95 warning threshold (ms) |
| ALERTENGINE_P95_CRITICAL_MS | 3000 | P95 critical threshold (ms) |
| ALERTENGINE_ERROR_RATE_WARNING_PCT | 2.0 | Error-rate warning threshold (%) |
| ALERTENGINE_ERROR_RATE_CRITICAL_PCT | 5.0 | Error-rate critical threshold (%) |
| ALERTENGINE_SLACK_WEBHOOK_URL | None | Slack webhook URL for alert delivery |
| ALERTENGINE_SLACK_RATE_LIMIT_SECONDS | 10 | Minimum seconds between Slack messages |
| ALERTENGINE_STREAM_MAXLEN | 10000 | Maximum Redis Stream entries |
🧩 Manual wiring (full control)

from fastapi_alertengine import AlertEngine, AlertConfig

config = AlertConfig(service_name="payments-api", p95_critical_ms=500)
engine = AlertEngine(config)
engine.start(app)   # wires middleware, drain task, and all endpoints

✅ Production readiness

  • ✔️ 164/164 tests passing (no live Redis required)
  • ✔️ Memory mode — runs without Redis, metrics buffered in-process
  • ✔️ Fail-safe — Redis down never crashes requests or the drain loop
  • ✔️ Backpressure — queue capped at 10,000 events, drops oldest on overflow
  • ✔️ Batched writes — 100 events per Redis pipeline, 50ms drain interval
  • ✔️ Graceful shutdown — flushes all in-memory aggregates on app stop
  • ✔️ Slack delivery — rate-limited, non-blocking, survives HTTP errors
  • ✔️ Streamlit dashboard — dark-themed, auto-refresh, zero extra backend config
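The Slack rate limiting described above amounts to enforcing a minimum interval between sends; a minimal sketch with an injectable clock for testability (class name and API are illustrative, not the package's):

```python
import time

class SlackRateLimiter:
    """At most one message per min_interval seconds (default mirrors
    ALERTENGINE_SLACK_RATE_LIMIT_SECONDS = 10)."""

    def __init__(self, min_interval: float = 10.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self._last_sent = float("-inf")

    def allow(self) -> bool:
        now = self.clock()
        if now - self._last_sent >= self.min_interval:
            self._last_sent = now
            return True
        return False  # suppressed: too soon since the last message

t = [0.0]
limiter = SlackRateLimiter(min_interval=10.0, clock=lambda: t[0])
print(limiter.allow())  # True  (first message goes out)
print(limiter.allow())  # False (suppressed)
t[0] = 10.0
print(limiter.allow())  # True  (interval elapsed)
```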

🚀 What’s coming (v1.5)

  • Per-endpoint breakdown in evaluate()
  • alertengine-server Docker image (multi-service ingest)
  • PagerDuty / OpsGenie routing

💬 Support

📧 anchorflow@outlook.com

🐙 github.com/Tandem-Media/fastapi-alertengine


🛡️ License

MIT
