Ship AI agents safely with release diffs, runtime evidence, and policy gates.

FlightDeck

FlightDeck is local-first (CLI + SQLite + optional flightdeck serve UI). It is not an agent framework, prompt IDE, tracing dashboard, or gateway. It is the place that records and compares what shipped, what ran, what it cost, and whether a promote is allowed.

In ~20 seconds

  1. Register immutable agent releases (release.yaml + bundle checksum).
  2. Ingest run evidence (RunEvent JSONL or POST /v1/events).
  3. Diff baseline vs candidate: cost, latency, errors, and confidence (optional pricing catalog lines on top).
  4. Promote only when policy passes; optional human approval (request → confirm) before the ledger moves.
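
Step 1 relies on a bundle checksum to make a release immutable. The exact algorithm FlightDeck uses is not described here, so the following is only a minimal sketch of the idea, assuming a SHA-256 digest over sorted relative paths and file contents. bundle_checksum is a hypothetical helper, not part of the FlightDeck API.

```python
import hashlib
from pathlib import Path


def bundle_checksum(bundle_dir: str) -> str:
    """Hypothetical helper: digest every file in a bundle in a stable order."""
    root = Path(bundle_dir)
    digest = hashlib.sha256()
    # Sort by POSIX-style relative path so the result does not depend on
    # filesystem iteration order or the OS path separator.
    files = sorted(
        (p for p in root.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in files:
        rel = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so name/content boundaries are unambiguous.
        digest.update(len(rel).to_bytes(8, "big") + rel)
        digest.update(len(data).to_bytes(8, "big") + data)
    return digest.hexdigest()
```

Two bundles with identical files produce the same digest, and a one-character change to any prompt or config file produces a different one, which is the property that lets a ledger treat a registered release as frozen.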

Example outcome

You ship a candidate whose system prompt has drifted by a handful of tokens. Under your imported pricing, the diff shows cost per run up ~31%, which exceeds the spend cap in your policy. flightdeck release promote (or the HTTP promote path) stays blocked until you change the model, deliberately relax the policy, or ingest more evidence. The block comes from the governed ledger, not from slow CI.
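
The arithmetic behind that outcome can be sketched in a few lines. This is an illustration only: the function names, the per-run costs, and the 10% cap are assumptions made up for the example, not FlightDeck's policy schema or diff implementation.

```python
def cost_delta_pct(baseline_cost_per_run: float, candidate_cost_per_run: float) -> float:
    """Relative change in cost per run, as a percentage of the baseline."""
    if baseline_cost_per_run <= 0:
        raise ValueError("baseline cost per run must be positive")
    return (candidate_cost_per_run - baseline_cost_per_run) / baseline_cost_per_run * 100.0


def promote_allowed(delta_pct: float, max_increase_pct: float) -> bool:
    """Hypothetical gate: allow promote only if the increase stays within the cap."""
    return delta_pct <= max_increase_pct


# Assumed numbers: $0.0200 per run on the baseline, $0.0262 on the candidate.
delta = cost_delta_pct(0.0200, 0.0262)
print(round(delta, 1))                                # 31.0
print(promote_allowed(delta, max_increase_pct=10.0))  # False
```

With a cap of 10%, a 31% increase fails the gate regardless of how the candidate performed in CI.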

Who should use this?

  • Teams that version agent builds (prompts, tools, model pins) and need a durable audit trail.
  • Engineers who want one command to answer “is this candidate safe to roll forward?” with numbers, not gut feel.
  • Anyone who has outgrown ad hoc folder diffs or spreadsheet promote checklists.

How FlightDeck fits your stack

FlightDeck sits next to your agent runtime (not in the inference hot path): emit evidence, run flightdeck from a laptop or CI, gate promote with policy (and optional approval).

flowchart LR
  subgraph runtime [Your agent runtime]
    agent[Agent or service]
  end
  subgraph fd [FlightDeck workspace]
    ingest[Ingest RunEvents]
    ledger[(SQLite ledger)]
    diff[release diff]
    promote[promote or rollback]
  end
  subgraph automation [Automation]
    ci[CI job or operator]
  end
  agent -->|"JSONL or HTTP events"| ingest
  ingest --> ledger
  ledger --> diff
  diff --> ci
  ci -->|"policy pass"| promote

Comparison at a glance

| | FlightDeck | Langfuse | Arize Phoenix / Cloud | Git / CI alone |
|---|---|---|---|---|
| Primary job | Release + promote governance for agents (ledger, diff, policy) | Tracing, sessions, evals, LLM observability | ML / model observability and monitoring | Source control and generic pipelines |
| Immutable release artifact | Yes (release.yaml + checksum) | No | No | Only if you build it |
| Evidence + cost/latency diff | Yes (runs + pricing tables / optional catalog) | Different lens (trace-level) | Different lens | DIY |
| Policy gate on promote | First-class | No | No | DIY |

Try the UI: run flightdeck serve, then open http://127.0.0.1:8765/ — Overview, Diff, and Actions (see docs/web-ui.md).

Why it exists

Small prompt or model changes can silently move cost, latency, and error rate. FlightDeck turns those moves into explicit promote decisions backed by ingested runs — before production pointers advance.

Current local spine: versioned release.yaml + checksums · RunEvent ingest (JSONL or arrays) · immutable pricing imports · flightdeck release diff · policy-gated release promote / rollback · full audit history.

Status

FlightDeck is local-first and ships as a Python CLI backed by SQLite.

v1.0.0 froze SemVer-stable public contracts for the documented CLI, committed schemas/v1/, and POST /v1/events with api_version v1. v1.1.x adds Phase 1 slices (optional pricing catalog on diffs, promotion request/confirm, read-only runs listing, GET /v1/workspace for UI and automation, Helm/fleet examples) without breaking those v1.0 shapes. See RELEASE_NOTES.md and CHANGELOG.md. The product scope is still intentionally narrow (release governance, not a hosted agent platform).

Not implemented yet:

  • hosted control plane
  • automated traffic routing
  • tool-cost pricing
  • OpenTelemetry import/export mapping (the optional telemetry extra, installed with uv sync --extra telemetry or pip install 'flightdeck-ai[telemetry]', is reserved for future work)

Shipped locally:

  • flightdeck serve + JSON routes under /v1/* (read + diff/promote/rollback + event ingest); see Local HTTP API below
  • minimal Python SDK (flightdeck.sdk.client)
  • flightdeck release rollback (policy-gated, audited)
  • optional promotion_requires_approval in flightdeck.yaml with POST /v1/promote/request and POST /v1/promote/confirm

Local HTTP API

With flightdeck serve (default bind 127.0.0.1), the app exposes:

  • GET routes: GET /health, GET /v1/workspace (read-only workspace flags for scripts and the bundled UI), GET /v1/metrics, GET /v1/releases, GET /v1/promoted, GET /v1/actions, GET /v1/promotion-requests, GET /v1/runs
  • POST routes: POST /v1/events, POST /v1/diff, POST /v1/promote, POST /v1/promote/request, POST /v1/promote/confirm, POST /v1/rollback

POST /v1/promote, POST /v1/promote/request, POST /v1/promote/confirm, and POST /v1/rollback accept requests only from loopback clients unless FLIGHTDECK_LOCAL_API_TOKEN is set, in which case callers must send Authorization: Bearer <token> (the web/ dev UI does the same via VITE_FLIGHTDECK_LOCAL_API_TOKEN). See docs/http-api.md and SECURITY.md.
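
A caller that honors the token rule might build its requests roughly as follows. This sketch uses only the Python standard library. It assumes the server listens on port 8765 (the address given for the UI), and the JSON field names in the payload are placeholders: the real request bodies are defined in docs/http-api.md. build_post is a hypothetical helper, not the bundled flightdeck.sdk.client.

```python
import json
import os
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:8765"  # assumption: same address as the UI example


def build_post(path: str, payload: dict, token: Optional[str] = None) -> urllib.request.Request:
    """Build a JSON POST request, adding a Bearer header only when a token is given."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Hypothetical payload: these field names are placeholders, not the documented schema.
request = build_post(
    "/v1/promote",
    {"release_id": "<release-id>", "env": "local"},
    token=os.environ.get("FLIGHTDECK_LOCAL_API_TOKEN"),
)

# Sending is left commented out so the sketch runs without a server:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Reading the token from the environment keeps it out of scripts and shell history, and leaving the header off when no token is set matches the loopback-only default.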

Quickstart

Install uv, then from the repo root:

uv sync --extra dev
uv run flightdeck --help

Or with pip and a venv:

python -m venv .venv
source .venv/bin/activate  # on Windows: .venv\Scripts\activate
python -m pip install -e ".[dev]"
flightdeck --help

Run the cross-platform quickstart smoke (same as CI):

uv run flightdeck-quickstart-verify

(or python -m flightdeck.quickstart_smoke / python scripts/quickstart_smoke.py inside an activated venv)

Or use the bash wrapper (Git Bash / WSL on Windows):

./scripts/smoke.sh

Or walk through the core commands:

flightdeck init
flightdeck pricing import examples/quickstart/pricing-baseline.yaml
flightdeck pricing import examples/quickstart/pricing-candidate.yaml
flightdeck policy set examples/quickstart/policy.yaml

BASELINE=$(flightdeck release register examples/quickstart/baseline-release)
CANDIDATE=$(flightdeck release register examples/quickstart/candidate-release)

sed "s/__BASELINE_RELEASE_ID__/${BASELINE}/g" examples/quickstart/baseline-events.jsonl > baseline-events.jsonl
sed "s/__CANDIDATE_RELEASE_ID__/${CANDIDATE}/g" examples/quickstart/candidate-events.jsonl > candidate-events.jsonl

flightdeck runs ingest baseline-events.jsonl
flightdeck runs ingest candidate-events.jsonl

flightdeck release diff "$BASELINE" "$CANDIDATE" --window 7d
flightdeck release promote "$BASELINE" --env local --window 7d --reason "initial baseline"
flightdeck release history --agent agent_support --env local

The static event files in examples/quickstart use placeholder release IDs so the repo can ship stable examples. Substitute real IDs before ingestion (as the sed commands above do), or let a smoke runner handle it: uv run flightdeck-quickstart-verify, python -m flightdeck.quickstart_smoke inside an activated venv, or ./scripts/smoke.sh from Git Bash/WSL on Windows.
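
Where sed is not available, the same substitution can be done in Python. This is a minimal sketch: substitute_placeholder is a hypothetical helper written for this example, and only the placeholder strings and file paths come from the quickstart.

```python
from pathlib import Path


def substitute_placeholder(src: str, dst: str, placeholder: str, release_id: str) -> int:
    """Copy src to dst, replacing every placeholder; return the replacement count."""
    text = Path(src).read_text(encoding="utf-8")
    count = text.count(placeholder)
    Path(dst).write_text(text.replace(placeholder, release_id), encoding="utf-8")
    return count


# Example call, assuming baseline_id holds the ID printed by
# `flightdeck release register examples/quickstart/baseline-release`:
# substitute_placeholder(
#     "examples/quickstart/baseline-events.jsonl",
#     "baseline-events.jsonl",
#     "__BASELINE_RELEASE_ID__",
#     baseline_id,
# )
```

A return value of 0 is a useful warning sign: it means the source file contained no placeholder, so the ingested events would not be tied to the release you just registered.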

Examples: examples/quickstart/ · examples/ci/ (policy gate + Actions) · examples/deploy/ (serve via Docker/Compose) · examples/integration/ (HTTP event emitter).

Development

uv sync --frozen --extra dev
uv run python -m ruff check src tests
uv run python -m pytest
uv run flightdeck-quickstart-verify
uv run flightdeck --help

If you change web/ or Pydantic models, also run the static/ and schemas/ drift checks from DEVELOPMENT.md (same gates as .github/workflows/ci.yml). AGENTS.md and .cursor/rules/flightdeck-ci-artifacts.mdc summarize them for humans and Cursor.

See DEVELOPMENT.md for uv and pip setup, verification, troubleshooting, and PyPI releases (tag-driven; not on merge to main).

License

FlightDeck is licensed under the Apache License, Version 2.0 — see LICENSE and NOTICE.

The canonical public repository is https://github.com/flightdeckdev/flightdeck.
