FlightDeck
Ship AI agents safely with release diffs, runtime evidence, and policy gates.
FlightDeck is local-first (CLI + SQLite + optional flightdeck serve UI): run evidence, pricing tables, and the ledger stay on disk in your environment by default—no trace or billing payload is sent to FlightDeck as a vendor. That posture matters for regulated, air-gapped, and data-sovereignty teams that cannot ship telemetry to a third-party SaaS observability backend. It is not an agent framework, prompt IDE, tracing dashboard, or gateway — it is where what shipped, what ran, what it cost, and whether promote is allowed are recorded and compared.
In ~20 seconds
- Register immutable agent releases (release.yaml + bundle checksum).
- Ingest run evidence (RunEvent JSONL or POST /v1/events).
- Diff baseline vs candidate: cost, latency, errors, and confidence (optional pricing catalog lines on top).
- Promote only when policy passes; optional human approval (request → confirm) before the ledger moves.
Example outcome
You ship a candidate whose system prompt drifts by a handful of tokens; under your tariffs the diff shows cost per run up ~31% while policy caps spend. flightdeck release promote (or the HTTP promote path) stays blocked until you change the model, relax policy with intent, or widen evidence — not because CI is slow, but because the governed ledger says no. (The ~31% story uses the two custom pricing YAMLs in examples/quickstart/; flightdeck init alone seeds a bundled snapshot so your first cost-aware diff does not start from an empty pricing ledger.)
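To make the arithmetic behind a blocked promote concrete, here is a minimal sketch. It is not FlightDeck's implementation: the function names, the `max_cost_increase_pct` cap, and the per-run dollar figures are all hypothetical and chosen only so the numbers land near the ~31% story.

```python
def pct_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate vs baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline * 100.0


def promote_allowed(baseline_cost: float, candidate_cost: float,
                    max_cost_increase_pct: float) -> bool:
    """Hypothetical gate: allow promote only while the cost increase stays within the cap."""
    return pct_change(baseline_cost, candidate_cost) <= max_cost_increase_pct


# Illustrative numbers only: $0.0100 per run (baseline) vs $0.0131 per run (candidate).
delta = pct_change(0.0100, 0.0131)
print(f"cost per run: {delta:+.1f}%")
print("promote allowed:", promote_allowed(0.0100, 0.0131, max_cost_increase_pct=10.0))
```

With a 10% cap, a 31% increase fails the check, which is the shape of decision the policy gate records before the ledger moves.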
Who should use this?
- Primary buyer / ICP: Platform or ML engineering teams (often 5–30 people) at growth-stage companies shipping two or more LLM agents to production—especially teams that already had a cost or regression incident from a prompt or model change and need a governed promote path.
- Teams that version agent builds (prompts, tools, model pins) and need a durable audit trail.
- Engineers who want one command to answer “is this candidate safe to roll forward?” with numbers, not gut feel.
- Healthcare, fintech, and enterprise operators who cannot default to sending traces or cost data to a hosted observability vendor—local-first evidence and pricing imports are the default integration model.
- Anyone who has outgrown ad hoc folder diffs or spreadsheet promote checklists.
How FlightDeck fits your stack
FlightDeck sits next to your agent runtime (not in the inference hot path): emit evidence, run flightdeck from a laptop or CI, gate promote with policy (and optional approval).
flowchart LR
subgraph runtime [Your agent runtime]
agent[Agent or service]
end
subgraph fd [FlightDeck workspace]
ingest[Ingest RunEvents]
ledger[(SQLite ledger)]
diff[release diff]
promote[promote or rollback]
end
subgraph automation [Automation]
ci[CI job or operator]
end
agent -->|"JSONL or HTTP events"| ingest
ingest --> ledger
ledger --> diff
diff --> ci
ci -->|"policy pass"| promote
Comparison at a glance
| | FlightDeck | Langfuse | Arize Phoenix / Cloud | Git / CI alone |
|---|---|---|---|---|
| Primary job | Release + promote governance for agents (ledger, diff, policy) | Tracing, sessions, evals, LLM observability | ML / model observability and monitoring | Source control and generic pipelines |
| Immutable release artifact | Yes (release.yaml + checksum) | No | No | Only if you build it |
| Evidence + cost/latency diff | Yes (runs + pricing tables / optional catalog) | Different lens (trace-level) | Different lens | DIY |
| Default data residency | On your machine (CLI / SQLite / local HTTP) | Typically SaaS-hosted | Cloud offerings | Your repo |
| Policy gate on promote | First-class | No | No | DIY |
Try the UI: run flightdeck serve, then open http://127.0.0.1:8765/ — Overview, Diff, and Actions (see docs/web-ui.md).
Why it exists
Small prompt or model changes can silently move cost, latency, and error rate. FlightDeck turns those moves into explicit promote decisions backed by ingested runs — before production pointers advance.
Current local spine: versioned release.yaml + checksums · RunEvent ingest (JSONL or arrays) · bundled default pricing on flightdeck init (plus optional pricing import) · flightdeck release diff · policy-gated release promote / rollback · full audit history.
Status
FlightDeck is local-first and ships as a Python CLI backed by SQLite.
v1.0.0 froze SemVer-stable public contracts for the documented CLI, committed schemas/v1/,
and POST /v1/events with api_version v1. v1.1.x adds catalog-aware diffs, approval flows, and forensics slices (optional pricing catalog on diffs,
promotion request/confirm, read-only runs listing, GET /v1/workspace for UI and automation, Helm/fleet examples)
without breaking those v1.0 shapes. v1.2.0 raises the Python floor to 3.11+, tightens Bearer gating for POST /v1/events and GET /v1/* when FLIGHTDECK_LOCAL_API_TOKEN is set, adds optional PostgreSQL, bundled default pricing on flightdeck init, and experimental flightdeck.integrations. See RELEASE_NOTES.md and CHANGELOG.md.
The product scope is still intentionally narrow (release governance, not a hosted agent platform).
Maintenance and sustainability: the project is Apache-2.0 with no required commercial license. If FlightDeck matters to your production stack, use SUPPORT.md for security, commercial, and sponsorship pointers, and the Sponsor affordance on github.com/flightdeckdev/flightdeck when it is enabled—signals like that answer “what happens if maintenance stops?” more credibly than roadmap prose alone.
Not implemented yet:
- hosted control plane
- automated traffic routing
- tool-cost pricing
- OpenTelemetry import/export mapping (optional uv sync --extra telemetry or pip install 'flightdeck-ai[telemetry]' for future work)
Shipped locally:
- flightdeck serve + JSON routes under /v1/* (read + diff/promote/rollback + event ingest); see Local HTTP API below
- minimal Python SDK (flightdeck.sdk.client)
- flightdeck release rollback (policy-gated, audited)
- optional promotion_requires_approval in flightdeck.yaml with POST /v1/promote/request and POST /v1/promote/confirm
Local HTTP API
With flightdeck serve (default bind 127.0.0.1), the app exposes:
- GET routes: GET /health, GET /v1/workspace (read-only workspace flags for scripts and the bundled UI), GET /v1/metrics, GET /v1/releases, GET /v1/promoted, GET /v1/actions, GET /v1/promotion-requests, and GET /v1/runs.
- POST routes: POST /v1/events, POST /v1/diff, POST /v1/promote, POST /v1/promote/request, POST /v1/promote/confirm, and POST /v1/rollback.

Authentication: by default, POST /v1/promote, POST /v1/promote/request, POST /v1/promote/confirm, POST /v1/rollback, and POST /v1/events accept requests only from loopback clients. When FLIGHTDECK_LOCAL_API_TOKEN is set, callers of those routes must send Authorization: Bearer <token>, and the same Bearer header is required for the GET /v1/* read APIs (the bundled UI supplies it via VITE_FLIGHTDECK_LOCAL_API_TOKEN). POST /v1/diff stays unauthenticated. See docs/http-api.md and SECURITY.md.
Quickstart
Install uv, then from the repo root:
uv sync --extra dev
uv run flightdeck --help
Or with pip and a venv:
python -m venv .venv
source .venv/bin/activate  # on Windows: .venv\Scripts\activate
python -m pip install -e ".[dev]"
flightdeck --help
Run the cross-platform quickstart smoke (same as CI):
uv run flightdeck-quickstart-verify
(or python -m flightdeck.quickstart_smoke / python scripts/quickstart_smoke.py inside an activated venv)
Or use the bash wrapper (Git Bash / WSL on Windows):
./scripts/smoke.sh
Bundled pricing (default init): flightdeck init migrates the ledger, imports OpenAI, Anthropic, and Google (Gemini-class) tables at pricing_version flightdeck-bundled-2026-05, and writes .flightdeck/pricing-catalog.yaml with pricing_catalog_path set in flightdeck.yaml. In release.yaml, set spec.pricing_reference to { provider: openai | anthropic | google, pricing_version: flightdeck-bundled-2026-05 } to get per-table and catalog cost lines on diffs without authoring YAML. These rates are a convenience snapshot, not live vendor billing—flightdeck pricing import your own files for production. Use flightdeck init --no-bundled-pricing for an empty ledger.
Or walk through the full quickstart (policy + two custom tariffs for the ~31% narrative—same flow CI runs):
flightdeck init # omit --no-bundled-pricing; bundled tables are additive with the imports below
flightdeck pricing import examples/quickstart/pricing-baseline.yaml
flightdeck pricing import examples/quickstart/pricing-candidate.yaml
flightdeck policy set examples/quickstart/policy.yaml
BASELINE=$(flightdeck release register examples/quickstart/baseline-release)
CANDIDATE=$(flightdeck release register examples/quickstart/candidate-release)
sed "s/__BASELINE_RELEASE_ID__/${BASELINE}/g" examples/quickstart/baseline-events.jsonl > baseline-events.jsonl
sed "s/__CANDIDATE_RELEASE_ID__/${CANDIDATE}/g" examples/quickstart/candidate-events.jsonl > candidate-events.jsonl
flightdeck runs ingest baseline-events.jsonl
flightdeck runs ingest candidate-events.jsonl
flightdeck release diff "$BASELINE" "$CANDIDATE" --window 7d
flightdeck release promote "$BASELINE" --env local --window 7d --reason "initial baseline"
flightdeck release history --agent agent_support --env local
The static event files in examples/quickstart use placeholder release IDs so the repo can ship stable examples.
Substitute them before ingestion, or run uv run flightdeck-quickstart-verify / python -m flightdeck.quickstart_smoke (venv) or ./scripts/smoke.sh from Git Bash/WSL on Windows.
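Where sed is not available (for example plain PowerShell on Windows), the placeholder substitution can be sketched in Python. This helper is not part of the FlightDeck CLI: `substitute_release_id` is a hypothetical name, and the placeholders are the ones used in the sed commands above.

```python
from pathlib import Path


def substitute_release_id(src: Path, dst: Path, placeholder: str, release_id: str) -> int:
    """Copy src to dst, replacing every occurrence of placeholder.

    Returns how many occurrences were replaced, so callers can detect a file
    that contained no placeholder at all.
    """
    text = src.read_text(encoding="utf-8")
    count = text.count(placeholder)
    dst.write_text(text.replace(placeholder, release_id), encoding="utf-8")
    return count


# Example usage (paths from the quickstart; release_id comes from
# `flightdeck release register`):
#
# substitute_release_id(
#     Path("examples/quickstart/baseline-events.jsonl"),
#     Path("baseline-events.jsonl"),
#     "__BASELINE_RELEASE_ID__",
#     release_id,
# )
```

Returning the replacement count is a small design choice: a result of 0 usually means the wrong file or placeholder was passed, which is easier to catch before ingestion than after.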
Examples: examples/quickstart/ · examples/ci/ (policy gate + Actions) · examples/deploy/ (serve via Docker/Compose) · examples/integration/ (HTTP event emitter) · examples/integration/adoption/ (framework hooks).
Documentation
- CLI reference: all commands, flags, arguments, and exit codes
- HTTP API reference: all /v1/* routes, request/response shapes, auth, RunEvent field reference
- Python SDK: FlightdeckClient / AsyncFlightdeckClient usage guide
- Runtime integrations (experimental): optional flightdeck.integrations mappers (LangChain, OpenAI Agents, Temporal, etc.)
- Operations and policy: diff, promote, rollback internals; policy model and confidence tiers
- Release artifacts and pricing: release.yaml format, bundle layout, checksum algorithm, workspace config, pricing tables
- Pricing catalog: optional pricing_catalog_path, catalog vs imported tables, troubleshooting
- JSON Schemas
- Release notes (maintainer)
- Roadmap
- Versioning
- Development
- Contributing
- Security
- Support and sustainability
- CLAUDE.md and AGENTS.md
Development
uv sync --frozen --extra dev
uv run python -m ruff check src tests
uv run python -m pytest
uv run flightdeck-quickstart-verify
uv run flightdeck --help
If you change web/ or Pydantic models, also run the static/ and schemas/ drift checks from DEVELOPMENT.md (same gates as .github/workflows/ci.yml). AGENTS.md and .cursor/rules/flightdeck-ci-artifacts.mdc summarize them for humans and Cursor.
See DEVELOPMENT.md for uv and pip setup, verification, troubleshooting, and PyPI releases (tag-driven; not on merge to main).
License
FlightDeck is licensed under the Apache License, Version 2.0 — see LICENSE and NOTICE.
The canonical public repository: https://github.com/flightdeckdev/flightdeck.