Bastion

Runtime context for coding agents — structured error observability built for the agent era

Python 3.9+ · License: MIT


The Problem

When a coding agent writes code that breaks, the default feedback loop looks like this: the agent runs the code, something raises, the agent reads a wall of unstructured stdout or a raw traceback, guesses at the cause, and asks you to add more logging; you re-run and paste the output back. That loop is slow and lossy because each iteration discards context. Traditional error handling was designed for a human sitting at a terminal: one reader, interactive, patient. A coding agent is none of those things. It needs structured, queryable runtime context, not formatted text.

What Bastion Does

Bastion gives your coding agent structured access to what is actually happening at runtime. Four instrumentation primitives — guard(), checkpoint(), expect(), and breadcrumb() — replace traditional error handling with agent-readable structured records persisted to a local SQLite store. An MCP server exposes that store as ten queryable tools the agent calls on demand. No stdout parsing, no copy-pasting, no re-running with extra logs added.

Who It Is For

Bastion earns its place in:

  • Complex multi-step workflows with many moving parts where you need to reconstruct what happened between steps
  • Multi-agent orchestration layers with API calls, retries, and rate limits — anywhere the execution path is non-trivial
  • Long-running processes and background jobs that fail silently and leave no trail
  • Existing codebases where an agent needs runtime history to debug effectively, not just static analysis

It is not the right fit for:

  • Simple one-off scripts where reading the single traceback is sufficient
  • Greenfield code that works on the first run and has no meaningful failure modes yet

Installation

pip install bastion-agent

# httpx is used in the Quick Start example below — not a Bastion dependency
pip install httpx

Quick Start

import bastion
import httpx

bastion.init()


@bastion.guard(context=["agent_id", "endpoint", "retry_count"])
def call_api(agent_id: str, endpoint: str, retry_count: int = 0) -> dict:
    bastion.breadcrumb(
        f"agent {agent_id} calling {endpoint}",
        severity="info",
        tags=["api", agent_id],
    )

    response = httpx.get(endpoint, timeout=10)

    bastion.expect(
        response.status_code != 429,
        "Rate limit hit",
        context={"agent_id": agent_id, "endpoint": endpoint, "retry_count": retry_count},
    )

    bastion.checkpoint("api_flow", "call_succeeded", {
        "agent_id": agent_id,
        "endpoint": endpoint,
        "status_code": response.status_code,
        "retry_count": retry_count,
    })

    return response.json()

There are two distinct failure paths here. If httpx.get() raises a network exception, execution leaves call_api() immediately: guard() catches it, records the exception type, message, source location, and the three named locals (agent_id, endpoint, retry_count), then re-raises. expect() and checkpoint() are never reached on that path. If the request completes but returns a 429, execution reaches expect(), which persists the failed assertion before raising AssertionError. guard() then catches that too, so you get both records. The breadcrumb() call runs before the request, so it is recorded on either path. Your agent can query all of it without re-running anything.
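To make the ordering of records on each path concrete, here is a small simulation that uses only the standard library. It does not import Bastion: fake_guard, fake_expect, and the records list are hypothetical stand-ins that mimic the behavior described above, and the network failure is simulated with a flag instead of a real HTTP call.

```python
import functools

records = []  # stand-in for the persisted store


def fake_guard(func):
    """Stand-in for guard(): record the exception, then re-raise."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            records.append(("error", type(exc).__name__, str(exc)))
            raise
    return wrapper


def fake_expect(condition, message):
    """Stand-in for expect(): record the outcome first, raise afterwards."""
    records.append(("expectation", "passed" if condition else "failed", message))
    if not condition:
        raise AssertionError(message)


@fake_guard
def call_api(status_code=None, network_down=False):
    records.append(("breadcrumb", "info", "calling endpoint"))
    if network_down:
        raise ConnectionError("connection refused")  # simulated network failure
    fake_expect(status_code != 429, "Rate limit hit")
    records.append(("checkpoint", "api_flow", "call_succeeded"))
    return {"ok": True}


def record_kinds(**kwargs):
    """Run call_api once and return the kinds of records it left behind."""
    records.clear()
    try:
        call_api(**kwargs)
    except Exception:
        pass
    return [kind for kind, _, _ in records]


network_path = record_kinds(network_down=True)
rate_limit_path = record_kinds(status_code=429)
success_path = record_kinds(status_code=200)
```

On the network path only the breadcrumb and the error are recorded; on the 429 path the failed expectation is written before the error record that the guard adds.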


MCP Server Setup

Starting the server

# via module
python -m bastion

# or via the console script installed with pip
bastion-mcp

Claude Code configuration

Run:

claude mcp add-json bastion '{"type":"stdio","command":"bastion-mcp","args":[]}'

Verify:

claude mcp get bastion

Cursor configuration

Add to your Cursor MCP settings (~/.cursor/mcp.json):

{
  "mcpServers": {
    "bastion": {
      "command": "bastion-mcp",
      "args": []
    }
  }
}

Available MCP tools

| Tool | Description | Key parameters |
| --- | --- | --- |
| get_summary | High-level snapshot: error count, most frequent error, last checkpoint, last breadcrumb, failed expectation count | |
| get_recent_errors | Most recently seen errors, newest first (summary fields only) | limit (default 10) |
| get_error_detail | Full error record including captured locals and hint | error_id |
| get_errors_by_fingerprint | Error record for a given SHA256 fingerprint | fingerprint |
| get_checkpoints | Checkpoint records ordered oldest-first for execution tracing | flow, run_id, limit |
| get_flows | All distinct flow names recorded in checkpoints | |
| get_failed_expectations | Failed expectations only, newest first | limit (default 10) |
| get_expectations | All expectations (passed and failed), newest first | limit (default 20) |
| get_breadcrumbs | Event markers ordered oldest-first for chronological tracing | severity, tags (comma-separated), limit |
| clear_all | Delete all records from every table; use between test runs | |

Start a debugging session with get_summary to orient, then drill into the relevant table.
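The tool list refers to a SHA256 fingerprint for errors. The README does not say which fields Bastion hashes, so the following is only a sketch of the general idea: hashing a few stable attributes of an error gives repeated occurrences the same identifier. The function name error_fingerprint and the choice of fields (exception type, file, function) are assumptions made for illustration.

```python
import hashlib


def error_fingerprint(exc_type: str, filename: str, function: str) -> str:
    """Hash stable error attributes into a hex digest (assumed field choice)."""
    key = "\n".join((exc_type, filename, function))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


first = error_fingerprint("ValueError", "app/submit.py", "process_submission")
repeat = error_fingerprint("ValueError", "app/submit.py", "process_submission")
other = error_fingerprint("KeyError", "app/submit.py", "process_submission")
```

Two occurrences of the same error map to the same digest, while a different exception type at the same location produces a different one.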


The Four Primitives

guard()

Wraps a function in structured exception capture. When the wrapped function raises, guard() records the exception type, message, source location, and named locals (or all locals if context is omitted), then re-raises so your control flow is unaffected.

@bastion.guard(context=["user_id", "payload"])
def process_submission(user_id: str, payload: dict) -> dict:
    if not payload.get("items"):
        raise ValueError("submission has no items")
    return submit(payload)  # submit() is defined elsewhere in your application

Replaces: a bare try/except that either swallows the error or prints an unstructured traceback.
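For readers curious how a decorator can capture named locals at the point of failure, here is a minimal standard-library sketch. It is not Bastion's implementation: guard_sketch and the captured list are hypothetical, and the sketch reads locals from the wrapped function's own frame via the exception traceback.

```python
import functools

captured = []  # stand-in for the persisted error records


def guard_sketch(context=None):
    """Record exception details plus selected locals, then re-raise."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                # The first traceback entry is this wrapper; the next one is
                # the wrapped function's frame (None if the call itself failed).
                tb = exc.__traceback__.tb_next
                frame_locals = dict(tb.tb_frame.f_locals) if tb else {}
                if context is not None:
                    frame_locals = {k: v for k, v in frame_locals.items() if k in context}
                captured.append({
                    "type": type(exc).__name__,
                    "message": str(exc),
                    "function": func.__name__,
                    "file": func.__code__.co_filename,
                    "line": tb.tb_lineno if tb else None,
                    "locals": {k: repr(v) for k, v in frame_locals.items()},
                })
                raise  # control flow is unaffected
        return wrapper
    return decorator


@guard_sketch(context=["user_id", "payload"])
def process_submission(user_id, payload):
    item_count = len(payload.get("items", []))
    if item_count == 0:
        raise ValueError("submission has no items")
    return item_count


try:
    process_submission("u-1", {"items": []})
except ValueError:
    pass

record = captured[0]
```

Because context names only user_id and payload, the item_count local is left out of the record. Omitting context in this sketch keeps every local in the frame.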


checkpoint()

Records a named step within a named logical flow. Use it to mark the boundary between stages in a multi-step process so an agent can reconstruct what completed before a failure.

bastion.checkpoint("document_pipeline", "chunking_complete", {
    "doc_id": doc.id,
    "chunk_count": len(chunks),
    "run_id": run_id,
})

Replaces: print("step 3 done") statements that vanish from context and can't be queried.
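The sketch below illustrates why named steps inside a named flow are useful for reconstruction. It keeps records in a plain list instead of Bastion's store; checkpoint_sketch, steps_for, the step names, and the convention of filtering on a run_id key inside the data dictionary are all assumptions for illustration.

```python
from typing import Any, Dict, List, Optional

checkpoints: List[Dict[str, Any]] = []  # stand-in for the persisted store


def checkpoint_sketch(flow: str, step: str, data: Dict[str, Any]) -> None:
    """Append one step record for a flow, in the order it happened."""
    checkpoints.append({"flow": flow, "step": step, "data": dict(data)})


def steps_for(flow: str, run_id: Optional[str] = None) -> List[str]:
    """Return recorded step names for a flow, oldest first."""
    return [
        c["step"]
        for c in checkpoints
        if c["flow"] == flow
        and (run_id is None or c["data"].get("run_id") == run_id)
    ]


# Hypothetical stages of the pipeline, in the order they should complete.
expected_steps = ["parsing_complete", "chunking_complete", "embedding_complete"]

checkpoint_sketch("document_pipeline", "parsing_complete", {"doc_id": "d1", "run_id": "run-a"})
checkpoint_sketch("document_pipeline", "chunking_complete", {"doc_id": "d1", "chunk_count": 12, "run_id": "run-a"})
checkpoint_sketch("document_pipeline", "parsing_complete", {"doc_id": "d2", "run_id": "run-b"})

done_a = steps_for("document_pipeline", run_id="run-a")
missing_a = [step for step in expected_steps if step not in done_a]
```

Comparing the recorded steps against the expected sequence shows that run-a stopped after chunking, which is the kind of question an agent would otherwise answer by reading print output.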


expect()

Asserts a condition and persists the result regardless of outcome. Failed expectations raise AssertionError after writing the record, so agents can query "what invariants broke during this run" without grepping logs.

bastion.expect(
    len(search_results) > 0,
    "search must return at least one result",
    context={"query": query, "index": index_name},
)

Replaces: bare assert statements that raise but leave no queryable record behind.
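The key ordering is that the record is written before the exception is raised. A minimal sketch of that idea, using an in-memory SQLite database from the standard library, might look like this. The table layout and the name expect_sketch are hypothetical and are not Bastion's actual schema or API.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expectations ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "passed INTEGER NOT NULL, "
    "message TEXT NOT NULL, "
    "context TEXT NOT NULL)"
)


def expect_sketch(condition, message, context=None):
    """Persist the outcome first, then raise if the condition failed."""
    passed = bool(condition)
    conn.execute(
        "INSERT INTO expectations (passed, message, context) VALUES (?, ?, ?)",
        (int(passed), message, json.dumps(context or {})),
    )
    conn.commit()
    if not passed:
        raise AssertionError(message)


search_results = []  # simulate an empty result set

expect_sketch(True, "index must exist", {"index": "docs"})
try:
    expect_sketch(
        len(search_results) > 0,
        "search must return at least one result",
        {"query": "bastion", "index": "docs"},
    )
except AssertionError:
    pass

failed = [
    row[0]
    for row in conn.execute(
        "SELECT message FROM expectations WHERE passed = 0 ORDER BY id DESC"
    )
]
total = conn.execute("SELECT COUNT(*) FROM expectations").fetchone()[0]
```

Both the passing and the failing expectation are stored, so a later query can separate the invariants that broke from the ones that held.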


breadcrumb()

Records an ambient event marker with no frame capture or condition checking. Use it for high-frequency events where you want chronological tracing without the overhead of exception handling.

bastion.breadcrumb(
    f"rate limiter sleeping {backoff:.1f}s",
    severity="warning",
    tags=["rate_limit", agent_id],
)

Replaces: logger.info() calls that produce unstructured output an agent has to parse.
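As an illustration of why structured markers are easier to consume than log lines, the sketch below stores breadcrumbs as typed records and filters them by severity and by a comma-separated tag string. Breadcrumb, breadcrumb_sketch, and query_breadcrumbs are hypothetical names, and matching any one of the listed tags is an assumption; the README does not state whether Bastion matches any or all tags.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Breadcrumb:
    message: str
    severity: str = "info"
    tags: List[str] = field(default_factory=list)


trail: List[Breadcrumb] = []  # stand-in for the persisted store, oldest first


def breadcrumb_sketch(message: str, severity: str = "info",
                      tags: Optional[List[str]] = None) -> None:
    """Append an event marker; no frame capture, no condition checking."""
    trail.append(Breadcrumb(message, severity, list(tags or [])))


def query_breadcrumbs(severity: Optional[str] = None,
                      tags: Optional[str] = None,
                      limit: int = 50) -> List[Breadcrumb]:
    """Filter by severity and by any of the comma-separated tags."""
    wanted = {t.strip() for t in tags.split(",") if t.strip()} if tags else set()
    matches = [
        b for b in trail
        if (severity is None or b.severity == severity)
        and (not wanted or wanted.intersection(b.tags))
    ]
    return matches[:limit]


breadcrumb_sketch("agent a1 calling /search", tags=["api", "a1"])
breadcrumb_sketch("rate limiter sleeping 2.0s", severity="warning", tags=["rate_limit", "a1"])
breadcrumb_sketch("agent a2 calling /search", tags=["api", "a2"])

warning_events = query_breadcrumbs(severity="warning")
a1_events = query_breadcrumbs(tags="a1, rate_limit")
```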


How It Works

The library captures structured records at the point of instrumentation and persists them to a local SQLite database at ~/.bastion/bastion.db. The MCP server reads from that same database and exposes the records as typed tool responses. When an agent needs to understand a failure, it calls the appropriate MCP tool — get_recent_errors, get_checkpoints, or get_failed_expectations — and receives a structured list it can reason about directly. There is no daemon process, no network call, and no background sync. Everything runs locally and the database is a single file you can inspect with any SQLite browser.
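The single-file design means a writer and a reader only need to agree on a path. The following sketch shows that pattern with two independent sqlite3 connections to a temporary file. The errors table and its columns are hypothetical; they are not Bastion's real schema, and the real database lives at ~/.bastion/bastion.db rather than in a temporary directory.

```python
import os
import sqlite3
import tempfile

# A throwaway file standing in for the shared database.
db_path = os.path.join(tempfile.mkdtemp(), "bastion_sketch.db")

# Writer side: the instrumented program persists records and exits.
writer = sqlite3.connect(db_path)
writer.execute(
    "CREATE TABLE errors ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, type TEXT, message TEXT)"
)
writer.execute(
    "INSERT INTO errors (type, message) VALUES (?, ?)",
    ("ValueError", "submission has no items"),
)
writer.execute(
    "INSERT INTO errors (type, message) VALUES (?, ?)",
    ("TimeoutError", "request timed out"),
)
writer.commit()
writer.close()

# Reader side: a separate connection, as a query server would open later.
reader = sqlite3.connect(db_path)
reader.row_factory = sqlite3.Row
recent = [
    dict(row)
    for row in reader.execute(
        "SELECT id, type, message FROM errors ORDER BY id DESC LIMIT 10"
    )
]
reader.close()
```

The reader sees the newest record first without any process running between the write and the read, which is the property the paragraph above describes.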


Roadmap

| Version | Milestone | Status |
| --- | --- | --- |
| v0.1.0 | Package skeleton: typed stubs, correct public API, importable | |
| v0.2.0 | SQLite persistence, guard(), checkpoint(), expect(), breadcrumb() | |
| v0.4.0 | MCP server with 10 query tools, bastion-mcp entry point | |
| v1.0.0 | Node.js port, full documentation site, MCP registry listing | planned |
| v2.0.0 | Team mode with Turso DB, per-table clear tools, opt-out variable capture | planned |

Contributing

Bastion is early-stage and the API is not frozen. If you hit a rough edge or have a strong opinion about how the instrumentation primitives should behave, opening an issue is the most useful thing you can do. Pull requests are welcome for bug fixes, documentation improvements, and new MCP tools. Feedback on the API design — naming, signatures, what gets persisted — is especially valuable right now, before v1.0.0 locks things in.


License

MIT
