Mirage

Mirage is CI for agent side effects.

Screenshot: Mirage review console over a risky procurement run trace.

It sits between an agent and external APIs, intercepts outbound HTTP actions, evaluates them against policy, returns safe mocked responses, and writes deterministic traces for tests and CI.

Why Mirage Exists

Agents do not just generate text. They create tickets, submit bids, call billing APIs, and mutate real systems through HTTP.

A bad retry, hallucinated route, or out-of-policy payload can create duplicate charges, leak data, or ship a regression that only shows up after merge. Mirage sits between your agent and its APIs, checks outbound calls against declarative policy, returns safe mocked responses, and turns the run into deterministic traces you can fail in tests and CI before bad actions ship.

Positioning

Mirage is strongest today as a testing and CI layer for outbound agent HTTP actions.

It is not trying to be a generic runtime guard for production traffic. The clearest wedge right now is: catch risky agent actions before they merge.

How Mirage Is Different

Many adjacent tools focus on runtime protection: intercept a live tool call, score or inspect it in the loop, and decide whether to allow it.

Mirage is different:

  • it is built for pre-merge testing and CI, not live production arbitration
  • it uses declarative policy plus mocked responses, not a model-in-the-loop safety judge as the primary control
  • it optimizes for deterministic traces, reproducible failures, and build gates that engineers can trust in tests and CI

That is the product wedge: runtime guards try to protect live traffic; Mirage tries to stop risky action regressions before they ship.

Mirage vs. Adjacent Tools

Mirage overlaps in surface area with several well-loved tools. The wedge is different in each case:

  • pytest-httpx / respx: per-test httpx mocking with response stubs. Great for unit-testing a single function's HTTP behavior. Mirage is run-scoped, not per-test: one MirageSession spans an entire agent run, enforces declarative policies (not just response stubs), and writes a trace you can gate CI on via assert_clean() or mirage gate-run.
  • pytest-httpserver: a real local HTTP server you can assert against in tests. Mirage also intercepts outbound HTTP, but evaluates each call against a shared policies.yaml and emits a four-outcome taxonomy (allowed / policy_violation / unmatched_route / config_error) with response headers the agent's own assertions can read.
  • VCR.py: record-and-replay cassettes of real HTTP interactions. Excellent for regression-locking an existing integration. Mirage does not record; it enforces policy on synthetic mocks so a brand-new risky action (an agent hallucinating a route, exceeding a bid limit) is caught on its first appearance, not only after a cassette exists.
  • responses: requests-era monkeypatch library. Mirage is httpx-native and session-oriented; if your agent stack is on httpx, Mirage slots in without a transport rewrite.
  • WireMock / mitmproxy: general-purpose mock servers and intercepting proxies. Mirage is narrower and opinionated: declarative policy + mocks + deterministic trace + assert_clean(), tuned for LLM-agent side-effect review rather than generic HTTP stubbing.
  • Runtime LLM-judge guards (NeMo Guardrails, Llama Guard, policy agents): arbitrate live tool calls in production with a model in the loop. Mirage is pre-merge and deterministic — rules, not judgments — so CI can fail the build before a risky action ever ships.

When not to use Mirage today

  • Your agent isn't Python, or doesn't cross an HTTP boundary you control (httpx is the cleanest integration path today).
  • The side effects you care about are not HTTP (direct DB writes, filesystem mutation, subprocess calls).
  • You need live production arbitration of tool calls — that's the runtime-guard wedge, not Mirage's.
  • You have no pytest or CI step that can run the agent; Mirage's value is in failing a build, so without one there's nothing to gate.

Status: v0.1.0 is the first public alpha. The strongest supported path today is Python integrations through MirageSession, run-level CLI gates, and the bundled procurement harness. The review UI is real and useful, but still an alpha console surface rather than a finished product shell.

See It In 60 Seconds

For the fastest end-to-end demonstration, start the proxy with the first command, then run the other two in a second terminal:

make proxy-procurement
make procurement-demo-risky
python -m mirage.cli summarize-run --run-id procurement-risky-demo

You should see Mirage allow the supplier lookup, flag the bid submission as a policy_violation, and write a trace under artifacts/traces/.

Mirage run: procurement-risky-demo
Summary: 2 action(s), 1 safe, 1 risky
Risky actions:
- [policy_violation] POST /v1/submit_bid ...

Start Here

What Mirage Does Today

Mirage currently gives a Python-first developer workflow for:

  • config-driven HTTP mocks
  • config-driven policy checks
  • deterministic run-scoped traces
  • clear request outcomes for debugging and CI
  • an action review console over trace artifacts
  • a Python-first integration path, with httpx as the cleanest path today
  • local, test, and container-friendly execution

Mirage currently reports one of four outcomes for every intercepted request:

  • allowed
  • policy_violation
  • unmatched_route
  • config_error
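
The four outcomes above can be modeled as a small enum. The sketch below is illustrative only: the precedence in `classify` (config first, then routing, then policy) and the rule that only `allowed` counts as safe are assumptions, not Mirage's documented behavior, and `classify` and `summarize` are hypothetical names.

```python
from enum import Enum
from typing import List


class Outcome(str, Enum):
    ALLOWED = "allowed"
    POLICY_VIOLATION = "policy_violation"
    UNMATCHED_ROUTE = "unmatched_route"
    CONFIG_ERROR = "config_error"


def classify(config_ok: bool, route_matched: bool, policies_passed: bool) -> Outcome:
    """Hypothetical precedence: config problems, then routing, then policy."""
    if not config_ok:
        return Outcome.CONFIG_ERROR
    if not route_matched:
        return Outcome.UNMATCHED_ROUTE
    if not policies_passed:
        return Outcome.POLICY_VIOLATION
    return Outcome.ALLOWED


def summarize(outcomes: List[Outcome]) -> str:
    # Assumption: only "allowed" is safe; every other outcome is risky.
    safe = sum(1 for outcome in outcomes if outcome is Outcome.ALLOWED)
    risky = len(outcomes) - safe
    return f"Summary: {len(outcomes)} action(s), {safe} safe, {risky} risky"


if __name__ == "__main__":
    run = [classify(True, True, True), classify(True, True, False)]
    print(summarize(run))
```

Because `Outcome` subclasses `str`, its members compare equal to the plain strings listed above, which keeps assertions against header or trace values simple.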

Quickstart

Requires Python 3.11+.

pip install mirage-ci

The package installs as mirage-ci on PyPI and imports as mirage:

from mirage import MirageSession

It also exposes a mirage console script. If that script is not on your PATH, use python -m mirage.cli ... directly.

For a development checkout (editable install from source), see Contributing.

Integrate your own agent

The canonical Mirage integration is MirageSession. It gives you one run ID, an httpx client surface the agent uses directly, and one assertion point for CI.

from mirage import MirageSession

with MirageSession(run_id="demo-run") as mirage:
    response = mirage.post(
        "/v1/submit_bid",
        json={"contract_id": "STANDARD-7", "bid_amount": 7500},
    )
    summary = mirage.assert_clean()
    print(summary.trace_path)

For the full 30-minute walkthrough of pointing Mirage at your own agent, see docs/FIRST_INTEGRATION.md. For CI gating recipes (pytest and GitHub Actions), see docs/CI_INTEGRATION.md.

Try the bundled procurement harness

If you want to see Mirage working on a realistic pre-built workflow before integrating your own agent:

make proxy-procurement

In a second terminal:

make procurement-demo-safe
make test-procurement

Run with Docker:

docker compose up --build

That Docker path starts the Mirage proxy with the procurement harness config on http://localhost:8000.

MirageSession

MirageSession is the recommended path for:

  • local developer runs
  • pytest integration tests
  • CI gates on risky actions

For agent code that already expects a client-like object:

from examples.procurement_harness.agent import ProcurementAgent
from mirage import MirageSession

with MirageSession(run_id="procurement-safe") as mirage:
    agent = ProcurementAgent(mirage)
    result = agent.run_compliant_bid_workflow()
    summary = mirage.assert_clean()
    print(result.action.mirage.outcome)
    print(summary.to_text())

Alternative: per-response primitives

If you want per-response access instead of a run-level session, the lower-level httpx primitives remain available:

from mirage.httpx_client import (
    assert_mirage_response_safe,
    create_mirage_client,
    mirage_response_report,
)

with create_mirage_client(run_id="demo-run") as client:
    response = client.post(
        "/v1/submit_bid",
        json={"contract_id": "STANDARD-7", "bid_amount": 7500},
    )
    report = mirage_response_report(response)
    assert_mirage_response_safe(response)
    print(report.trace_path)

Mirage adds response metadata headers so tests and agents can inspect what happened without changing the mocked response body:

  • X-Mirage-Run-Id
  • X-Mirage-Outcome
  • X-Mirage-Policy-Passed
  • X-Mirage-Trace-Path
  • X-Mirage-Matched-Mock
  • X-Mirage-Message
  • X-Mirage-Decision-Summary
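
If you would rather read these headers yourself than use the helpers above, a minimal sketch follows. It works on any mapping of header names to values. The `MirageHeaderReport` dataclass and `read_mirage_headers` function are hypothetical names, and the assumption that `X-Mirage-Policy-Passed` is serialized as the text `true` or `false` is not confirmed by this README.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MirageHeaderReport:
    run_id: Optional[str]
    outcome: Optional[str]
    policy_passed: Optional[bool]
    trace_path: Optional[str]
    matched_mock: Optional[str]
    message: Optional[str]
    decision_summary: Optional[str]


def read_mirage_headers(headers: Mapping[str, str]) -> MirageHeaderReport:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw_passed = lowered.get("x-mirage-policy-passed")
    # Assumption: the flag arrives as the text "true" or "false".
    passed = None if raw_passed is None else raw_passed.strip().lower() == "true"
    return MirageHeaderReport(
        run_id=lowered.get("x-mirage-run-id"),
        outcome=lowered.get("x-mirage-outcome"),
        policy_passed=passed,
        trace_path=lowered.get("x-mirage-trace-path"),
        matched_mock=lowered.get("x-mirage-matched-mock"),
        message=lowered.get("x-mirage-message"),
        decision_summary=lowered.get("x-mirage-decision-summary"),
    )


if __name__ == "__main__":
    sample = {"X-Mirage-Run-Id": "demo-run", "X-Mirage-Outcome": "allowed"}
    print(read_mirage_headers(sample))
```

Headers that are absent come back as `None`, so a test can distinguish "Mirage did not handle this response" from "Mirage reported a failure".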

CI Gating

Mirage provides a run-level CLI for CI and shell workflows:

make mirage-summary RUN_ID=procurement-risky-demo
make mirage-gate RUN_ID=procurement-risky-demo

Equivalent direct commands:

python -m mirage.cli summarize-run --run-id procurement-risky-demo
python -m mirage.cli gate-run --run-id procurement-risky-demo
python -m mirage.cli validate-config

gate-run exits non-zero when the run is risky or missing, so it can fail CI directly. validate-config exits non-zero when Mirage config is missing or malformed, so you can fail fast before starting the proxy.
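
Since gate-run signals failure through its exit code, a Python wrapper only needs to check for zero. The sketch below builds the same `python -m mirage.cli gate-run --run-id ...` command shown above. `gate_run` and its injectable `runner` parameter are hypothetical helpers added for illustration and testability, not part of Mirage.

```python
import subprocess
import sys
from typing import Callable, Optional, Sequence


def _run(argv: Sequence[str]) -> int:
    # check=False: we inspect the exit code ourselves instead of raising.
    return subprocess.run(list(argv), check=False).returncode


def gate_run(run_id: str, runner: Optional[Callable[[Sequence[str]], int]] = None) -> bool:
    """Return True when the gate command exits with code 0."""
    command = [sys.executable, "-m", "mirage.cli", "gate-run", "--run-id", run_id]
    exit_code = (runner or _run)(command)
    return exit_code == 0


if __name__ == "__main__":
    # Requires mirage-ci to be installed in the current interpreter.
    ok = gate_run("procurement-risky-demo")
    sys.exit(0 if ok else 1)
```

Passing a fake `runner` lets you unit-test the surrounding CI script without starting the proxy.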

For complete GitHub Actions and pytest recipes, see docs/CI_INTEGRATION.md.

If Your Agent Does Not Already Use httpx

Mirage does not require your whole stack to be built directly on httpx. It only needs the outbound action path to cross a client boundary you control.

  • If your SDK or framework lets you inject a base URL, transport, or HTTP client, point that boundary at Mirage.
  • If your orchestration layer hides HTTP completely, wrap the side-effecting calls in your own gateway and test that gateway with Mirage.
  • If you only need a starting point, intercept writes first: bids, orders, ticket creation, CRM updates, or billing actions.

See docs/INTEGRATION_PATTERNS.md for the concrete patterns.
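
As a rough illustration of the gateway pattern described above, the sketch below routes one side-effecting call through a single class whose base URL and transport are injected. Everything here is hypothetical: `BidGateway`, `resolve_base_url`, and the `Transport` callable are illustrative names, and treating `MIRAGE_PROXY_URL` as the proxy's base URL is an assumption about that variable.

```python
import os
from typing import Any, Callable, Dict, Mapping

# A transport takes (method, url, json_body) and returns the decoded response.
Transport = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def resolve_base_url(env: Mapping[str, str], production_url: str) -> str:
    # Assumption: MIRAGE_PROXY_URL, when set, holds the Mirage proxy base URL.
    return env.get("MIRAGE_PROXY_URL") or production_url


class BidGateway:
    """Hypothetical gateway: the one place where bid side effects leave the process."""

    def __init__(self, base_url: str, transport: Transport) -> None:
        self._base_url = base_url.rstrip("/")
        self._transport = transport

    def submit_bid(self, contract_id: str, bid_amount: int) -> Dict[str, Any]:
        url = f"{self._base_url}/v1/submit_bid"
        body = {"contract_id": contract_id, "bid_amount": bid_amount}
        return self._transport("POST", url, body)


if __name__ == "__main__":
    def echo_transport(method: str, url: str, body: Dict[str, Any]) -> Dict[str, Any]:
        # Stand-in transport that performs no network I/O.
        return {"method": method, "url": url, "body": body}

    base = resolve_base_url(os.environ, "https://api.example.com")
    print(BidGateway(base, echo_transport).submit_bid("STANDARD-7", 7500))
```

In tests the environment points the gateway at Mirage; in production the same code path uses the real base URL, so the agent itself never needs to know which one it is talking to.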

Config

The primary onboarding config now lives in:

When you run Mirage from a repo checkout, local mocks.yaml and policies.yaml remain the default fallback config. Installed Mirage also ships bundled example defaults, so the CLI and proxy still boot outside the source tree.

Example policy:

policies:
  - name: enforce_bid_limit
    method: POST
    path: /v1/submit_bid
    field: bid_amount
    operator: lte
    value: 10000
    message: Agents cannot submit bids above the approved threshold.
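
To make the semantics of a rule like this concrete, here is a minimal sketch of how such a policy could be checked against a request payload. It is not Mirage's implementation: `violations` is a hypothetical function, only the `lte` operator from the example is supported, and treating a missing field as a violation is an assumption.

```python
import operator
from typing import Any, Callable, Dict, List, Mapping

# Only "lte" appears in the example policy; other operators are left out on purpose.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {"lte": operator.le}


def violations(
    policies: List[Mapping[str, Any]],
    method: str,
    path: str,
    payload: Mapping[str, Any],
) -> List[str]:
    """Return the messages of every policy the request breaks."""
    messages: List[str] = []
    for policy in policies:
        # A policy applies only to its own method and path.
        if policy["method"] != method or policy["path"] != path:
            continue
        compare = OPERATORS[policy["operator"]]
        actual = payload.get(policy["field"])
        # Assumption: a missing field counts as a violation.
        if actual is None or not compare(actual, policy["value"]):
            messages.append(policy["message"])
    return messages


if __name__ == "__main__":
    bid_limit = {
        "name": "enforce_bid_limit",
        "method": "POST",
        "path": "/v1/submit_bid",
        "field": "bid_amount",
        "operator": "lte",
        "value": 10000,
        "message": "Agents cannot submit bids above the approved threshold.",
    }
    print(violations([bid_limit], "POST", "/v1/submit_bid", {"bid_amount": 12000}))
```

With the example policy, a bid of 7500 produces no messages, while a bid of 12000 produces the threshold message.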

Optional environment variables:

  • MIRAGE_PROXY_URL
  • MIRAGE_RUN_ID
  • MIRAGE_MOCKS_PATH
  • MIRAGE_POLICIES_PATH
  • MIRAGE_ARTIFACT_ROOT

Validate config before a local run or CI job:

make mirage-validate-config

Procurement Harness

The default onboarding path now lives in examples/procurement_harness/.

It gives one coherent workflow instead of isolated request demos:

  • look up an approved supplier
  • submit a compliant or risky bid
  • inspect Mirage outcomes and trace paths

Primary commands:

make proxy-procurement
make procurement-demo-safe
make procurement-demo-risky
make procurement-demo-unmatched
make test-procurement

Harness docs:

Action Review Console

Mirage currently ships two console surfaces over the same review backend:

  • demo_ui/: the shared FastAPI console API plus a zero-dependency legacy HTML shell
  • ui/: a richer Next.js operator client that consumes that API

Both read Mirage trace artifacts, show aggregate action metrics, surface recent risky runs, and let you drill into one run at a time.

The shared backend still supports the scenario launcher for founder demos, but the primary value of the console is now:

  • aggregate action counts across runs
  • review queue for recent runs that need attention
  • top endpoints by action volume
  • top policy failures
  • overview-first run detail with request, outcome, policy reasoning, and trace
  • per-run graph view for decision flow review

Start it with:

make demo-ui

Then open http://127.0.0.1:5100. Override the port with PORT=5101 make demo-ui if needed.

For the Next.js client:

make ui-install
make ui-dev-local

Then open http://127.0.0.1:3000.

For live demos, use the terminal-first script in docs/live-demo-script.md.

Example Scenarios

This repo now includes three canonical example flows:

Worklog

Create a new implementation review entry with:

make worklog TITLE="Short Task Title"

The template and index live in docs/worklog/.

Repo Structure

Supporting Docs

Contributing

Bug reports and pull requests are welcome. See CONTRIBUTING.md for the local dev loop and expectations, CODE_OF_CONDUCT.md for community standards, and SECURITY.md for private vulnerability reporting.

Source install

For a development checkout:

git clone https://github.com/ysham123/Mirage
cd Mirage
pip install setuptools wheel
pip install -e '.[dev]'

Or, with the bundled Makefile:

make install

The editable install exposes the mirage console script and the mirage Python package directly from your checkout.

License

Mirage is released under the MIT License.
