
Self-expanding, self-repairing, human-correctable tool infrastructure for AI agents


Toolwright

Give your AI agent any API. Safely.

Toolwright turns API traffic into governed MCP tools and keeps them working. Point it at an OpenAPI spec, a HAR file, or a live web app; it compiles typed tool definitions, classifies every tool by risk, and serves them through an MCP server that enforces approval gates, circuit breakers, and behavioral rules at runtime. Your credentials are automatically redacted from captured traffic. Nothing runs without your explicit sign-off.


What It Looks Like

$ toolwright mint https://dashboard.stripe.com -a api.stripe.com
   Captured 47 API calls across 12 endpoints
   Compiled 12 tools (8 read, 3 write, 1 admin)
   Risk classified: 3 low, 6 medium, 2 high, 1 critical
   Auth detected: Bearer token (Authorization header)
   Credentials redacted from captured traffic

  Set before serving:
    export TOOLWRIGHT_AUTH_API_STRIPE_COM="Bearer <your-token>"

$ toolwright gate allow --all        # review and approve every tool
$ toolwright serve                   # start the governed MCP server
  12 governed tools ready

The -a flag specifies which host to capture -- only traffic to that host is recorded. For OpenAPI specs and HAR files, the host is detected automatically.

# Later, the API changes under you:
$ toolwright repair plan
  SAFE (auto-apply):
    + update_user: response field added (role)
  APPROVAL_REQUIRED:
    ~ delete_user: path changed /users/{id} -> /v2/users/{id}

$ toolwright repair apply
  Applied 1 safe patch. 1 queued for your review.

No paths to memorize. Toolwright auto-detects your toolpack when there's only one. For multiple toolpacks, run toolwright use stripe to set a default.

Try It Now

pip install toolwright
toolwright demo

The demo compiles a governed toolpack from bundled traffic, enforces fail-closed gates, and writes a full audit log. An exit code of 0 means every safety check passed.

Commands

Getting started:

toolwright demo          # see it work (60 seconds)
toolwright ship          # build + approve + serve (your API)
toolwright ship <url>    # one-command onboarding from URL
toolwright serve         # run your governed MCP server (stdio)
toolwright serve --http  # serve over HTTP with web dashboard

Operations:

toolwright drift         # check for API changes
toolwright repair plan   # see what needs fixing
toolwright repair apply  # apply fixes
toolwright kill <tool>   # emergency stop a tool
toolwright quarantine    # list stopped tools
toolwright watch status  # reconciliation status

Sharing:

toolwright share <toolpack>   # package into signed .twp bundle
toolwright install <file.twp> # verify + install a shared bundle

All commands: toolwright --help

Why This Exists

APIs change silently. Tools break with no warning. Nobody knows until the agent starts failing. And giving an AI agent API access today still means writing MCP tool definitions by hand, then hoping it doesn't call a destructive endpoint.

Toolwright closes that gap. It compiles tools from real API traffic, classifies them by risk, enforces approval gates before anything can run, and automatically circuit-breaks tools that start failing -- before the failures cascade to your agent.

How It Stays Safe

Secrets are redacted before anything reaches disk. Captured traffic is redacted in memory -- tokens, cookies, API keys, and PII are stripped before toolpacks, logs, and evidence bundles are written. Auth is injected at runtime via environment variables, never stored in any artifact Toolwright produces.
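
The redact-before-write idea can be illustrated with a minimal sketch. It is not Toolwright's implementation: the header list, the token pattern, the placeholder, and the function names are all assumptions made for illustration, and the environment variable naming is inferred from the single TOOLWRIGHT_AUTH_API_STRIPE_COM example above.

```python
import copy
import re

# Hypothetical list of sensitive headers; Toolwright's real list is not documented here.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-api-key"}
# Hypothetical pattern for bearer tokens embedded in a request or response body.
TOKEN_PATTERN = re.compile(r"(Bearer\s+)[A-Za-z0-9._~+/=-]+")
PLACEHOLDER = "[REDACTED]"


def redact_entry(entry: dict) -> dict:
    """Return a redacted copy of one captured request; the input is left untouched."""
    clean = copy.deepcopy(entry)
    for name in list(clean.get("headers", {})):
        if name.lower() in SENSITIVE_HEADERS:
            clean["headers"][name] = PLACEHOLDER
    body = clean.get("body")
    if isinstance(body, str):
        clean["body"] = TOKEN_PATTERN.sub(r"\1" + PLACEHOLDER, body)
    return clean


def auth_env_var(host: str) -> str:
    """Guess the env var that carries auth for a host (assumed naming convention)."""
    return "TOOLWRIGHT_AUTH_" + re.sub(r"[^A-Za-z0-9]", "_", host).upper()


captured = {
    "method": "GET",
    "url": "https://api.example.com/v1/users",
    "headers": {"Authorization": "Bearer sk_live_abc123", "Accept": "application/json"},
    "body": "",
}
redacted = redact_entry(captured)
print(redacted["headers"])
print(auth_env_var("api.stripe.com"))
```

Redaction happens on an in-memory copy, so only the cleaned entry would ever be serialized; the secret itself is supplied separately at serve time through the environment.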

Nothing runs without approval. Toolwright is fail-closed. Every tool must pass through a gate review before it can execute. The approval is cryptographically signed and recorded in a tamper-evident lockfile. If a tool isn't explicitly approved, it doesn't run. There is no "allow by default" mode.
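
A fail-closed gate backed by signed approvals might look roughly like the sketch below. The signing scheme is an assumption: this example uses an HMAC over the tool definition, and the lockfile layout and function names are hypothetical. Toolwright's actual signature format is not described here.

```python
import hashlib
import hmac
import json


def sign_approval(tool: dict, key: bytes) -> str:
    """Sign the canonical JSON form of a tool definition (illustrative scheme)."""
    payload = json.dumps(tool, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def is_allowed(tool: dict, lockfile: dict, key: bytes) -> bool:
    """Fail closed: a tool runs only if a valid signature for it is on record."""
    recorded = lockfile.get(tool["name"])
    if recorded is None:
        return False  # never approved, so it does not run
    # A changed definition no longer matches its recorded signature.
    return hmac.compare_digest(recorded, sign_approval(tool, key))


key = b"demo-signing-key"  # placeholder key for the example only
tool = {"name": "get_user", "method": "GET", "path": "/users/{id}"}
lockfile = {tool["name"]: sign_approval(tool, key)}

approved = is_allowed(tool, lockfile, key)
tampered = is_allowed({**tool, "path": "/v2/users/{id}"}, lockfile, key)
unknown = is_allowed({"name": "delete_user", "method": "DELETE", "path": "/users/{id}"}, lockfile, key)
print(approved, tampered, unknown)
```

The default branch is the refusal: a missing or mismatched lockfile entry yields False, so there is no code path that allows an unreviewed tool.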

You see everything before it ships. During compilation, every tool is classified by risk tier -- critical (destructive operations), high (writes), medium (sensitive reads), low (read-only). You approve tools individually or by tier, with full visibility into what each one does.
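
As a rough illustration of the four tiers, the sketch below classifies a tool from its HTTP method and path. The heuristics and the keyword list are assumptions chosen for the example; the rules Toolwright actually applies during compilation may differ.

```python
# Hypothetical path keywords that mark a read as sensitive.
SENSITIVE_HINTS = ("token", "secret", "payment", "billing", "password")


def classify_risk(method: str, path: str) -> str:
    """Map an endpoint to one of the four tiers named above (illustrative rules)."""
    method = method.upper()
    if method == "DELETE":
        return "critical"  # destructive operation
    if method in {"POST", "PUT", "PATCH"}:
        return "high"      # write
    if any(hint in path.lower() for hint in SENSITIVE_HINTS):
        return "medium"    # sensitive read
    return "low"           # read-only


def select_by_tier(tools: dict, tiers: set) -> list:
    """Return the names of tools whose tier is in `tiers`, e.g. for bulk approval."""
    return sorted(
        name for name, (method, path) in tools.items()
        if classify_risk(method, path) in tiers
    )


tools = {
    "list_users": ("GET", "/users"),
    "list_invoices": ("GET", "/billing/invoices"),
    "create_user": ("POST", "/users"),
    "delete_user": ("DELETE", "/users/{id}"),
}
print(select_by_tier(tools, {"low", "medium"}))
```

Approving by tier then reduces to selecting a subset of names, which leaves the high and critical tools waiting for individual review.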

Agents propose, you decide. Agents can propose new API capabilities and suggest behavioral rules through MCP meta-tools. Both create DRAFT proposals that stay inactive until you explicitly activate them, so the agent never gains a capability you haven't reviewed and approved.

When APIs Break

Drift detection and repair. When an API changes under you, drift detection catches it and repair proposes classified fixes -- safe (auto-apply), approval-required, or manual:

toolwright drift                           # detect what changed
toolwright repair plan                     # Terraform-style diff
toolwright repair apply                    # apply with confirmation
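
The classification step can be sketched as a comparison between the stored tool definitions and freshly observed ones. This is an illustration under assumptions, not the repair engine itself: the dictionary layout, the Change type, and the choice to treat removed fields and vanished endpoints as manual fixes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    tool: str
    kind: str    # "SAFE", "APPROVAL_REQUIRED", or "MANUAL"
    detail: str


def plan_repairs(old: dict, new: dict) -> list:
    """Compare stored and observed tool definitions and classify each difference."""
    changes = []
    for name, before in old.items():
        after = new.get(name)
        if after is None:
            changes.append(Change(name, "MANUAL", "endpoint no longer observed"))
            continue
        if before["path"] != after["path"]:
            detail = f"path changed {before['path']} -> {after['path']}"
            changes.append(Change(name, "APPROVAL_REQUIRED", detail))
        for field in sorted(set(after["fields"]) - set(before["fields"])):
            changes.append(Change(name, "SAFE", f"response field added ({field})"))
        for field in sorted(set(before["fields"]) - set(after["fields"])):
            changes.append(Change(name, "MANUAL", f"response field removed ({field})"))
    return changes


old = {
    "update_user": {"path": "/users/{id}", "fields": ["id", "name"]},
    "delete_user": {"path": "/users/{id}", "fields": ["id"]},
}
new = {
    "update_user": {"path": "/users/{id}", "fields": ["id", "name", "role"]},
    "delete_user": {"path": "/v2/users/{id}", "fields": ["id"]},
}
plan = plan_repairs(old, new)
for change in plan:
    print(f"{change.kind}: {change.tool}: {change.detail}")
```

Additive changes are the only ones marked safe here, which mirrors the earlier example output where an added response field auto-applies and a path change waits for review.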

Continuous reconciliation. Start the MCP server with --watch and Toolwright monitors every tool on a risk-tier schedule. When drift is detected, safe patches auto-apply; risky ones queue for your review:

toolwright serve --watch --auto-heal safe
toolwright watch status                    # see per-tool health

Snapshots and rollback. Every auto-repair is preceded by a snapshot. If something goes wrong, restore the exact previous state:

toolwright snapshots                       # list available snapshots
toolwright rollback <snapshot-id>          # restore
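
The snapshot-before-repair pattern is sketched below with an in-memory store. The class, the ID format, and the toolpack layout are hypothetical and only illustrate the ordering guarantee: the copy is taken before the patch touches anything.

```python
import copy
import itertools


class SnapshotStore:
    """Keeps deep copies of toolpack state, keyed by a generated ID (illustrative)."""

    def __init__(self):
        self._snapshots = {}
        self._ids = itertools.count(1)

    def take(self, toolpack: dict) -> str:
        snapshot_id = f"snap-{next(self._ids):04d}"  # hypothetical ID format
        self._snapshots[snapshot_id] = copy.deepcopy(toolpack)
        return snapshot_id

    def list_ids(self) -> list:
        return sorted(self._snapshots)

    def restore(self, snapshot_id: str) -> dict:
        return copy.deepcopy(self._snapshots[snapshot_id])


def auto_repair(toolpack: dict, patch, store: SnapshotStore) -> str:
    """Snapshot first, then apply the patch in place; return the snapshot ID."""
    snapshot_id = store.take(toolpack)
    patch(toolpack)
    return snapshot_id


store = SnapshotStore()
toolpack = {"delete_user": {"path": "/users/{id}"}}
snap = auto_repair(
    toolpack,
    lambda tp: tp["delete_user"].update(path="/v2/users/{id}"),
    store,
)
restored = store.restore(snap)
print(toolpack["delete_user"]["path"], restored["delete_user"]["path"])
```

Because the store holds deep copies, later edits to the live toolpack cannot alter a snapshot, so a restore returns the exact state from before the repair.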

Runtime Safety

Circuit breakers. When an API starts failing, per-tool circuit breakers trip automatically after repeated errors, blocking further calls until the API recovers. You can also kill or re-enable tools manually:

toolwright kill search_api --reason "Upstream 500s"
toolwright quarantine             # see what's killed and why
toolwright enable search_api      # bring it back
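
For readers unfamiliar with the pattern, here is a minimal per-tool circuit breaker. It is a generic sketch, not Toolwright's code: the threshold, the recovery window, and the state names are assumptions, and the clock is injectable so the example runs without waiting.

```python
import time


class CircuitBreaker:
    """Trips after repeated failures and allows a retry after a recovery window."""

    def __init__(self, failure_threshold=3, recovery_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.recovery_seconds:
            return "half_open"  # one trial call is allowed through
        return "open"

    def allow(self) -> bool:
        return self.state != "open"

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        # A failed trial call, or too many failures in a row, (re)opens the breaker.
        if self.state == "half_open" or self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


now = [0.0]  # fake clock so the demo is deterministic
breaker = CircuitBreaker(failure_threshold=3, recovery_seconds=30.0, clock=lambda: now[0])
for _ in range(3):
    breaker.record_failure()
states = [breaker.state]
now[0] = 31.0
states.append(breaker.state)
breaker.record_success()
states.append(breaker.state)
print(states)
```

A manual kill corresponds to forcing the open state regardless of the failure count, and a manual enable corresponds to resetting it.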


Behavioral rules. Define constraints that persist across agent sessions -- no retraining, no prompt engineering. When a rule is violated, the agent gets structured feedback explaining what went wrong and how to proceed:

toolwright rules add --kind prerequisite --target update_issue \
  --requires get_repo --description "Read context before modifying"

toolwright rules add --kind prohibition --target delete_contents \
  --description "Never delete repository files"

Six rule types: prerequisites, prohibitions, parameter constraints, rate limits, call sequencing, and approval gates. Agents can suggest new rules via MCP meta-tools. Suggestions start as DRAFT and require toolwright rules activate before taking effect.
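
The two rules added above can be modeled with a small evaluator that returns structured feedback. This sketch covers only prerequisites and prohibitions, and the Rule type, the feedback fields, and the call-history representation are assumptions rather than Toolwright's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    kind: str                       # "prerequisite" or "prohibition"
    target: str                     # tool the rule applies to
    description: str
    requires: Optional[str] = None  # tool that must have been called first


def check_call(tool: str, history: list, rules: list) -> dict:
    """Check a proposed call against the rules and explain any violation."""
    for rule in rules:
        if rule.target != tool:
            continue
        if rule.kind == "prohibition":
            return {
                "allowed": False,
                "rule": rule.kind,
                "reason": rule.description,
                "next_step": "Choose a different tool; this one is prohibited.",
            }
        if rule.kind == "prerequisite" and rule.requires not in history:
            return {
                "allowed": False,
                "rule": rule.kind,
                "reason": rule.description,
                "next_step": f"Call {rule.requires} first, then retry {tool}.",
            }
    return {"allowed": True}


rules = [
    Rule("prerequisite", "update_issue", "Read context before modifying", requires="get_repo"),
    Rule("prohibition", "delete_contents", "Never delete repository files"),
]
blocked = check_call("update_issue", [], rules)
allowed = check_call("update_issue", ["get_repo"], rules)
prohibited = check_call("delete_contents", ["get_repo"], rules)
print(blocked["next_step"])
```

Returning a reason and a next step, rather than a bare refusal, is what lets the agent recover on its own by making the prerequisite call and retrying.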

Start Where You Are

You have...                 Run
A web app                   toolwright mint https://app.example.com -a api.example.com
An OpenAPI spec             toolwright capture import openapi.yaml
A HAR file from DevTools    toolwright capture import traffic.har
OpenTelemetry traces        toolwright capture import traces.json --input-format otel
No idea                     toolwright ship

All paths converge: capture → compile → approve → serve.

What's Inside

Capability  What It Does                                                                    Maturity
Connect     Compile MCP tools from any API source (browser, spec, HAR, OTEL)                Stable
Govern      Risk classification, cryptographic signing, approval gates, audit logging       Stable
Heal        Drift detection, auto-repair, continuous reconciliation, snapshots & rollback   Stable (incl. reconciliation & auto-heal)
Kill        Per-tool circuit breakers with auto-recovery and manual kill switches           Stable
Correct     Persistent behavioral rules with agent suggestion and human-gated activation    Stable
Transport   HTTP server, web dashboard, SSE live feed, token auth                           Stable
Share       Signed .twp bundles for toolpack distribution                                   Stable
Observe     OTEL-compatible tracing, Prometheus metrics (no-op fallback)                    Stable
Notify      Webhook notifications with Slack auto-detection                                 Stable

87 capabilities. 2150+ tests.

Agents introspect their own governance via MCP meta-tools -- check risk summaries, diagnose failures, manage circuit breakers, and read behavioral rules. Agents can also propose new API capabilities and suggest behavioral rules; both create DRAFT proposals that require human approval before taking effect.

Install

pip install toolwright                 # core
pip install "toolwright[playwright]"   # + browser capture
pip install "toolwright[mcp]"          # + MCP server
pip install "toolwright[all]"          # everything

tw works as shorthand for toolwright. Full docs: docs/user-guide.md

License

MIT
