A safety-gated, streaming, tool-using agent SDK with hooks, subagents, structured output, multi-turn sessions, and MCP.

operative

An AI-agnostic agent SDK and governed multi-backend router. By AXE Technologies.

One binary serves any stack, with Tier-0 data isolation, a full audit trail, and cost governance built in.

The core imports no vendor SDK and hardcodes no model. Backends are pluggable adapters behind protocols; a deployment YAML wires the stack. The same binary runs self-hosted in your own perimeter or as a managed service.

Why operative

AI-agnostic. The core is a clean protocol layer. Vendors (Anthropic, OpenAI, Ollama, MLX, vLLM, Qdrant, Postgres, Pinecone, Weaviate) live only inside adapters, imported lazily. Add Groq or Mistral with one adapter and a YAML line.
Governed by construction. Tier-0 data never leaves local hardware: a constitution enforces it before any backend runs, and fails fast and safe if a request would violate it. Every decision is replayable in an audit trail.
A spectrum of agency. A single named level dials autonomy from one deterministic call up to a fully autonomous agent. The catastrophic safety floor and the constitution hold identically at every level.
Cost governance. Per-tenant budgets, cheap-first cascade execution, an L0 response cache, and cost-anomaly detection.
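
The "governed by construction" point can be pictured as a pre-flight check that runs before any backend is called. What follows is a minimal sketch, not operative's implementation: `Backend`, `ConstitutionViolation`, and `check_constitution` are hypothetical names, and it assumes Tier-0 is represented as the integer 0 and that each backend declares whether it runs on local hardware.

```python
from dataclasses import dataclass


class ConstitutionViolation(Exception):
    """Raised before any backend is called (hypothetical name)."""


@dataclass(frozen=True)
class Backend:
    name: str
    local: bool  # assumption: True when the backend runs on owned hardware


def check_constitution(data_tier: int, backend: Backend) -> None:
    """Fail fast if Tier-0 data would leave local hardware."""
    if data_tier == 0 and not backend.local:
        raise ConstitutionViolation(
            f"Tier-0 request may not be sent to remote backend {backend.name!r}"
        )


def run(data_tier: int, backend: Backend, prompt: str) -> str:
    check_constitution(data_tier, backend)  # runs before the backend does
    return f"{backend.name} handled: {prompt}"  # stand-in for a real call
```

Because the check raises before the backend is touched, a violating request never produces a partial call that would need to be undone.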

Install

pip install operative-ai       # the import name stays `import operative`
pip install -e .               # from source

Quickstart: the SDK

from operative import Agent, tool, HTTPAgentModel

@tool
def add(a: int, b: int) -> int:
    "Add two integers."
    return a + b

agent = Agent(HTTPAgentModel("http://localhost:8095", "qwen"), tools=[add], workspace="./repo")
result = agent.run_sync("add 2 and 3, then write it to sum.txt")
print(result.answer)

Tool use with a safety floor, streaming, cost metering, self-correction, lifecycle hooks, subagent delegation, structured output, and MCP (stdio and HTTP) are wired in by default. See docs/SDK.md.
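
A decorator such as `@tool` typically needs a machine-readable description of the function it wraps. How operative does this is not stated here, so the following is only an illustrative sketch using the standard library: `describe_tool` is a hypothetical helper, and the dictionary layout is an assumption rather than the SDK's schema.

```python
import inspect
from typing import Any, Callable


def describe_tool(fn: Callable[..., Any]) -> dict:
    """Build a plain-dict description of a function (hypothetical helper)."""
    sig = inspect.signature(fn)
    params = {
        # Use the type's name when the annotation is a class, else its string form.
        name: getattr(p.annotation, "__name__", str(p.annotation))
        for name, p in sig.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": params,
    }


def add(a: int, b: int) -> int:
    "Add two integers."
    return a + b


spec = describe_tool(add)
```

The docstring doubles as the tool description, which is one reason the quickstart's `add` carries one.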

The router: serve any stack from a YAML

operative serve --deployment deployment.yaml

A deployment names its backends, routing policy, and governance. The same binary serves an internal MLX-plus-cloud stack, a customer on OpenAI plus Pinecone, or a laptop running Ollama. The HTTP surface routes and governs every request:

POST /v1/route            route and execute one inference (tier-aware, governed)
GET  /v1/explain/{id}     replay the full decision trail for a request
GET  /v1/stats            per-backend latency and cost
GET  /healthz /info       liveness and backends
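
A client for these endpoints might be sketched with the standard library as below. The paths come from the list above; everything else is an assumption: the base URL, the helper names, and especially the JSON body of `POST /v1/route`, whose real schema is defined in docs/ROUTER.md and not shown here. The sketch only builds the requests and does not send them.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: wherever `operative serve` listens


def route_request(payload: dict) -> Request:
    """Build POST /v1/route with a JSON body (body fields are assumptions)."""
    return Request(
        f"{BASE_URL}/v1/route",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def explain_request(request_id: str) -> Request:
    """Build GET /v1/explain/{id} to replay one request's decision trail."""
    return Request(f"{BASE_URL}/v1/explain/{quote(request_id, safe='')}", method="GET")


# Hypothetical payload, for illustration only.
req = route_request({"prompt": "summarize this", "tier": 1})
# Send with urllib.request.urlopen(req) once a server is running.
```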

from operative import RoutedAgent, load_deployment, make_resolver

agent = RoutedAgent.from_deployment(load_deployment("deployment.yaml"), make_resolver())

async def handle(request):
    # `await` is only valid inside a coroutine; the caller supplies `request`
    return await agent.run(request)      # routed, governed, audited

See docs/ROUTER.md and docs/OPERATIVEoptimizations.md.

What's in the box

Each feature, with a short summary and a link to its deep-dive.

AI-agnostic agent SDK. Tool use with a safety floor, streaming, self-correction, lifecycle hooks, structured output, multi-turn conversations with token-budget compaction, subagent delegation, MCP over both stdio and HTTP, and a human-in-the-loop approval channel. One import (from operative import Agent, tool) is the whole adoption surface. See docs/SDK.md.

Governed multi-backend router. operative serve --deployment deployment.yaml turns a single YAML into a routing HTTP server. The same binary serves an internal MLX-plus-cloud stack, a customer on OpenAI plus Pinecone, or a laptop on Ollama. Inference backends: Anthropic, OpenAI (and any OpenAI-compatible server: vLLM, LM Studio, Together, Groq), Ollama, MLX. Knowledge backends: Qdrant, Postgres, Pinecone, Weaviate, plus two embedders. All are lazy-imported adapters behind a registry. See docs/ROUTER.md.

Governance by construction. A constitution checks every request before any backend runs and fails fast: Tier-0 data isolation (sensitive data never leaves owned hardware), plus cost and latency ceilings. Below it sits the IronGate catastrophic floor, which denies the truly dangerous (rm -rf /) in every mode. Permission modes (auto / ask / acceptEdits / plan) and a spectrum of named agency levels dial autonomy from a single deterministic call to a fully autonomous loop, with the floor and constitution holding identically at each level. Every decision is replayable via explain(). See docs/ROUTER.md and docs/SDK.md.
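
The relationship between the floor and the permission modes can be sketched as a two-step decision: the floor is consulted first and ignores the mode, and only then does the mode apply. This is an illustration, not IronGate itself. `hits_floor` and `decide` are hypothetical names, the floor here recognizes only the `rm -rf /` example from the text, and the behaviour assigned to `ask`, `acceptEdits`, and `plan` for shell commands is an assumption.

```python
import shlex

MODES = ("auto", "ask", "acceptEdits", "plan")  # names from the text above


def hits_floor(command: str) -> bool:
    """True for a recursive forced delete of the filesystem root."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable input is treated as dangerous
    if not tokens or tokens[0] != "rm":
        return False
    flags = "".join(t.lstrip("-") for t in tokens[1:] if t.startswith("-"))
    targets = [t for t in tokens[1:] if not t.startswith("-")]
    recursive_force = "r" in flags.lower() and "f" in flags
    return recursive_force and any(t in ("/", "/*") for t in targets)


def decide(command: str, mode: str) -> str:
    """Return 'deny', 'ask', or 'allow' for a shell command."""
    if mode not in MODES:
        raise ValueError(f"unknown permission mode: {mode}")
    if hits_floor(command):
        return "deny"  # the floor is checked first and ignores the mode
    if mode == "auto":
        return "allow"
    if mode == "plan":
        return "deny"  # assumption: plan mode proposes but never executes
    return "ask"  # assumption: ask and acceptEdits confirm shell commands
```

Checking the floor before branching on the mode is what lets it hold identically at every level: no mode has a code path that skips it.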

Capabilities: the third governance axis. Where agency governs how many human checkpoints and data tier governs classification, capabilities govern which surfaces a deployment may touch at all (shell, network egress, filesystem write, browser eval). A CapabilityManifest and named TrustTier presets (locked / standard / trusted) are checked by a constitutional rule before execution, giving a TrustTier x AgencyLevel x DataTier cube that makes postures like "fully autonomous but think-only" first-class. Unconfigured deployments grant everything (backward compatible); naming a tier fails closed. See docs/CAPABILITIES.md.
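
The two defaults described above (an unconfigured deployment grants everything, a named tier fails closed) can be sketched as a small lookup. The tier names and surfaces come from the text; the contents of each preset, the surface identifiers, and the function names are assumptions made for illustration and are not the real `CapabilityManifest` API.

```python
from typing import Optional

SURFACES = frozenset({"shell", "network_egress", "filesystem_write", "browser_eval"})

# Assumption: what each preset grants is invented for illustration.
TRUST_TIERS = {
    "locked": frozenset(),
    "standard": frozenset({"filesystem_write", "network_egress"}),
    "trusted": SURFACES,
}


def granted(tier: Optional[str]) -> frozenset:
    """Resolve a tier name to its set of granted surfaces."""
    if tier is None:
        return SURFACES  # unconfigured: grant everything (backward compatible)
    if tier not in TRUST_TIERS:
        raise ValueError(f"unknown trust tier: {tier!r}")  # fail closed
    return TRUST_TIERS[tier]


def may_use(tier: Optional[str], surface: str) -> bool:
    """Capability check run before a tool touches a surface."""
    return surface in granted(tier)
```

In this sketch, "fully autonomous but think-only" corresponds to the `locked` tier combined with the highest agency level: the loop may run unattended, but every surface check returns False.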

Neo: learned routing. An optional learned head ranks the backends the policy has already deemed eligible, at the cost of one cheap scoring pass per request (the Fugu insight that model selection beats model quality). The head proposes; the constitution still disposes, so Neo can never widen the allowed set. A reference offline trainer mines operative's own audit log into ranked examples and fits the head. See docs/NEO.md.
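
The property that a learned ranker can reorder but never widen the allowed set follows from ranking only over the eligible list. A minimal sketch, with a hypothetical `rank_eligible` function and plain floats standing in for the learned head's scores:

```python
from typing import Dict, List


def rank_eligible(eligible: List[str], scores: Dict[str, float]) -> List[str]:
    """Order the policy's eligible backends by learned score, best first.

    Backends missing from `scores` keep their policy order at the end.
    Scores for backends outside `eligible` are ignored, so the result is
    always a permutation of `eligible`.
    """
    position = {name: i for i, name in enumerate(eligible)}
    return sorted(
        eligible,
        key=lambda name: (-scores.get(name, float("-inf")), position[name]),
    )
```

Even if the scorer rates an ineligible backend highest, it cannot appear in the output, because the function only iterates over `eligible`.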

Speech. A backend family for voice: pluggable STT, TTS, and VAD adapters (local faster-whisper / Kokoro / Silero are Tier-0-safe; OpenAI-compatible engines plug in too), a realtime duplex SpeechSession with barge-in interruption, and an OpenAI-compatible serve surface (operative serve --speech, exposing /v1/audio/transcriptions and /v1/audio/speech). See docs/SPEECH.md.

Browser automation. A backend family for the web: a curated ~16-primitive BrowserBackend protocol (navigate, scrape, search, screenshot, act, script) with surfboard and Safari adapters, surfaced to an agent as governed browser_tools so navigation and JavaScript execution pass through the capability and safety gates. See docs/BROWSER.md.

Optimization and learning. Opt-in and inert when unset: an L0 response cache, per-tenant budgets with cheapest-first cascade, adaptive failure memory, and a bandit explore/exploit reorderer, all running inside the governance envelope. See docs/OPERATIVEoptimizations.md.
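
How a response cache, a budget, and a cheapest-first cascade compose can be sketched in a few lines. This is an illustration under stated assumptions, not operative's optimizer: the `Cascade` class is hypothetical, a backend signals "escalate" by returning None, cost is a flat per-call number, and the cache is an exact-match lookup on the prompt.

```python
import hashlib
from typing import Callable, Dict, List, Optional, Tuple

# (name, cost per call, callable returning an answer or None to escalate)
Backend = Tuple[str, float, Callable[[str], Optional[str]]]


class Cascade:
    def __init__(self, backends: List[Backend], budget: float) -> None:
        self.backends = sorted(backends, key=lambda b: b[1])  # cheapest first
        self.remaining = budget  # a single tenant's budget
        self.cache: Dict[str, str] = {}  # exact-match response cache

    def ask(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self.cache:
            return self.cache[key]  # a cache hit costs nothing
        for name, cost, call in self.backends:
            if cost > self.remaining:
                break  # later backends cost at least as much
            self.remaining -= cost
            answer = call(prompt)
            if answer is not None:
                self.cache[key] = answer
                return answer
        raise RuntimeError("no backend answered within the remaining budget")
```

A repeated prompt is served from the cache without touching the budget, and an expensive backend is only paid for when every cheaper one has declined.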

Observability and auth. OTLP traces, metrics, and logs without the heavy opentelemetry-sdk dependency; a Prometheus surface; usage and cost metering with anomaly detection. Auth is a pluggable authenticator with push step-up for sensitive operations (Tier-0 or over-budget), including an AuthGate adapter.

Multi-tenancy. Per-tenant workspaces, knowledge scoping, and budgets, plus a white-label embeddable app (build_app) so one image serves many branded tenants from config and keys.

Command line

operative run --model <url> --workspace ./repo "task"   # run a task to completion
operative serve --deployment deployment.yaml            # the governed router HTTP server
operative serve --deployment deployment.yaml --speech   # the OpenAI-compatible voice server
operative serve --config agent.yaml --api-keys-file k   # the white-label tool-use agent
operative capabilities --deployment customer.yaml       # audit what a config permits
operative capabilities --deployment new.yaml --diff old.yaml   # capability/backend delta

Architecture

+---------------------------------------------------------------+
|  request                                                      |
|    -> authenticate        credential to a tenant principal    |
|    -> route               policy picks a backend chain        |
|    -> constitution        Tier-0 / cost / latency, fail fast  |
|    -> execute             run the chain with fallback         |
|    -> audit + metrics     every decision logged and measured  |
+---------------------------------------------------------------+
   backends and knowledge stores are pluggable adapters
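
The execute and audit stages of the lifecycle above can be sketched as a fallback loop that records each decision under a request id, so the trail can be replayed later. This is a simplified illustration: authentication and the constitution check are omitted, and `handle`, `explain`, and the in-memory `AUDIT` store are hypothetical stand-ins rather than operative's API.

```python
import uuid
from typing import Callable, Dict, List, Tuple

AUDIT: Dict[str, List[str]] = {}  # request id -> ordered decision trail


def handle(prompt: str, chain: List[Callable[[str], str]]) -> Tuple[str, str]:
    """Run a backend chain with fallback, logging each decision."""
    request_id = uuid.uuid4().hex
    trail = AUDIT.setdefault(request_id, [])
    trail.append(f"route: chain of {len(chain)} backend(s)")
    for backend in chain:
        try:
            answer = backend(prompt)
        except Exception as exc:  # any failure falls through to the next backend
            trail.append(f"execute: {backend.__name__} failed ({exc})")
            continue
        trail.append(f"execute: {backend.__name__} succeeded")
        return request_id, answer
    trail.append("execute: chain exhausted")
    raise RuntimeError("every backend in the chain failed")


def explain(request_id: str) -> List[str]:
    """Replay the recorded decisions for one request."""
    return list(AUDIT[request_id])
```

Recording the failure before moving on is what makes the trail replayable: the log shows not only which backend answered but which ones were tried first and why they were skipped.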

Documentation

Full index in docs/README.md. The guides:

docs/SDK.md - the agent SDK: tools, MCP, structured output, hooks, agency
docs/ROUTER.md - the multi-backend router, backend matrix, and constitution
docs/CAPABILITIES.md - capability grants, trust tiers, the governance cube
docs/NEO.md - learned routing and the offline head trainer
docs/SPEECH.md - STT/TTS/VAD backends, the realtime SpeechSession, voice serve
docs/BROWSER.md - the browser backend family and governed browser tools
docs/OPERATIVEoptimizations.md - composing governance, optimization, and learning
docs/ERROR-HANDLING.md - error semantics
ARCHITECTURE.md - the request lifecycle and module map
examples/ - runnable examples and deployment configs

License

MIT. See LICENSE.

Project details

This version

0.1.0

Jun 25, 2026

Download files

Source Distribution

operative_ai-0.1.0.tar.gz (184.3 kB view details)

Uploaded Jun 25, 2026 Source

Built Distribution

operative_ai-0.1.0-py3-none-any.whl (234.4 kB view details)

Uploaded Jun 25, 2026 Python 3

File details

Details for the file operative_ai-0.1.0.tar.gz.

File metadata

Download URL: operative_ai-0.1.0.tar.gz
Upload date: Jun 25, 2026
Size: 184.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for operative_ai-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`98ba6fa38e27fe23c79b4002b2cb2b689c1e61a4ea7d35dc279e88f94f14d228`
MD5	`cf4954908c71a8396b8c0d575ac4cc49`
BLAKE2b-256	`455e99a3636cf4ad49228b99811a240ea2fc589a7687073852b7b376a17ca316`

File details

Details for the file operative_ai-0.1.0-py3-none-any.whl.

File metadata

Download URL: operative_ai-0.1.0-py3-none-any.whl
Upload date: Jun 25, 2026
Size: 234.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for operative_ai-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c04a7acfffe9f93a48d1ccd546867a0fe497f27fe9ef76b98e6670b5f8b7ae47`
MD5	`23ed2eb28bb7e178bd646aa29290f7dc`
BLAKE2b-256	`8da4074ae5f30c548eb49b1ba8b9b4143a489a21fdc170bd5cfe8a10be5b24ce`
