A safety-gated, streaming, tool-using agent SDK with hooks, subagents, structured output, multi-turn sessions, and MCP.
operative
An AI-agnostic agent SDK and governed multi-backend router. By AXE Technologies.
One binary serves any stack, with Tier-0 data isolation, a full audit trail, and cost governance built in.
The core imports no vendor SDK and hardcodes no model. Backends are pluggable adapters behind protocols; a deployment YAML wires the stack. The same binary runs self-hosted in your own perimeter or as a managed service.
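The adapter-behind-a-protocol idea can be sketched as follows. This is a minimal illustration, not operative's actual code: `InferenceBackend`, `register`, `resolve`, and `EchoBackend` are hypothetical names, and the `deployment` dict stands in for a parsed deployment YAML whose real schema is not shown here.

```python
from typing import Callable, Dict, Protocol


class InferenceBackend(Protocol):
    """Hypothetical protocol: any object with a matching complete() qualifies."""

    def complete(self, prompt: str) -> str: ...


# The registry maps an adapter name to a zero-argument factory. A real adapter's
# factory would import its vendor SDK inside the function body, so the import
# only happens when a deployment actually names that adapter.
_REGISTRY: Dict[str, Callable[[], InferenceBackend]] = {}


def register(name: str, factory: Callable[[], InferenceBackend]) -> None:
    _REGISTRY[name] = factory


def resolve(name: str) -> InferenceBackend:
    if name not in _REGISTRY:
        raise KeyError(f"no adapter registered for backend {name!r}")
    return _REGISTRY[name]()


class EchoBackend:
    """Stand-in adapter with no vendor dependency, used only for illustration."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


register("echo", EchoBackend)

# Stand-in for a parsed deployment YAML (the schema is an assumption).
deployment = {"backends": ["echo"]}
backends = [resolve(name) for name in deployment["backends"]]
print(backends[0].complete("hello"))
```

Because the core only ever sees the protocol, adding a vendor in this sketch means writing one class and one `register` call.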
Why operative
- AI-agnostic. The core is a clean protocol layer. Vendors (Anthropic, OpenAI, Ollama, MLX, vLLM, Qdrant, Postgres, Pinecone, Weaviate) live only inside adapters, imported lazily. Add Groq or Mistral with one adapter and a YAML line.
- Governed by construction. Tier-0 data never leaves local hardware: a constitution enforces it before any backend runs, and fails fast and safe if a request would violate it. Every decision is replayable in an audit trail.
- A spectrum of agency. One named level dials autonomy from a single deterministic call up to a fully autonomous agent. The catastrophic safety floor and the constitution hold identically at every level.
- Cost governance. Per-tenant budgets, cheap-first cascade execution, an L0 response cache, and cost-anomaly detection.
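The "governed by construction" point can be illustrated with a small pre-flight check. This is a sketch under stated assumptions, not operative's implementation: `Backend`, `Request`, `ConstitutionViolation`, and `eligible_backends` are hypothetical names, and "tier 0" is modeled simply as an integer field.

```python
from dataclasses import dataclass
from typing import List


class ConstitutionViolation(Exception):
    """Raised before any backend is contacted (hypothetical name)."""


@dataclass(frozen=True)
class Backend:
    name: str
    local: bool  # True when the backend runs on hardware you own


@dataclass(frozen=True)
class Request:
    prompt: str
    tier: int  # 0 marks the most sensitive data in this sketch


def eligible_backends(request: Request, chain: List[Backend]) -> List[Backend]:
    """Filter a candidate chain; fail fast if nothing permitted remains."""
    if request.tier == 0:
        allowed = [b for b in chain if b.local]
    else:
        allowed = list(chain)
    if not allowed:
        raise ConstitutionViolation(
            f"tier-{request.tier} request has no permitted backend"
        )
    return allowed


chain = [Backend("cloud-llm", local=False), Backend("mlx-local", local=True)]
print([b.name for b in eligible_backends(Request("payroll data", tier=0), chain)])
```

The check runs before execution, so a Tier-0 request with only cloud backends available raises instead of silently falling through to a remote call.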
Install
pip install operative-ai # the import name stays `import operative`
pip install -e . # from source
Quickstart: the SDK
from operative import Agent, tool, HTTPAgentModel
@tool
def add(a: int, b: int) -> int:
    "Add two integers."
    return a + b
agent = Agent(HTTPAgentModel("http://localhost:8095", "qwen"), tools=[add], workspace="./repo")
result = agent.run_sync("add 2 and 3, then write it to sum.txt")
print(result.answer)
Tool use with a safety floor, streaming, cost metering, self-correction, lifecycle hooks, subagent delegation, structured output, and MCP (stdio and HTTP) are wired in by default. See docs/SDK.md.
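The source does not show how `@tool` works internally. One common approach, sketched here as an assumption, is to derive a JSON-schema-style description from the function's signature and docstring. `describe_tool` and `_JSON_TYPES` are hypothetical names, not part of operative.

```python
import inspect
from typing import Any, Callable, Dict

# Mapping from Python annotations to JSON-schema type names (illustrative subset).
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def describe_tool(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build a tool description from a function's signature and docstring."""
    properties = {}
    for name, param in inspect.signature(fn).parameters.items():
        # Unannotated or unknown types fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            # Every parameter is treated as required here, defaults included.
            "required": list(properties),
        },
    }


def add(a: int, b: int) -> int:
    "Add two integers."
    return a + b


spec = describe_tool(add)
print(spec["name"], spec["parameters"]["properties"])
```

A description of this shape is what a model typically needs in order to decide when to call a tool and with which arguments.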
The router: serve any stack from a YAML
operative serve --deployment deployment.yaml
A deployment names its backends, routing policy, and governance. The same binary serves an internal MLX-plus-cloud stack, a customer on OpenAI plus Pinecone, or a laptop running Ollama. The HTTP surface routes and governs every request:
POST  /v1/route          route and execute one inference (tier-aware, governed)
GET   /v1/explain/{id}   replay the full decision trail for a request
GET   /v1/stats          per-backend latency and cost
GET   /healthz /info     liveness and backends
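A client for these endpoints might be built as below. This is a sketch only: the base URL, the port, and the JSON payload fields are assumptions, since the request schema is documented in docs/ROUTER.md rather than here. The helpers build `urllib.request.Request` objects without sending them, so the example runs without a server.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Assumption: the host and port of a running `operative serve` are not given here.
BASE_URL = "http://localhost:8000"


def route_request(payload: dict) -> Request:
    """Build a POST /v1/route request; the payload fields are hypothetical."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        f"{BASE_URL}/v1/route",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def explain_request(request_id: str) -> Request:
    """Build a GET /v1/explain/{id} request, escaping the id for use in a path."""
    return Request(f"{BASE_URL}/v1/explain/{quote(request_id, safe='')}", method="GET")


req = route_request({"prompt": "summarize this ticket"})
print(req.get_method(), req.full_url)
print(explain_request("abc123").full_url)
```

Passing either object to `urllib.request.urlopen` would send it to a live server.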
from operative import RoutedAgent, load_deployment, make_resolver
agent = RoutedAgent.from_deployment(load_deployment("deployment.yaml"), make_resolver())
result = await agent.run(request) # routed, governed, audited; call from inside an async function
See docs/ROUTER.md and docs/OPERATIVEoptimizations.md.
What's in the box
Each component, with a short summary and a link to its deep-dive.
AI-agnostic agent SDK. Tool use with a safety floor, streaming, self-correction,
lifecycle hooks, structured output, multi-turn conversations with token-budget compaction,
subagent delegation, MCP over both stdio and HTTP, and a human-in-the-loop approval channel.
One import (from operative import Agent, tool) is the whole adoption surface.
See docs/SDK.md.
Governed multi-backend router. operative serve --deployment deployment.yaml turns a
single YAML into a routing HTTP server. The same binary serves an internal MLX-plus-cloud
stack, a customer on OpenAI plus Pinecone, or a laptop on Ollama. Inference backends:
Anthropic, OpenAI (and any OpenAI-compatible server: vLLM, LM Studio, Together, Groq),
Ollama, MLX. Knowledge backends: Qdrant, Postgres, Pinecone, Weaviate, plus two embedders.
All are lazy-imported adapters behind a registry. See docs/ROUTER.md.
Governance by construction. A constitution checks every request before any backend runs
and fails fast: Tier-0 data isolation (sensitive data never leaves owned hardware), plus
cost and latency ceilings. Below it sits the IronGate catastrophic floor, which denies the
truly dangerous (rm -rf /) in every mode. Permission modes (auto / ask / acceptEdits /
plan) and a spectrum of named agency levels dial autonomy from a single deterministic call
to a fully autonomous loop, with the floor and constitution holding identically at each
level. Every decision is replayable via explain(). See docs/ROUTER.md
and docs/SDK.md.
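The relationship between the catastrophic floor and the permission modes can be sketched as a check that runs first and never consults the mode. This is an illustration, not IronGate itself: `is_catastrophic` and `decide` are hypothetical names, and the single `rm` pattern stands in for whatever rules the real floor applies.

```python
import shlex
from enum import Enum


class Mode(Enum):
    AUTO = "auto"
    ASK = "ask"
    ACCEPT_EDITS = "acceptEdits"
    PLAN = "plan"


def is_catastrophic(command: str) -> bool:
    """Illustrative check: a recursive, forced delete that targets the root."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable input is refused rather than guessed at
    if not tokens or tokens[0] != "rm":
        return False
    flags = "".join(t.lstrip("-") for t in tokens[1:] if t.startswith("-"))
    targets = [t for t in tokens[1:] if not t.startswith("-")]
    return "r" in flags.lower() and "f" in flags and "/" in targets


def decide(command: str, mode: Mode) -> str:
    # The floor runs first and does not look at the mode at all.
    if is_catastrophic(command):
        return "deny"
    # Anything else is left to the mode's own policy, which is not modeled here.
    return f"defer to {mode.value} policy"


for mode in Mode:
    print(mode.value, "->", decide("rm -rf /", mode))
```

Because the floor is evaluated before the mode, loosening the mode cannot loosen the floor.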
Capabilities: the third governance axis. Where agency governs how many human checkpoints apply
and data tier governs classification, capabilities govern which surfaces a deployment may
touch at all (shell, network egress, filesystem write, browser eval). A CapabilityManifest
and named TrustTier presets (locked / standard / trusted) are checked by a
constitutional rule before execution, giving a TrustTier x AgencyLevel x DataTier cube that
makes postures like "fully autonomous but think-only" first-class. Unconfigured deployments
grant everything (backward compatible); naming a tier fails closed.
See docs/CAPABILITIES.md.
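The grant semantics described above (unconfigured grants everything, a named tier fails closed) can be sketched like this. The names `ManifestSketch`, `PRESETS`, and `allowed` are hypothetical, and the contents of each preset are invented for illustration; the real grants are defined in docs/CAPABILITIES.md.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

ALL_SURFACES: FrozenSet[str] = frozenset(
    {"shell", "network_egress", "filesystem_write", "browser_eval"}
)

# Hypothetical preset contents, chosen only for illustration.
PRESETS = {
    "locked": frozenset(),
    "standard": frozenset({"filesystem_write", "network_egress"}),
    "trusted": ALL_SURFACES,
}


@dataclass(frozen=True)
class ManifestSketch:
    tier: Optional[str] = None  # None models an unconfigured deployment

    def granted(self) -> FrozenSet[str]:
        if self.tier is None:
            return ALL_SURFACES  # backward compatible: grant everything
        if self.tier not in PRESETS:
            raise ValueError(f"unknown trust tier {self.tier!r}")
        return PRESETS[self.tier]  # named tier: only listed surfaces pass


def allowed(manifest: ManifestSketch, surface: str) -> bool:
    return surface in manifest.granted()


print(allowed(ManifestSketch(), "shell"))
print(allowed(ManifestSketch("locked"), "shell"))
```

A "fully autonomous but think-only" posture corresponds here to the `locked` preset: the agent may loop freely, yet every surface check returns `False`.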
Neo: learned routing. An optional learned head ranks the backends the policy already deemed eligible (the Fugu insight that model selection beats model quality), at the cost of one cheap scoring pass per request. The head proposes; the constitution still disposes, so Neo can never widen the allowed set. A reference offline trainer mines operative's own audit log into ranked examples and fits the head. See docs/NEO.md.
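The "proposes, never widens" property follows directly if the head is only ever asked to reorder the eligible list. A minimal sketch, assuming a hypothetical `rank_eligible` helper and a plain dict of scores in place of a learned head:

```python
from typing import Callable, List, Sequence


def rank_eligible(eligible: Sequence[str], score: Callable[[str], float]) -> List[str]:
    """Reorder the eligible backends by score; never add or drop one."""
    return sorted(eligible, key=score, reverse=True)


# Stand-in for a learned head. It happens to rate a forbidden backend highest.
scores = {"local-small": 0.2, "cloud-large": 0.9, "forbidden": 0.99}

# The policy and constitution have already produced this list.
eligible = ["local-small", "cloud-large"]

order = rank_eligible(eligible, lambda name: scores.get(name, 0.0))
print(order)
```

The head's preference for `forbidden` has no effect, because scoring only ever iterates over what the constitution already allowed.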
Speech. A backend family for voice: pluggable STT, TTS, and VAD adapters (local
faster-whisper / Kokoro / Silero are Tier-0-safe; OpenAI-compatible engines plug in too), a
realtime duplex SpeechSession with barge-in interruption, and an OpenAI-compatible serve
surface (operative serve --speech, exposing /v1/audio/transcriptions and
/v1/audio/speech). See docs/SPEECH.md.
Browser automation. A backend family for the web: a curated ~16-primitive
BrowserBackend protocol (navigate, scrape, search, screenshot, act, script) with surfboard
and Safari adapters, surfaced to an agent as governed browser_tools so navigation and
JavaScript execution pass through the capability and safety gates. See
docs/BROWSER.md.
Optimization and learning. Opt-in and inert when unset: an L0 response cache, per-tenant budgets with cheapest-first cascade, adaptive failure memory, and a bandit explore/exploit reorderer, all running inside the governance envelope. See docs/OPERATIVEoptimizations.md.
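How a response cache, a budget, and a cheapest-first cascade might compose is sketched below. This is not operative's implementation: `Tier`, `cascade`, and the integer "cost units" are hypothetical, and a tier returning `None` is an assumed convention for "escalate".

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Tier:
    name: str
    cost: int  # abstract cost units per call
    call: Callable[[str], Optional[str]]  # None means "escalate to the next tier"


def cascade(prompt: str, tiers: List[Tier], budget: int,
            cache: Dict[str, str]) -> Tuple[str, int]:
    """Return (answer, units spent), trying the cache, then the cheapest tier first."""
    if prompt in cache:
        return cache[prompt], 0
    spent = 0
    for tier in sorted(tiers, key=lambda t: t.cost):
        if spent + tier.cost > budget:
            break  # the remaining budget cannot cover this tier
        spent += tier.cost
        answer = tier.call(prompt)
        if answer is not None:
            cache[prompt] = answer
            return answer, spent
    raise RuntimeError(f"no answer within budget ({spent} of {budget} units spent)")


tiers = [
    Tier("large", 10, lambda p: "42"),
    Tier("small", 1, lambda p: None),  # declines, forcing escalation
]
cache: Dict[str, str] = {}
print(cascade("meaning of life?", tiers, budget=20, cache=cache))  # small, then large
print(cascade("meaning of life?", tiers, budget=20, cache=cache))  # served from cache
```

The first call pays for both tiers; the repeat costs nothing, which is the effect an L0 cache is meant to have.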
Observability and auth. OTLP traces, metrics, and logs without the heavy opentelemetry-sdk dependency; a Prometheus surface; usage and cost metering with anomaly detection. Auth is a pluggable authenticator with push step-up for sensitive operations (Tier-0 or over-budget), including an AuthGate adapter.
Multi-tenancy. Per-tenant workspaces, knowledge scoping, and budgets, plus a white-label
embeddable app (build_app) so one image serves many branded tenants from config and keys.
Command line
operative run --model <url> --workspace ./repo "task" # run a task to completion
operative serve --deployment deployment.yaml # the governed router HTTP server
operative serve --deployment deployment.yaml --speech # the OpenAI-compatible voice server
operative serve --config agent.yaml --api-keys-file k # the white-label tool-use agent
operative capabilities --deployment customer.yaml # audit what a config permits
operative capabilities --deployment new.yaml --diff old.yaml # capability/backend delta
Architecture
+---------------------------------------------------------------+
|  request                                                      |
|    -> authenticate     credential to a tenant principal       |
|    -> route            policy picks a backend chain           |
|    -> constitution     Tier-0 / cost / latency, fail fast     |
|    -> execute          run the chain with fallback            |
|    -> audit + metrics  every decision logged and measured     |
+---------------------------------------------------------------+
  backends and knowledge stores are pluggable adapters
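The lifecycle in the diagram can be traced with a toy pipeline. Everything here is a hypothetical sketch: `handle`, `AUDIT`, the key table, and the `permitted` predicate are illustrative stand-ins rather than operative's module layout, which ARCHITECTURE.md describes.

```python
import uuid
from typing import Callable, Dict, List, Tuple

AUDIT: Dict[str, List[str]] = {}  # request id -> ordered decision trail

Backend = Tuple[str, Callable[[str], str]]  # (name, callable that answers a prompt)


def handle(api_key: str, prompt: str, keys: Dict[str, str],
           chain: List[Backend], permitted: Callable[[str], bool]) -> Tuple[str, str]:
    request_id = uuid.uuid4().hex
    trail = AUDIT.setdefault(request_id, [])

    tenant = keys.get(api_key)  # authenticate: credential -> tenant principal
    if tenant is None:
        trail.append("authenticate: rejected")
        raise PermissionError("unknown credential")
    trail.append(f"authenticate: {tenant}")

    trail.append("route: " + " -> ".join(name for name, _ in chain))

    allowed = [(n, c) for n, c in chain if permitted(n)]  # constitution stand-in
    trail.append(f"constitution: {len(allowed)} of {len(chain)} backends permitted")
    if not allowed:
        raise PermissionError("constitution: no permitted backend")

    for name, call in allowed:  # execute with fallback
        try:
            answer = call(prompt)
        except Exception as exc:
            trail.append(f"execute: {name} failed ({exc})")
            continue
        trail.append(f"execute: {name} ok")
        return request_id, answer
    trail.append("execute: chain exhausted")
    raise RuntimeError("all backends failed")


def flaky(prompt: str) -> str:
    raise TimeoutError("timed out")


rid, answer = handle("k1", "hi", {"k1": "acme"},
                     [("primary", flaky), ("backup", str.upper)], lambda n: True)
print(answer)
print("\n".join(AUDIT[rid]))
```

Looking up `AUDIT[rid]` plays the role that `GET /v1/explain/{id}` plays in the real router: the trail records each stage, including the failed first backend.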
Documentation
Full index in docs/README.md. The guides:
- docs/SDK.md - the agent SDK: tools, MCP, structured output, hooks, agency
- docs/ROUTER.md - the multi-backend router, backend matrix, and constitution
- docs/CAPABILITIES.md - capability grants, trust tiers, the governance cube
- docs/NEO.md - learned routing and the offline head trainer
- docs/SPEECH.md - STT/TTS/VAD backends, the realtime SpeechSession, voice serve
- docs/BROWSER.md - the browser backend family and governed browser tools
- docs/OPERATIVEoptimizations.md - composing governance, optimization, and learning
- docs/ERROR-HANDLING.md - error semantics
- ARCHITECTURE.md - the request lifecycle and module map
- examples/ - runnable examples and deployment configs
License
MIT. See LICENSE.
File details
Details for the file operative_ai-0.1.0.tar.gz.
File metadata
- Download URL: operative_ai-0.1.0.tar.gz
- Upload date:
- Size: 184.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 98ba6fa38e27fe23c79b4002b2cb2b689c1e61a4ea7d35dc279e88f94f14d228 |
| MD5 | cf4954908c71a8396b8c0d575ac4cc49 |
| BLAKE2b-256 | 455e99a3636cf4ad49228b99811a240ea2fc589a7687073852b7b376a17ca316 |
File details
Details for the file operative_ai-0.1.0-py3-none-any.whl.
File metadata
- Download URL: operative_ai-0.1.0-py3-none-any.whl
- Upload date:
- Size: 234.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c04a7acfffe9f93a48d1ccd546867a0fe497f27fe9ef76b98e6670b5f8b7ae47 |
| MD5 | 23ed2eb28bb7e178bd646aa29290f7dc |
| BLAKE2b-256 | 8da4074ae5f30c548eb49b1ba8b9b4143a489a21fdc170bd5cfe8a10be5b24ce |