A safety-gated, streaming, tool-using agent SDK with hooks, subagents, structured output, multi-turn sessions, and MCP.

operative

An AI-agnostic agent SDK and governed multi-backend router. By AXE Technologies.

One binary serves any stack, with Tier-0 data isolation, a full audit trail, and cost governance built in.

The core imports no vendor SDK and hardcodes no model. Backends are pluggable adapters behind protocols; a deployment YAML wires the stack. The same binary runs self-hosted in your own perimeter or as a managed service.

Why operative

  • AI-agnostic. The core is a clean protocol layer. Vendors (Anthropic, OpenAI, Ollama, MLX, vLLM, Qdrant, Postgres, Pinecone, Weaviate) live only inside adapters, imported lazily. Add Groq or Mistral with one adapter and a YAML line.
  • Governed by construction. Tier-0 data never leaves local hardware: a constitution enforces it before any backend runs, and fails fast and safe if a request would violate it. Every decision is replayable in an audit trail.
  • A spectrum of agency. Dial autonomy with one named level, from a single deterministic call up to a fully autonomous agent. The catastrophic safety floor and the constitution hold identically at every level.
  • Cost governance. Per-tenant budgets, cheap-first cascade execution, an L0 response cache, and cost-anomaly detection.

Install

pip install operative-ai       # the import name stays `import operative`
pip install -e .               # from source

Quickstart: the SDK

from operative import Agent, tool, HTTPAgentModel

@tool
def add(a: int, b: int) -> int:
    "Add two integers."
    return a + b

agent = Agent(HTTPAgentModel("http://localhost:8095", "qwen"), tools=[add], workspace="./repo")
result = agent.run_sync("add 2 and 3, then write it to sum.txt")
print(result.answer)

Tool use with a safety floor, streaming, cost metering, self-correction, lifecycle hooks, subagent delegation, structured output, and MCP (stdio and HTTP) are wired in by default. See docs/SDK.md.

The router: serve any stack from a YAML

operative serve --deployment deployment.yaml

A deployment names its backends, routing policy, and governance. The same binary serves an internal MLX-plus-cloud stack, a customer on OpenAI plus Pinecone, or a laptop running Ollama. The HTTP surface routes and governs every request:

POST /v1/route            route and execute one inference (tier-aware, governed)
GET  /v1/explain/{id}     replay the full decision trail for a request
GET  /v1/stats            per-backend latency and cost
GET  /healthz /info       liveness and backends

The same deployment can be driven from Python:

from operative import RoutedAgent, load_deployment, make_resolver

agent = RoutedAgent.from_deployment(load_deployment("deployment.yaml"), make_resolver())
# `await` must run inside an async function; `request` is supplied by the caller
result = await agent.run(request)        # routed, governed, audited

See docs/ROUTER.md and docs/OPERATIVEoptimizations.md.

What's in the box

Every innovation, with a short read and a link to its deep-dive.

AI-agnostic agent SDK. Tool use with a safety floor, streaming, self-correction, lifecycle hooks, structured output, multi-turn conversations with token-budget compaction, subagent delegation, MCP over both stdio and HTTP, and a human-in-the-loop approval channel. One import (from operative import Agent, tool) is the whole adoption surface. See docs/SDK.md.

Governed multi-backend router. operative serve --deployment deployment.yaml turns a single YAML into a routing HTTP server. The same binary serves an internal MLX-plus-cloud stack, a customer on OpenAI plus Pinecone, or a laptop on Ollama. Inference backends: Anthropic, OpenAI (and any OpenAI-compatible server: vLLM, LM Studio, Together, Groq), Ollama, MLX. Knowledge backends: Qdrant, Postgres, Pinecone, Weaviate, plus two embedders. All are lazy-imported adapters behind a registry. See docs/ROUTER.md.

Governance by construction. A constitution checks every request before any backend runs and fails fast: Tier-0 data isolation (sensitive data never leaves owned hardware), plus cost and latency ceilings. Below it sits the IronGate catastrophic floor, which denies the truly dangerous (rm -rf /) in every mode. Permission modes (auto / ask / acceptEdits / plan) and a spectrum of named agency levels dial autonomy from a single deterministic call to a fully autonomous loop, with the floor and constitution holding identically at each level. Every decision is replayable via explain(). See docs/ROUTER.md and docs/SDK.md.
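
To make the fail-fast idea concrete, here is a minimal sketch of a pre-execution check, assuming a numeric data tier and per-request cost and latency ceilings. Every name in it (`ConstitutionViolation`, `Backend`, `Request`, `check_constitution`, `eligible`) is hypothetical and is not operative's API; see docs/ROUTER.md for the real rules.

```python
from dataclasses import dataclass
from typing import List


class ConstitutionViolation(Exception):
    """Raised before any backend is called (hypothetical name)."""


@dataclass(frozen=True)
class Backend:
    name: str
    local: bool            # True if it runs on hardware the operator owns
    est_cost_usd: float    # estimated cost of serving this request
    est_latency_ms: float  # estimated latency of serving this request


@dataclass(frozen=True)
class Request:
    prompt: str
    data_tier: int         # assumption: 0 is the most sensitive tier
    max_cost_usd: float
    max_latency_ms: float


def check_constitution(request: Request, backend: Backend) -> None:
    """Raise if sending `request` to `backend` would break a rule."""
    if request.data_tier == 0 and not backend.local:
        raise ConstitutionViolation(f"Tier-0 data may not leave owned hardware: {backend.name}")
    if backend.est_cost_usd > request.max_cost_usd:
        raise ConstitutionViolation(f"cost ceiling exceeded: {backend.name}")
    if backend.est_latency_ms > request.max_latency_ms:
        raise ConstitutionViolation(f"latency ceiling exceeded: {backend.name}")


def eligible(request: Request, backends: List[Backend]) -> List[Backend]:
    """Return the backends that pass every rule, or fail before any call is made."""
    allowed = []
    for backend in backends:
        try:
            check_constitution(request, backend)
        except ConstitutionViolation:
            continue
        allowed.append(backend)
    if not allowed:
        raise ConstitutionViolation("no backend satisfies the constitution")
    return allowed
```

The point of the shape is that the check runs on estimates and metadata only, so a violating request is rejected without any backend having seen the data.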

Capabilities: the third governance axis. Where agency governs how many human checkpoints and data tier governs classification, capabilities govern which surfaces a deployment may touch at all (shell, network egress, filesystem write, browser eval). A CapabilityManifest and named TrustTier presets (locked / standard / trusted) are checked by a constitutional rule before execution, giving a TrustTier x AgencyLevel x DataTier cube that makes postures like "fully autonomous but think-only" first-class. Unconfigured deployments grant everything (backward compatible); naming a tier fails closed. See docs/CAPABILITIES.md.
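
The "grant everything when unconfigured, fail closed once a tier is named" behaviour can be sketched as follows. The capability names come from the list above; the contents of each preset and the handling of an unknown tier name are assumptions made for illustration, and `Capability`, `PRESETS`, `granted`, and `is_allowed` are hypothetical names, not the real `CapabilityManifest` API.

```python
from enum import Enum
from typing import FrozenSet, Optional


class Capability(Enum):
    SHELL = "shell"
    NETWORK_EGRESS = "network_egress"
    FILESYSTEM_WRITE = "filesystem_write"
    BROWSER_EVAL = "browser_eval"


ALL_CAPABILITIES: FrozenSet[Capability] = frozenset(Capability)

# Assumed preset contents, for illustration only.
PRESETS = {
    "locked": frozenset(),  # think-only: no surface may be touched
    "standard": frozenset({Capability.FILESYSTEM_WRITE, Capability.NETWORK_EGRESS}),
    "trusted": ALL_CAPABILITIES,
}


def granted(tier: Optional[str]) -> FrozenSet[Capability]:
    """Resolve a tier name to the set of capabilities it grants."""
    if tier is None:
        return ALL_CAPABILITIES  # unconfigured: grant everything (backward compatible)
    # A named tier fails closed: anything not listed is denied, and in this
    # sketch an unrecognised name grants nothing at all.
    return PRESETS.get(tier, frozenset())


def is_allowed(tier: Optional[str], capability: Capability) -> bool:
    """Check one capability against the resolved tier before execution."""
    return capability in granted(tier)
```

A "fully autonomous but think-only" posture is then just the highest agency level combined with `granted("locked")`, since the two axes are resolved independently.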

Neo: learned routing. An optional learned head ranks the backends the policy already deemed eligible (the Fugu insight that model selection beats model quality), at the cost of one cheap scoring pass per request. The head proposes; the constitution still disposes, so Neo can never widen the allowed set. A reference offline trainer mines operative's own audit log into ranked examples and fits the head. See docs/NEO.md.
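
Why a ranking head cannot widen the allowed set is easiest to see in code: it only reorders a list it is handed. The sketch below uses a fixed score table as a stand-in for a trained head; `rank_eligible`, `head_score`, the backend names, and the scores are all made up for illustration and are not Neo's implementation.

```python
from typing import Callable, List, Sequence


def rank_eligible(eligible: Sequence[str], score: Callable[[str], float]) -> List[str]:
    """Reorder `eligible` by descending score; never add or remove a backend."""
    # sorted() is stable, so backends with equal scores keep the policy's order.
    return sorted(eligible, key=score, reverse=True)


# Stand-in for a trained head: a fixed score table with made-up numbers.
HEAD_SCORES = {"ollama-local": 0.4, "openai": 0.9, "anthropic": 0.7}


def head_score(name: str) -> float:
    return HEAD_SCORES.get(name, 0.0)


# "openai" scores highest, but the policy already ruled it out upstream,
# so it never reaches the head and cannot reappear in the result.
policy_allowed = ["ollama-local", "anthropic"]
print(rank_eligible(policy_allowed, head_score))  # ['anthropic', 'ollama-local']
```

Because the output is a permutation of the input, the worst a bad head can do is pick a poor order among backends that were already permitted.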

Speech. A backend family for voice: pluggable STT, TTS, and VAD adapters (local faster-whisper / Kokoro / Silero are Tier-0-safe; OpenAI-compatible engines plug in too), a realtime duplex SpeechSession with barge-in interruption, and an OpenAI-compatible serve surface (operative serve --speech, exposing /v1/audio/transcriptions and /v1/audio/speech). See docs/SPEECH.md.
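
Since the speech surface is described as OpenAI-compatible, a client can probably be written against the OpenAI request shape for `/v1/audio/speech` (a JSON body with `model`, `input`, and `voice`). The sketch below only builds the request with the standard library; the host, port, model name, and voice name are placeholders, and whether a given deployment also requires an API key header is not stated here.

```python
import json
import urllib.request


def build_speech_request(base_url: str, text: str, model: str, voice: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style text-to-speech request."""
    body = json.dumps({"model": model, "input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/speech",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values: adjust to wherever `operative serve --speech` is listening.
req = build_speech_request("http://localhost:8000", "Hello from operative.", "kokoro", "default")

# To synthesize audio, send the request to a running server:
# with urllib.request.urlopen(req) as resp:
#     audio_bytes = resp.read()
```

Any existing OpenAI-compatible client library should work the same way once its base URL points at the server.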

Browser automation. A backend family for the web: a curated ~16-primitive BrowserBackend protocol (navigate, scrape, search, screenshot, act, script) with surfboard and Safari adapters, surfaced to an agent as governed browser_tools so navigation and JavaScript execution pass through the capability and safety gates. See docs/BROWSER.md.

Optimization and learning. Opt-in and inert when unset: an L0 response cache, per-tenant budgets with cheapest-first cascade, adaptive failure memory, and a bandit explore/exploit reorderer, all running inside the governance envelope. See docs/OPERATIVEoptimizations.md.
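
A rough sketch of how a response cache, a budget, and a cheapest-first cascade fit together, assuming each backend has a known per-call cost and signals "cannot answer" by returning `None`. The `Cascade` class, `BudgetExceeded`, and the escalation rule are illustrative assumptions, not operative's implementation; one instance would be held per tenant.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (name, cost in USD per call, callable returning an answer or None)
Backend = Tuple[str, float, Callable[[str], Optional[str]]]


class BudgetExceeded(Exception):
    """Raised when the next backend would overspend the budget."""


class Cascade:
    def __init__(self, backends: List[Backend], budget_usd: float) -> None:
        self.backends = sorted(backends, key=lambda b: b[1])  # cheapest first
        self.remaining = budget_usd
        self.cache: Dict[str, str] = {}

    def run(self, prompt: str) -> str:
        if prompt in self.cache:
            return self.cache[prompt]  # cache hit: no backend call, no spend
        for name, cost, call in self.backends:
            if cost > self.remaining:
                # Later backends cost at least as much, so stop here.
                raise BudgetExceeded(f"{name} costs {cost}, only {self.remaining} left")
            self.remaining -= cost
            answer = call(prompt)
            if answer is not None:
                self.cache[prompt] = answer
                return answer
            # Otherwise escalate to the next, more expensive backend.
        raise RuntimeError("no backend produced an answer")
```

For example, with a 0.25 USD backend that declines and a 0.50 USD backend that answers, the first call spends 0.75 USD and a repeat of the same prompt spends nothing.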

Observability and auth. OTLP traces, metrics, and logs without the heavy opentelemetry-sdk dependency; a Prometheus surface; usage and cost metering with anomaly detection. Auth is a pluggable authenticator with push step-up for sensitive operations (Tier-0 or over-budget), including an AuthGate adapter.

Multi-tenancy. Per-tenant workspaces, knowledge scoping, and budgets, plus a white-label embeddable app (build_app) so one image serves many branded tenants from config and keys.

Command line

operative run --model <url> --workspace ./repo "task"   # run a task to completion
operative serve --deployment deployment.yaml            # the governed router HTTP server
operative serve --deployment deployment.yaml --speech   # the OpenAI-compatible voice server
operative serve --config agent.yaml --api-keys-file k   # the white-label tool-use agent
operative capabilities --deployment customer.yaml       # audit what a config permits
operative capabilities --deployment new.yaml --diff old.yaml   # capability/backend delta

Architecture

+---------------------------------------------------------------+
|  request                                                      |
|    -> authenticate        credential to a tenant principal    |
|    -> route               policy picks a backend chain        |
|    -> constitution        Tier-0 / cost / latency, fail fast  |
|    -> execute             run the chain with fallback         |
|    -> audit + metrics     every decision logged and measured  |
+---------------------------------------------------------------+
   backends and knowledge stores are pluggable adapters
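
The five stages in the diagram can be read as one function. The sketch below follows the same order (authenticate, route, constitution, execute with fallback, audit), with deliberately trivial stand-ins for each stage; `Backend`, `handle`, and the audit record fields are hypothetical and do not mirror operative's internals.

```python
from typing import Callable, Dict, List, NamedTuple


class Backend(NamedTuple):
    name: str
    local: bool                 # runs on hardware the operator owns
    call: Callable[[str], str]  # raises on failure


def handle(api_key: str, prompt: str, data_tier: int,
           tenants: Dict[str, str], backends: List[Backend],
           audit: List[dict]) -> str:
    # 1. authenticate: map the credential to a tenant principal
    tenant = tenants.get(api_key)
    if tenant is None:
        audit.append({"stage": "authenticate", "ok": False})
        raise PermissionError("unknown credential")

    # 2. route: the "policy" here is simply the configured order
    chain = list(backends)

    # 3. constitution: Tier-0 data may only reach local backends; fail fast
    if data_tier == 0:
        chain = [b for b in chain if b.local]
    if not chain:
        audit.append({"tenant": tenant, "stage": "constitution", "ok": False})
        raise PermissionError("no backend satisfies the constitution")

    # 4. execute: walk the chain, falling back on failure
    # 5. audit: record every attempt, successful or not
    for backend in chain:
        try:
            answer = backend.call(prompt)
        except Exception as exc:
            audit.append({"tenant": tenant, "stage": "execute",
                          "backend": backend.name, "ok": False, "error": repr(exc)})
            continue
        audit.append({"tenant": tenant, "stage": "execute",
                      "backend": backend.name, "ok": True})
        return answer
    raise RuntimeError("every backend in the chain failed")
```

Keeping the audit list as an explicit argument makes the trail easy to inspect afterwards, which is the same idea behind replaying a decision by request id.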

Documentation

Full index in docs/README.md.

License

MIT. See LICENSE.
