Async-aware runtime primitives for multi-step LLM agent loops.
techrevati-runtime
Production-grade runtime primitives for multi-step LLM agent loops — sync and async, with retry classification, circuit-breaker protection, per-model cost tracking, opt-in budget enforcement, role-based tool gating, content guardrails, agent-to-agent handoffs, declarative policy, and OpenTelemetry GenAI semantic conventions out of the box. Beta — 0.1.x; minor breaking changes possible until 0.2.0.
pip install techrevati-runtime
# Or with OpenTelemetry:
pip install 'techrevati-runtime[otel]'
Quick start
from techrevati.runtime import (
    Orchestrator, UsageSnapshot, ModelPricing, register_pricing,
)

register_pricing("model-a", ModelPricing(input_per_million=3.0, output_per_million=15.0))

orch = Orchestrator(
    role="writer", phase="draft", project_id=1,
    budget_usd=10.0, enforce_budget=True, max_iterations=25,
)

# call_model and prompt are placeholders for your own model client and input.
with orch.session() as session:
    result, usage = session.run_turn(
        lambda: call_model(prompt),
        model="model-a",
        usage=UsageSnapshot(input_tokens=5000, output_tokens=1200),
        timeout=30.0,
    )
    print(session.summary())
The session walks the worker through INITIALIZING → RUNNING → COMPLETED, classifies any exception that bubbles up into a typed failure scenario, attempts recovery once, enforces the budget, gates tool calls behind permissions and guardrails, and emits structured events to any sink you configure — without you wiring any of it by hand.
Async sibling: replace `with` with `async with`, `session()` with `asession()`, and `run_turn` with `arun_turn`. The parameters are the same. An `asyncio.CancelledError` cleanly transitions the worker to CANCELLED.
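To make the lifecycle concrete, here is a minimal stdlib-only sketch of a validated state machine that moves to CANCELLED when a task is cancelled. It is not the library's implementation: `WorkerState`, `SketchWorker`, `run_guarded`, and the transition table are hypothetical names and assumptions chosen for illustration; only the four state names come from the text above.

```python
import asyncio
from enum import Enum


class WorkerState(Enum):
    INITIALIZING = "initializing"
    RUNNING = "running"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


# Assumed transition table: terminal states have no outgoing edges.
ALLOWED = {
    WorkerState.INITIALIZING: {WorkerState.RUNNING, WorkerState.CANCELLED},
    WorkerState.RUNNING: {WorkerState.COMPLETED, WorkerState.CANCELLED},
    WorkerState.COMPLETED: set(),
    WorkerState.CANCELLED: set(),
}


class SketchWorker:
    """Hypothetical worker that rejects transitions not listed in ALLOWED."""

    def __init__(self) -> None:
        self.state = WorkerState.INITIALIZING

    def transition(self, target: WorkerState) -> None:
        if target not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {target.name}")
        self.state = target


async def run_guarded(worker: SketchWorker, awaitable):
    """Run an awaitable, recording CANCELLED if the task is cancelled."""
    worker.transition(WorkerState.RUNNING)
    try:
        result = await awaitable
    except asyncio.CancelledError:
        worker.transition(WorkerState.CANCELLED)
        raise  # always re-raise so cancellation propagates to the caller
    worker.transition(WorkerState.COMPLETED)
    return result
```

Re-raising `CancelledError` after recording the state keeps the sketch compatible with normal asyncio cancellation semantics.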
For an end-to-end example exercising every primitive (permissions + breaker + budget + guardrail + handoff + policy + OTel), see examples/tiny_agent.py and the end-to-end tutorial.
Design goals
- Zero runtime dependencies. Imports are stdlib only. OpenTelemetry is an optional `[otel]` extra.
- Type-safe. `py.typed` marker shipped; clean under `mypy --strict`.
- Composable. Every primitive (`CircuitBreaker`, `AsyncCircuitBreaker`, `RetryContext`, `QualityGate`, `PolicyEngine`, `UsageTracker`, `PermissionEnforcer`, `Guardrail`, `Handoff`) is usable standalone. The `Orchestrator` is just the wiring.
- Thread-safe and async-safe. `threading.Lock` in sync paths, `asyncio.Lock` in async paths. State is per-instance.
- Configuration-free at the edges. Pricing data is empty by default; phase thresholds are not hardcoded; permission roles are caller-defined. The runtime stays opinion-free about what your numbers mean.
Primitives
| Module | Provides |
|---|---|
| `orchestrator` | `Orchestrator`, `OrchestrationSession`, `AsyncOrchestrationSession`, `AgentSession` |
| `circuit_breaker` | `CircuitBreaker`, `AsyncCircuitBreaker` (CLOSED/OPEN/HALF_OPEN with configurable probe permits) |
| `retry_policy` | `classify_exception`, `attempt_recovery` (sync + async), `backoff_delay` with full/equal/decorrelated jitter |
| `usage_tracking` | `UsageTracker`, `register_pricing`, `load_pricing_from_file`, `BudgetExceededError`, `has_pricing` |
| `agent_lifecycle` | `AgentRegistry`, `AgentWorker` with validated state machine including CANCELLED |
| `agent_events` | Typed lifecycle events + OpenTelemetry attribute bridge |
| `permissions` | Role × tool authorization, deny-first |
| `guardrails` | Pre-call + post-call content gating around `run_tool` / `arun_tool` |
| `handoffs` | `Handoff` value + `session.handoff_to()` agent-to-agent delegation |
| `policy_engine` | Composable conditions and rule evaluator with auto-elapsed time |
| `sinks` | `EventSink` / `UsageSink` Protocols + ring-buffered defaults |
| `otel` (optional) | `OpenTelemetrySink` + `OpenTelemetryUsageSink` emitting GenAI semconv spans/metrics |
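The three jitter strategies named in the `retry_policy` row are commonly defined as below. This is a stdlib-only sketch of those well-known formulas, not the signature or behavior of the library's `backoff_delay`; the function names, the `base` and `cap` defaults, and the `rng` parameter are assumptions made for illustration.

```python
import random

_RNG = random.Random()


def full_jitter(attempt: int, base: float = 0.5, cap: float = 30.0,
                rng: random.Random = _RNG) -> float:
    """Sleep a uniform amount between 0 and the capped exponential ceiling."""
    ceiling = min(cap, base * 2 ** attempt)
    return rng.uniform(0.0, ceiling)


def equal_jitter(attempt: int, base: float = 0.5, cap: float = 30.0,
                 rng: random.Random = _RNG) -> float:
    """Keep half of the ceiling fixed and randomize the other half."""
    ceiling = min(cap, base * 2 ** attempt)
    return ceiling / 2 + rng.uniform(0.0, ceiling / 2)


def decorrelated_jitter(previous: float, base: float = 0.5, cap: float = 30.0,
                        rng: random.Random = _RNG) -> float:
    """Derive the next delay from the previous delay rather than the attempt count."""
    return min(cap, rng.uniform(base, previous * 3))
```

Full jitter spreads retries most widely, equal jitter guarantees a minimum wait, and decorrelated jitter needs only the previous delay as state (start it at `base`).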
Showcase
Async with handoff and guardrails
import asyncio

from techrevati.runtime import (
    AllowAllGuardrail, AsyncCircuitBreaker, Orchestrator, UsageSnapshot,
)

cb = AsyncCircuitBreaker("model-api", failure_threshold=3, recovery_timeout_seconds=30.0)

# acall_model and prompt are placeholders for your own async model client and input.
async def main():
    orch = Orchestrator(
        role="writer", phase="draft",
        async_circuit_breaker=cb,
        guardrails=[AllowAllGuardrail()],
        max_iterations=10,
    )
    async with orch.asession() as session:
        text, _ = await session.arun_turn(
            lambda: acall_model(prompt),
            model="model-a",
            usage=UsageSnapshot(input_tokens=5000, output_tokens=1200),
            timeout=30.0,
        )
        handoff = session.handoff_to("editor", reason="review", context={"draft": text})
        print(f"handed off to {handoff.target_role}")

asyncio.run(main())
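The guardrails passed to the orchestrator above gate content before and after a tool call. The idea can be sketched with plain callables, as below. This does not reproduce the library's `Guardrail` interface, which this README does not show; `run_tool_guarded`, `SketchGuardrailBlocked`, `allow_all`, and `no_secrets` are hypothetical names used only for illustration.

```python
from typing import Callable

# A check returns True when the text is acceptable (assumed convention).
Check = Callable[[str], bool]


class SketchGuardrailBlocked(Exception):
    """Raised when a check rejects the tool input or output."""


def run_tool_guarded(tool: Callable[[str], str], payload: str,
                     pre_checks: list[Check], post_checks: list[Check]) -> str:
    # Pre-call gating: reject the input before the tool ever runs.
    for check in pre_checks:
        if not check(payload):
            raise SketchGuardrailBlocked("input rejected before the call")
    result = tool(payload)
    # Post-call gating: reject the output before it reaches the caller.
    for check in post_checks:
        if not check(result):
            raise SketchGuardrailBlocked("output rejected after the call")
    return result


def allow_all(_text: str) -> bool:
    """Accept everything, analogous in spirit to a pass-through guardrail."""
    return True


def no_secrets(text: str) -> bool:
    """Toy content rule: reject anything that mentions API_KEY."""
    return "API_KEY" not in text
```

For example, `run_tool_guarded(str.upper, "hi", [allow_all], [no_secrets])` runs the tool, while an input containing `API_KEY` with `no_secrets` as a pre-check is rejected before the tool is called.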
OpenTelemetry observability
from techrevati.runtime import Orchestrator
from techrevati.runtime.otel import OpenTelemetrySink, OpenTelemetryUsageSink
orch = Orchestrator(
    role="writer", phase="draft",
    event_sink=OpenTelemetrySink(agent_id="writer-001"),
    usage_sink=OpenTelemetryUsageSink(),
)
# Every AgentEvent now appears as an OTel span with gen_ai.operation.name,
# gen_ai.agent.id, gen_ai.usage.{input,output}_tokens. Drop-in compatible
# with any APM ingest that already understands GenAI semconv.
See docs/api/otel.md for the full attribute list and span name mapping.
Standalone primitives
Pick just what you need. Each primitive is usable on its own without Orchestrator.
from techrevati.runtime import (
    CircuitBreaker, CircuitOpenError,
    UsageTracker, UsageSnapshot,
    classify_exception, attempt_recovery, RecoveryContext,
)

# fetch, url, and my_error are placeholders for your own callable, argument, and exception.
cb = CircuitBreaker("downstream", failure_threshold=5, recovery_timeout_seconds=60.0)
result = cb.call(fetch, url, timeout=10)  # raises CircuitOpenError if tripped

ctx = RecoveryContext()
scenario = classify_exception(my_error)
recovery = attempt_recovery(scenario, ctx)  # returns RecoveryResult with steps to retry

tracker = UsageTracker()
tracker.record_turn("model-a", UsageSnapshot(input_tokens=5000, output_tokens=1200))
print(tracker.format_cost())
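For readers unfamiliar with the CLOSED/OPEN/HALF_OPEN cycle behind `cb.call`, the following stdlib-only sketch shows one common way such a breaker can work. It is an illustration under stated assumptions, not the library's `CircuitBreaker`: the class names, the injectable `clock`, and the single-probe HALF_OPEN behavior are hypothetical, and the sketch omits locking.

```python
import time
from typing import Callable, Optional


class SketchCircuitOpenError(RuntimeError):
    """Raised when a call is refused because the circuit is open."""


class SketchBreaker:
    def __init__(self, failure_threshold: int = 5,
                 recovery_timeout_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout_seconds = recovery_timeout_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "CLOSED"
        if self._clock() - self._opened_at >= self.recovery_timeout_seconds:
            return "HALF_OPEN"  # timeout elapsed: allow a probe call
        return "OPEN"

    def call(self, fn, *args, **kwargs):
        if self.state == "OPEN":
            raise SketchCircuitOpenError("circuit is open")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed probe, or too many failures while closed, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Injecting the clock keeps the recovery timeout testable without sleeping; a production breaker would also guard its state with a lock.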
Why not LangGraph / OpenAI Agents SDK?
techrevati-runtime is intentionally smaller and narrower than either:
- LangGraph is a workflow engine with durable execution, checkpointer protocols, and a graph model. Use it when your agent flow is a graph that needs to survive restarts and you're OK with the LangChain ecosystem footprint.
- OpenAI Agents SDK is a cohesive runtime tied to OpenAI's models, with default tracing through their dashboards. Use it when you're committed to OpenAI and want the smoothest path.
- techrevati-runtime is a zero-dep primitive set. Sync + async. Vendor-neutral. Emits OpenTelemetry GenAI semantic conventions so the same APM dashboards that consume OpenAI Agents SDK telemetry will pick us up too. Bring your own model client and your own persistence; the runtime stays opinion-free.
The runtime is not a durable workflow engine. Sessions are in-memory; a pluggable checkpointer is on the 0.2.0 roadmap. If you need restart-resumable workflows today, pair this with Temporal, dbos, or LangGraph's checkpointer.
Limitations (be honest with yourself before adopting)
- Pricing must be registered. The bundled `pricing.json` is intentionally empty. Without `register_pricing()` or `load_pricing_from_file()`, every cost calculation returns $0.00 (you will see a one-time warning per model).
- Budget enforcement is opt-in. Set `Orchestrator(enforce_budget=True)` to raise `BudgetExceededError`; the default merely records an event and continues.
- Permissions are advisory. `OrchestrationSession.run_tool()` enforces; `run_turn()` does not gate model calls. There is no sandbox; pair with OS-level isolation if needed.
- No durable execution. Sessions are in-memory and ephemeral. Pair with Temporal/dbos for restart-resumable workflows.
- Default sinks are in-memory ring buffers. Long-running sessions need a durable `EventSink` and `UsageSink` (e.g. `OpenTelemetrySink`, or your own).
- `CircuitBreaker` state is per-process. Each replica counts its own failures. Add a shared coordinator if you need fleet-wide breaker state.
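The first two limitations are easy to see in arithmetic. The sketch below uses only the stdlib and assumes prices are expressed per million tokens, as the `input_per_million` and `output_per_million` parameter names suggest. `SketchPricing`, `turn_cost`, `check_budget`, and `SketchBudgetExceeded` are hypothetical names, and treating "over budget" as strictly greater than the limit is an assumption rather than documented library behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchPricing:
    input_per_million: float
    output_per_million: float


class SketchBudgetExceeded(RuntimeError):
    """Raised only when enforcement is switched on."""


def turn_cost(pricing: Optional[SketchPricing],
              input_tokens: int, output_tokens: int) -> float:
    # With no registered pricing, the cost silently collapses to zero.
    if pricing is None:
        return 0.0
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


def check_budget(spent: float, budget: float, enforce: bool) -> bool:
    """Return True when over budget; raise instead if enforcement is enabled."""
    over = spent > budget
    if over and enforce:
        raise SketchBudgetExceeded(f"spent {spent:.4f} of {budget:.4f}")
    return over
```

With the quick-start figures (5000 input tokens at 3.0 and 1200 output tokens at 15.0 per million), `turn_cost` gives 0.015 + 0.018 = 0.033. The same call with `pricing=None` returns 0.0, which is why an unregistered model can never trip a budget.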
Status
techrevati-runtime is at version 0.1.0 (beta). This release ships async-first execution, the four standard primitives (Sessions, Tools, Handoffs, Guardrails), a max_iterations cap, and OpenTelemetry GenAI semantic conventions. Minor breaking changes are possible between 0.1.x and 0.2.0; they will be documented in docs/migrating-from-0.0.x.md and gated by deprecation warnings. The package pins Python 3.11+ for from __future__ import annotations ergonomics and modern asyncio.
See CHANGELOG.md for the per-sprint release notes and docs/tutorials/end-to-end.md for a guided tour of every primitive.
Issues and PRs welcome — see CONTRIBUTING.md and SECURITY.md.
License
MIT — copyright © 2026 TechRevati doo. See LICENSE.