Budgeted LLM task orchestration with hard limits and long-context RLM.
# enzu

Budgeted LLM tasks that scale beyond context.
enzu is a Python-first toolkit for AI engineers and builders who need reliable, budgeted LLM runs. It enforces hard limits (tokens, time, cost), switches to RLM when context is large, and works across OpenAI-compatible providers. Use it from Python, the CLI, or the HTTP API.
## 30-second quickstart

```shell
uv add enzu
export OPENAI_API_KEY=sk-...
python -c "from enzu import ask; print(ask('Say hello in one sentence.'))"
```
## What enzu is (and isn't)

- enzu is a budget + reliability layer for LLM work: caps that actually stop execution when you hit token/time/cost limits.
- enzu isn't a giant agent framework. It's meant to stay small, composable, and easy to drop into existing code.
## Why enzu

- Hard budgets by default: token, time, and cost caps that actually stop work
- RLM mode for long context: recursive subcalls when prompts are too large
- Provider-agnostic: works with OpenAI-compatible APIs and bring-your-own model
- Production-ready surfaces: Python SDK, CLI worker, and HTTP API
## At a glance
| enzu is | enzu is not |
|---|---|
| A budget-first execution engine | A prompt library or template system |
| Hard stops when limits are hit | Best-effort throttling |
| RLM for tasks that exceed context | A vector DB or RAG framework |
| Provider-agnostic (OpenAI-compatible) | Tied to one vendor |
| Lightweight (~2k LOC core) | A full agent framework |
## Quickstart (Python)

```shell
uv add enzu
# or: pip install enzu
export OPENAI_API_KEY=sk-...
```

```python
from enzu import Enzu, ask

print(ask("What is 2+2?"))

client = Enzu()  # Auto-detects provider from env
answer = client.run(
    "Summarize the key points",
    data="...long document...",
    tokens=400,
)
print(answer)
```

Tip: Set `OPENAI_API_KEY`, `OPENROUTER_API_KEY`, or another provider key. You can always pass `model=` and `provider=` explicitly.
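The auto-detection the tip describes can be sketched as a tiny resolver that picks a provider from whichever key is present. The key names come from this README; the precedence order and the `detect_provider` helper are illustrative assumptions, not enzu's actual logic:

```python
import os

# Illustrative sketch (not enzu's real detection code): choose a provider
# based on which API key is set in the environment. Precedence is assumed.
def detect_provider(env=None):
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("OPENROUTER_API_KEY"):
        return "openrouter"
    return None  # fall back to explicit provider=/model= arguments
```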
## Budget hard-stop (killer feature)

enzu enforces budgets as physics, not policy. When you set a limit, the system will stop:

```python
from enzu import Enzu

client = Enzu()

# Ask for 500 words but cap at 50 tokens - enzu stops deterministically
result = client.run(
    "Write a 500-word essay on climate change.",
    data="...long research document...",
    tokens=50,  # Hard cap: output stops here
)
# Result: "[PARTIAL - budget exhausted]..." - work stopped, no runaway costs
```

See `examples/budget_hardstop_demo.py` for the full demo.
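The hard-stop behavior can be illustrated with a toy stand-in: consume a token stream, cut at the cap, and report a terminal state. Whitespace-split "tokens" and the state strings here are simplifications for illustration; enzu's real accounting uses provider token counts.

```python
# Toy illustration of a deterministic hard stop (not enzu internals):
# whitespace-split words stand in for real tokenizer counts.
def hard_stop(token_stream, cap):
    out, used = [], 0
    for tok in token_stream:
        if used >= cap:
            return " ".join(out), "BUDGET_EXCEEDED"  # partial output, stopped
        out.append(tok)
        used += 1
    return " ".join(out), "SUCCESS"

text, outcome = hard_stop(iter("one two three four five".split()), cap=3)
```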
## Typed outcomes (predictable handling)

Every run returns a typed `Outcome` for deterministic error handling:

```python
from enzu import Enzu, Outcome

client = Enzu()
result = client.run("Analyze this", data=doc, tokens=100, return_report=True)

if result.outcome == Outcome.SUCCESS:
    print(result.answer)
elif result.outcome == Outcome.BUDGET_EXCEEDED:
    print(f"Partial result: {result.answer}" if result.partial else "Budget hit")
elif result.outcome == Outcome.TIMEOUT:
    handle_timeout()

# Also: PROVIDER_ERROR, TOOL_ERROR, VERIFICATION_FAILED, CANCELLED, INVALID_REQUEST
```

See `examples/typed_outcomes_demo.py` for the full demo.
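Because the outcomes form a closed set, handling can also be table-driven instead of an `if`/`elif` chain. A stand-alone sketch using plain strings and dicts in place of enzu's typed result object; the retry policies in the handlers are illustrative, not enzu recommendations:

```python
# Dispatch table over the outcome names listed above. Result dicts stand in
# for enzu's typed result object; the handler policies are illustrative.
HANDLERS = {
    "SUCCESS": lambda r: r["answer"],
    "BUDGET_EXCEEDED": lambda r: r.get("answer") or "<no partial output>",
    "TIMEOUT": lambda r: "<retry with a larger seconds budget>",
    "PROVIDER_ERROR": lambda r: "<retry with backoff>",
}

def handle(result):
    return HANDLERS.get(result["outcome"], lambda r: "<unhandled>")(result)
```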
## RLM mode (reasoning over long context)

When your input exceeds context limits, enzu automatically switches to RLM (Reasoning Language Model) mode: recursive subcalls that break the problem into manageable pieces.

```python
from enzu import Enzu

client = Enzu()

# Pass a large document - enzu auto-detects and uses RLM
answer = client.run(
    "Who is credited with the first algorithm?",
    data=open("large_research_paper.txt").read(),  # 100k+ tokens
    tokens=500,
)
```

RLM mode provides progress callbacks, step-by-step reasoning, and budget enforcement across all subcalls.
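The general shape of recursive subcalls can be sketched as a reduction over chunks: split, answer per chunk, then reduce the partial answers until a single call fits. This illustrates the idea only, not enzu's actual RLM implementation; `ask` stands in for a model call.

```python
# Recursive reduction over long input (illustrative, not enzu's code):
# split the text, answer each chunk, then recurse on the joined partials.
def rlm_reduce(text, ask, chunk_chars=4000):
    if len(text) <= chunk_chars:
        return ask(text)  # fits in one subcall
    chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
    partials = " ".join(ask(chunk) for chunk in chunks)
    return rlm_reduce(partials, ask, chunk_chars)
```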
## Use cases

1. Cost-controlled batch processing

   ```python
   # Process 1000 documents with a $10 budget cap
   client = Enzu(cost=10.0)
   for doc in documents:
       result = client.run("Extract key entities", data=doc)
   ```

2. Research assistant with guardrails

   ```python
   # Research task with time and token limits
   answer = client.run(
       "Research recent AI safety papers and summarize",
       seconds=60,   # Max 1 minute
       tokens=1000,  # Max 1000 output tokens
   )
   ```

3. Long document analysis

   ```python
   # Analyze a document too large for the context window
   summary = client.run(
       "Summarize the main arguments and conclusions",
       data=open("100_page_report.pdf.txt").read(),
       tokens=500,
   )
   ```
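The semantics behind the `cost=10.0` cap in use case 1 can be shown with a small stand-alone budget object that refuses work once the cap is crossed. The class name and per-document cost are made up for illustration; this is the hard-stop idea, not enzu's accounting code.

```python
# Stand-alone sketch of a hard cost cap: charging past the cap raises,
# so a batch loop stops deterministically instead of overspending.
class CostBudget:
    def __init__(self, cap_usd):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, amount_usd):
        if self.spent_usd + amount_usd > self.cap_usd:
            raise RuntimeError("budget exhausted")
        self.spent_usd += amount_usd

budget = CostBudget(cap_usd=5.0)
processed = 0
for _ in range(1000):
    try:
        budget.charge(1.0)  # hypothetical cost per document
    except RuntimeError:
        break  # hard stop: remaining documents are not processed
    processed += 1
```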
## Job mode (async delegation)

For long-running tasks, use job mode to submit and poll:

```python
import time

from enzu import Enzu, JobStatus

client = Enzu()

# Submit a job (returns immediately)
job = client.submit("Analyze this large dataset", data=data, cost=5.0)
print(f"Job ID: {job.job_id}")

# Poll for completion
while job.status in (JobStatus.PENDING, JobStatus.RUNNING):
    time.sleep(1)
    job = client.status(job.job_id)

# Get the result
if job.status == JobStatus.COMPLETED:
    print(job.answer)

# Or cancel if needed
# client.cancel(job.job_id)
```

See `examples/job_delegation_demo.py` for the full demo.
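The poll loop above can be wrapped with a client-side deadline so a stuck job doesn't poll forever. This helper is a generic sketch: `fetch_status` stands in for `client.status`, and plain dicts with status strings stand in for the job object.

```python
import time

# Generic poll-until-terminal helper with a wall-clock deadline.
def poll_until_done(fetch_status, job_id, interval=1.0, timeout=300.0,
                    active_states=("PENDING", "RUNNING")):
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        if job["status"] not in active_states:
            return job  # terminal: e.g. COMPLETED, FAILED, or CANCELLED
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        time.sleep(interval)
```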
## HTTP API (server)

```shell
uv pip install "enzu[server]"
uvicorn enzu.server:app --host 0.0.0.0 --port 8000

curl http://localhost:8000/v1/run \
  -H "Content-Type: application/json" \
  -d '{"task":"Say hello","model":"gpt-4o","provider":"openai"}'
```

If you set `ENZU_API_KEY`, pass `X-API-Key` on every request.
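From Python, the same endpoint can be called with the standard library. The payload fields mirror the curl example and the `X-API-Key` note above; the helper name and defaults are illustrative, and no other request schema is assumed.

```python
import json
import urllib.request

# Build a POST request for the /v1/run endpoint shown above (illustrative
# helper; only task/model/provider and X-API-Key come from this README).
def build_run_request(task, model="gpt-4o", provider="openai",
                      api_key=None, base_url="http://localhost:8000"):
    payload = {"task": task, "model": model, "provider": provider}
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["X-API-Key"] = api_key
    return urllib.request.Request(
        f"{base_url}/v1/run",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )

req = build_run_request("Say hello", api_key="secret")
# urllib.request.urlopen(req) would send it; omitted here.
```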
## CLI worker

```shell
cat <<'JSON' | enzu
{
  "provider": "openai",
  "task": {
    "task_id": "hello-1",
    "input_text": "Say hello in one sentence.",
    "model": "gpt-4o"
  }
}
JSON
```
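When driving the worker from a script, the stdin envelope can be built with the standard library. The field names mirror the example above, which is the only schema assumed here; piping the result into the `enzu` process is left out.

```python
import json

# Build the JSON envelope the CLI worker reads on stdin (fields taken from
# the example above; no other schema is assumed).
def cli_envelope(task_id, input_text, model="gpt-4o", provider="openai"):
    return json.dumps({
        "provider": provider,
        "task": {"task_id": task_id, "input_text": input_text, "model": model},
    }, indent=2)

envelope = cli_envelope("hello-1", "Say hello in one sentence.")
```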
## Docs

- `docs/README.md` - Start here
- `docs/QUICKREF.md` - Providers, env vars, model formats
- `docs/DEPLOYMENT_QUICKSTART.md` - CLI + integration patterns
- `docs/SERVER.md` - HTTP API
- `docs/PYTHON_API_REFERENCE.md` - Full Python API
- `docs/COOKBOOK.md` - Patterns and recipes
- `docs/BUDGETS_AS_PHYSICS.md` - Essay: budgets, containment, typed outcomes for delegated agents
- `docs/RUN_METRICS.md` - p95 cost/run and terminal state distributions
## Examples

- `examples/budget_hardstop_demo.py` - Killer demo: budget cap stops work deterministically
- `examples/typed_outcomes_demo.py` - Typed outcomes for predictable error handling
- `examples/job_delegation_demo.py` - Async job mode with polling
- `examples/python_quickstart.py` - Minimal Python usage
- `examples/python_budget_guardrails.py` - Hard budget limits
- `examples/budget_cap_total_tokens.py` - Tiny total-token cap (hard stop)
- `examples/budget_cap_seconds.py` - Tiny time cap (hard stop)
- `examples/budget_cap_cost_openrouter.py` - Tiny cost cap (OpenRouter only)
- `examples/run_metrics_demo.py` - p50/p95 cost/run and terminal state distributions
- `examples/retry_tracking_demo.py` - Retry tracking and budget attribution
- `examples/rlm_with_context.py` - RLM run over longer context
- `examples/chat_with_budget.py` - TaskSpec + budgets + success criteria
- `examples/http_quickstart.sh` - HTTP API run
- `examples/research_with_exa.py` - Research tool + synthesis
- `examples/file_chatbot.py` - File-based chat loop
- `examples/file_researcher.py` - Session-based research loop
## Contributing

See `CONTRIBUTING.md`.

## Requirements

Python 3.9+
## File details

Details for the file `enzu-0.3.0.tar.gz`.

- Download URL: enzu-0.3.0.tar.gz
- Size: 392.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.16

| Algorithm | Hash digest |
|---|---|
| SHA256 | `cb6bb2a6ae3e135551a86086e6112a08535208de0290f3fc2dc45995fe1cf0ef` |
| MD5 | `cc6bf92d281cfe59e04cd2a6f7de63b4` |
| BLAKE2b-256 | `e71bb7fe3e9312dd2d99367a22463bdba7cd0abb52fe23ed590a7b65fcb45236` |
## File details

Details for the file `enzu-0.3.0-py3-none-any.whl`.

- Download URL: enzu-0.3.0-py3-none-any.whl
- Size: 291.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.16

| Algorithm | Hash digest |
|---|---|
| SHA256 | `94e1ebcf1d63dd39b413b7971e5af5e27c6a80ff6b5fe9f2d280e47d7e1f3595` |
| MD5 | `1a699497de65cf27ad76916fee65f1e3` |
| BLAKE2b-256 | `ce8e5835cf0c1de4be078bbe3a16583d32a14d376836c6d16c6ee1a021de01ca` |