Stratum
LLM calls that behave like the rest of your code. Stop babysitting them.
Stratum is a Python library where @infer (LLM calls) and @compute (normal functions) compose identically. Typed contracts flow between steps. The runtime handles retry, budget enforcement, and observability — so you don't have to wire them up yourself.
```python
@contract
class SentimentResult(BaseModel):
    label: Literal["positive", "negative", "neutral"]
    confidence: float
    reasoning: str

@infer(
    intent="Classify the emotional tone of customer feedback",
    ensure=lambda r: r.confidence > 0.7,
    budget=Budget(ms=500, usd=0.001),
    retries=3,
)
def classify_sentiment(text: str) -> SentimentResult: ...
```
If the LLM returns low confidence, it gets told exactly what failed and retries with that context — not a blank replay. If it hits the budget, it stops. Every call produces a structured trace record you can query.
Two Tracks
Track 1 — Python library (stratum): @infer, @contract, @flow decorators for building production LLM systems. Requires Python 3.11+, litellm, pydantic.
Track 2 — Claude Code MCP server (stratum-mcp): Stratum as an execution runtime for Claude Code. Claude writes .stratum.yaml specs, the MCP server enforces typed contracts and postconditions, Claude narrates progress in plain English. No sub-LLM calls — all execution stays within the Claude Code session.
Track 2: Claude Code + Stratum
```
pip install stratum-mcp
stratum-mcp setup
```
setup configures Claude Code in one command: writes .claude/mcp.json, appends the execution model block to CLAUDE.md, and installs seven skills to ~/.claude/skills/. Restart Claude Code and it's active.
Seven skills installed automatically:
| Skill | What it structures |
|---|---|
| `/stratum-review` | Three-pass code review: security → logic → performance → consolidate |
| `/stratum-feature` | Feature build: read existing patterns → design → implement → tests pass |
| `/stratum-debug` | Debug: read test → read code → check env → form hypotheses → confirm/rule out → fix |
| `/stratum-refactor` | File split: analyze → design modules → plan extraction order → extract one at a time |
| `/stratum-migrate` | Find bare LLM calls and rewrite as `@infer` + `@contract` with typed contracts and postconditions |
| `/stratum-test` | Write a test suite for existing untested code: golden flows, error-path harness, passing on first report |
| `/stratum-learn` | Review recent session transcripts: extract retry patterns, write project-specific conclusions to MEMORY.md |
Claude writes the .stratum.yaml spec internally — you never see it. You see plain English narration and the result. The MCP server enforces postconditions on every step; if a step's output fails a check, Claude fixes it and retries before reporting success.
Each skill reads project-specific patterns from MEMORY.md before writing its spec, and writes new patterns after stratum_audit — retry reasons, confirmed root causes, extraction order constraints. Run /stratum-learn periodically to extract conclusions from recent session transcripts and feed them back into future specs.
MCP tools exposed:
| Tool | What it does |
|---|---|
| `stratum_validate` | Validate a `.stratum.yaml` spec offline |
| `stratum_plan` | Validate + create execution state + return first step |
| `stratum_step_done` | Report a completed step; check postconditions; return next step or completion |
| `stratum_audit` | Return per-step trace (attempts, duration) for any flow |
Blog
Introducing Stratum: LLM Calls That Behave Like the Rest of Your Code
The design rationale — why @infer and @compute share a type, how structured retry works, and what contracts actually buy you.
Stratum as a Claude Code Execution Runtime
Claude Code is a capable agent improvising in a loop. This post is about giving it a formal execution model: typed plans, postcondition enforcement, auditable traces.
Building Software with Claude Code + Stratum: A Tutorial
Real session transcripts: understanding a codebase, reviewing code, adding features, debugging CI failures, refactoring large files. Claude narrates in plain English throughout.
Why
LLM calls in production share a few recurring failure modes:
- Retry is brute force. Most frameworks replay the full prompt on failure. Stratum injects only the specific postcondition that failed.
- Budget is an afterthought. Soft hints don't stop a runaway `refine` loop. Stratum enforces hard limits: `BudgetExceeded` is an exception, not a bill.
- Flows are opaque. When a multi-step pipeline fails, you want to know which step, with what input, after how many retries, at what cost. Stratum traces every call structurally.
- LLM steps and regular functions don't compose. Stratum makes `@infer` and `@compute` indistinguishable by type: swap one for the other and nothing downstream changes.
- Agent outputs can hijack downstream agents. `opaque[T]` fields are passed as structured data, never inlined into instruction text.
- Human-in-the-loop is a custom build every time. `await_human` genuinely suspends execution and returns a typed `HumanDecision[T]`.
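The hard-limit behavior can be sketched in plain Python. This is an illustrative stand-in, not Stratum's actual `Budget` class; the `charge` method and the error message format are invented here:

```python
import time

class BudgetExceeded(Exception):
    """Raised when a hard limit is hit -- an exception, not a bill."""

class Budget:
    # Illustrative stand-in: tracks elapsed wall time and accumulated spend.
    def __init__(self, ms: int, usd: float):
        self.ms, self.usd = ms, usd
        self.start = time.monotonic()
        self.spent = 0.0

    def charge(self, cost_usd: float) -> None:
        # Checked on every attempt, so a runaway loop stops instead of billing.
        self.spent += cost_usd
        elapsed_ms = (time.monotonic() - self.start) * 1000
        if elapsed_ms > self.ms or self.spent > self.usd:
            raise BudgetExceeded(f"elapsed={elapsed_ms:.0f}ms spent=${self.spent:.4f}")

budget = Budget(ms=5000, usd=0.001)
budget.charge(0.0004)          # first attempt: within budget
try:
    budget.charge(0.0008)      # would push spend past $0.001
except BudgetExceeded as e:
    print("stopped:", e)
```

The point of the sketch is the failure mode: exceeding the limit raises before the next attempt runs, rather than warning after the bill arrives.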
Core Concepts
@infer and @compute are the same type
```python
# Phase 1: LLM classifies tickets
@infer(intent="Route this support ticket", model="groq/llama-3.3-70b-versatile")
def route_ticket(text: str) -> TicketRoute: ...

# Phase 2: patterns emerged; swap to rules, zero other changes
@compute
async def route_ticket(text: str) -> TicketRoute:
    return TicketRoute(team=keyword_match(text), ...)
```
These have identical signatures. The @flow that calls route_ticket doesn't change. This means:
- Testing: Replace `@infer` calls with `@compute` stubs for deterministic tests.
- Migration: Start with LLM, replace with rules as patterns emerge. No downstream changes.
- Cost control: Swap expensive inference for fast lookup when coverage allows.
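The testing point can be illustrated with no Stratum dependency at all: a plain dataclass stands in for the `@contract` model, and a stub with the identical signature stands in for the `@infer` call. The flow, field names, and routing rules below are all hypothetical:

```python
import asyncio
from dataclasses import dataclass

@dataclass(frozen=True)
class TicketRoute:            # stand-in for the @contract model
    team: str
    urgent: bool

async def route_ticket_llm(text: str) -> TicketRoute:
    raise RuntimeError("would call an LLM; never run in unit tests")

async def route_ticket_stub(text: str) -> TicketRoute:
    # Deterministic @compute-style stub with the identical signature.
    return TicketRoute(team="billing" if "invoice" in text else "support",
                       urgent="URGENT" in text)

async def process(text: str, route_ticket=route_ticket_stub) -> str:
    # The flow is written against the signature, not the implementation,
    # so swapping the router changes nothing downstream.
    route = await route_ticket(text)
    return f"{route.team}:{'urgent' if route.urgent else 'normal'}"

print(asyncio.run(process("URGENT: invoice overdue")))  # billing:urgent
```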
Contracts are typed boundaries
```python
@contract
class SentimentResult(BaseModel):
    label: Literal["positive", "negative", "neutral"]
    confidence: Annotated[float, Field(ge=0.0, le=1.0)]
    reasoning: str
```
A @contract class compiles to JSON Schema injected into the structured outputs API. The LLM's output is validated against it before your code sees it. Every contract carries a content hash — a hash change means the compiled prompt changed and LLM behavior may have drifted.
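A content hash over a canonicalized schema is enough to detect this kind of drift. The sketch below uses SHA-256 over sorted-key JSON; Stratum's actual hashing scheme may differ:

```python
import hashlib, json

def contract_hash(schema: dict) -> str:
    # Canonicalize (sorted keys, fixed separators) so the hash is stable
    # across runs and changes only when the compiled schema actually changes.
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

v1 = {"properties": {"label": {"enum": ["positive", "negative", "neutral"]},
                     "confidence": {"type": "number", "minimum": 0.0, "maximum": 1.0}}}
v2 = json.loads(json.dumps(v1))                 # deep copy
v2["properties"]["confidence"]["minimum"] = 0.5  # tightened bound

assert contract_hash(v1) == contract_hash(json.loads(json.dumps(v1)))
assert contract_hash(v1) != contract_hash(v2)    # drift is detectable
```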
Retry is structured
On failure the LLM receives:
```
Previous attempt failed:
- ensure condition 1 failed
Fix these issues specifically.
```
Not a full prompt replay. The specific violation, nothing else.
Flows are deterministic
```python
@flow(budget=Budget(ms=5000, usd=0.01))
async def process_ticket(text: str) -> Resolution:
    sentiment = await classify_sentiment(text=text)
    response = await draft_response(text=text, sentiment=sentiment)
    return response if rule_check(response) else escalate(text)
```
@flow is normal Python control flow. You can read it, test it, and trace it. The orchestration shape is known before any LLM call runs.
Features
| Feature | Description |
|---|---|
| Structured retry | `ensure` postconditions drive retry with targeted failure feedback |
| Hard budget limits | Per-call and per-flow: `BudgetExceeded`, not a soft hint |
| `opaque[T]` | Field-level prompt injection protection |
| `await_human` | HITL as a first-class typed primitive with genuine suspension |
| `stratum.parallel` | Concurrent execution with `require: all/any/N/0` semantics |
| `quorum` | Run N times, require majority agreement |
| `stratum.debate` | Adversarial multi-agent synthesis with convergence detection |
| Full observability | Structured trace record on every call, OTLP export built-in |
| Two dependencies | litellm + pydantic. No OTel SDK. |
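As a sketch of the `quorum` semantics, assuming majority vote over n concurrent runs (not Stratum's implementation; `classify` below is a deterministic stand-in for an `@infer` step):

```python
import asyncio
from collections import Counter

async def quorum(step, n: int, *args, **kwargs):
    # Run the step n times concurrently and require a strict majority.
    results = await asyncio.gather(*(step(*args, **kwargs) for _ in range(n)))
    value, votes = Counter(results).most_common(1)[0]
    if votes <= n // 2:
        raise ValueError(f"no majority among {n} runs: {dict(Counter(results))}")
    return value

# Deterministic stand-in: yields one dissenting answer, then agreement.
answers = iter(["negative", "positive", "positive"])
async def classify(text: str) -> str:
    return next(answers)

print(asyncio.run(quorum(classify, 3, "great product")))  # positive
```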
Examples
Working examples in examples/:
| File | What it shows |
|---|---|
| `01_sentiment.py` | `@infer` + `@contract` + `@flow` + `@compute` end-to-end |
| `02_migrate.py` | Migrating `@infer` → `@compute` without changing callers |
| `03_parallel.py` | Three concurrent `@infer` calls with `parallel(require="all")` |
| `04_refine.py` | `@refine` convergence loop that iterates until quality passes |
| `05_debate.py` | `debate()`: two agents argue, a synthesizer resolves |
| `06_hitl.py` | `await_human`: human-in-the-loop approval gate |
Install
Track 1 — Python library:
```
pip install stratum-py
```
Requires Python 3.11+. Set GROQ_API_KEY, ANTHROPIC_API_KEY, or any key LiteLLM supports, then reference the corresponding provider's model in `model=`.
Track 2 — Claude Code MCP server:
```
pip install stratum-mcp
stratum-mcp setup
```
Requires Claude Code. setup configures everything — restart Claude Code to activate.
Specification
SPEC.md is the normative specification covering the full type system, decorator signatures, execution loop, prompt compiler, concurrency semantics, HITL protocol, budget rules, trace record schema, and error types.
Status
Track 1 (Python library): implemented and tested.
Track 2 (stratum-mcp): MCP controller server implemented — stratum_plan, stratum_step_done, stratum_audit, stratum_validate. One-command setup with seven bundled skills and a memory system for project-specific pattern capture. 66 tests passing.
Questions and feedback: GitHub Discussions