Durable checkpoint/resume runner for async state-machine loops built on loom-tailcalls.

loom-runner

Small durable checkpoint/resume runner for async state-machine loops built on top of loom-tailcalls and flow-xray. Full stack overview: kroq86.github.io/loom-stack

Official showcases: loom-run (dev chat + MCP) · loom-ops (ops runbooks + HITL). Both wire loom-runner together with flow-xray; pick by domain.

This is not a planner, memory system, graph DSL, hosted tracing product, or full agent SDK. It is the first slice of a Loom-based agent runtime: run a typed async transition loop, checkpoint each state transition, resume later, inspect history, and explain a run.

Loom stack

Overview: kroq86.github.io/loom-stack — packages, flow, audience, quick start.

The stack is a pyramid, not five equal frameworks. Tail-call optimization is the primitive, runner is the durable runtime, xray is the microscope, and the apps prove the stack in real workflows.

Layer          | Project                 | Job
Primitive      | loom-tailcalls          | Make async recursive/state-machine loops stack-safe
Runtime kernel | loom-runner (this repo) | Make those loops durable, resumable, idempotent
Microscope     | flow-xray               | Show what actually happened in one offline HTML trace
Proof app      | loom-run                | Chat agent reference implementation
Proof app      | loom-ops                | Ops/runbook agent reference implementation

@tailrec agent loop  →  loom-runner run/resume  →  --trace trace.html
     (shape)                  (durability)              (flow-xray)
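
To make the "stack-safe" job of the primitive layer concrete, here is a minimal trampoline sketch. It is a generic illustration of the technique, not the loom-tailcalls implementation; `TailCall`, `trampoline`, and `count_up` are hypothetical names introduced only for this example.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable


@dataclass(frozen=True)
class TailCall:
    """Hypothetical marker: 'call me again with these arguments'."""
    args: tuple


async def trampoline(fn: Callable[..., Awaitable[Any]], *args: Any) -> Any:
    # Drive fn in a flat loop. Each TailCall replaces the arguments,
    # so the Python call stack does not grow with the number of steps.
    result = await fn(*args)
    while isinstance(result, TailCall):
        result = await fn(*result.args)
    return result


async def count_up(current: int, target: int):
    # Written in a recursive shape, but it returns a TailCall
    # instead of awaiting itself.
    if current >= target:
        return current
    return TailCall((current + 1, target))


# 50,000 steps is far beyond the default recursion limit,
# yet the loop finishes because no frames pile up.
final = asyncio.run(trampoline(count_up, 0, 50_000))
print(final)
```

A naive `return await count_up(current + 1, target)` would raise `RecursionError` long before reaching the target; the flat driver loop is what removes that limit.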

This repo is the runtime kernel. loom-runner is the library package and CLI for durable execution. loom-run is a runnable chat showcase built on it; the names are close, but the layer is different.

Dependency direction: loom-runner depends on loom-tailcalls and optionally emits flow-xray traces. loom-run and loom-ops depend on loom-runner; the kernel never depends on the apps.

Who it is for

  • Authors of long-running async agent loops who need checkpoint/resume without building their own store
  • Users of loom-tailcalls who want persistence and CLI inspection on top of stack-safe transitions
  • Users of flow-xray who want --trace trace.html from the runner CLI
  • Anyone who needs an inspectable run (explain, history, attempts, tool-calls) rather than a black box

Not for you if the agent is a single LLM call, or you already have LangGraph/Temporal (or similar) with persistence you are happy with.

This is not reasoning, planning, memory, or a path to AGI — it is a durability + observability primitive for state-machine-shaped agent runtimes.

Runtime transitions are logged as logical steps with attempt history. A retry does not create a new transition: for the same run_id, step_index, and stable input hash, the runner reuses the committed outcome. Transient errors are retryable by default; validation, business, permission, and unknown errors fail the run unless the caller supplies a different policy.
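
The retry rule above can be sketched in plain Python. This is a toy model under stated assumptions, not the loom_agent internals: `StepLog`, `TransientError`, `input_hash`, and the `max_attempts` parameter are hypothetical, and the real runner persists this data in its checkpoint store rather than in dictionaries.

```python
import hashlib
import json
from dataclasses import dataclass, field


class TransientError(Exception):
    """Hypothetical error class that this sketch treats as retryable."""


def input_hash(state: dict) -> str:
    # Canonical JSON (sorted keys) so equal inputs always hash equally.
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class StepLog:
    committed: dict = field(default_factory=dict)  # key -> committed outcome
    attempts: dict = field(default_factory=dict)   # key -> attempt statuses

    def run_step(self, run_id, step_index, state, fn, max_attempts=3):
        key = (run_id, step_index, input_hash(state))
        if key in self.committed:
            # Same logical step: reuse the outcome, do not run fn again.
            return self.committed[key]
        history = self.attempts.setdefault(key, [])
        while True:
            try:
                outcome = fn(state)
            except TransientError:
                # Retryable: record the attempt, stay on the same step.
                history.append("transient_error")
                if len(history) >= max_attempts:
                    raise
                continue
            except Exception:
                # Everything else fails the step in this sketch.
                history.append("failed")
                raise
            history.append("ok")
            self.committed[key] = outcome
            return outcome


calls = {"n": 0}


def flaky_increment(state):
    calls["n"] += 1
    if calls["n"] == 1:
        raise TransientError("temporary outage")
    return {"current": state["current"] + 1}


log = StepLog()
first = log.run_step("demo", 0, {"current": 0}, flaky_increment)
second = log.run_step("demo", 0, {"current": 0}, flaky_increment)
print(first, second, calls["n"])
```

The second `run_step` call never reaches `flaky_increment`: the key already has a committed outcome, so the attempt history stays at two entries for one logical step.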

Tool side effects are only idempotent when invoked through RunContext.call_tool(...). Direct tool calls or external effects inside a transition are intentionally treated as unmanaged user code in this first runtime slice.
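
A small sketch of why the managed path matters when a transition is replayed. `ToyContext` and its `call_tool(name, fn, **kwargs)` signature are assumptions made for illustration; they are not the real RunContext API, whose exact signature is not shown here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToyContext:
    """Toy stand-in for a run context; not the loom_agent RunContext."""
    run_id: str
    step_index: int
    tool_log: dict = field(default_factory=dict)  # survives replays
    _seq: int = 0

    def begin_attempt(self) -> None:
        # Reset the call position so a replay maps to the same records.
        self._seq = 0

    def call_tool(self, name: str, fn: Callable[..., Any], **kwargs: Any) -> Any:
        key = (self.run_id, self.step_index, self._seq, name)
        self._seq += 1
        if key not in self.tool_log:
            # First execution: run the tool and record its result.
            self.tool_log[key] = fn(**kwargs)
        # Replay: return the recorded result without re-running the tool.
        return self.tool_log[key]


sent = []


def send_email(to: str) -> str:
    sent.append(to)  # the side effect we want to happen once
    return f"sent:{to}"


def transition(ctx: ToyContext) -> None:
    ctx.begin_attempt()
    ctx.call_tool("send_email", send_email, to="managed@example.com")
    send_email(to="direct@example.com")  # unmanaged: runs on every attempt


ctx = ToyContext(run_id="demo", step_index=0)
transition(ctx)
transition(ctx)  # simulate a retry of the same step
print(sent)
```

After the simulated retry, the managed address appears once in `sent` and the direct one appears twice, which is the difference between managed and unmanaged effects described above.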

Long runs can use bounded reads and explicit storage policies. By default the runner keeps every checkpoint and every inline tool payload for maximum inspectability. For larger runs, use CheckpointPolicy(mode="interval", every=N) to retain only periodic history checkpoints while preserving the current resumable state, and PayloadPolicy(max_inline_bytes=N) to replace large managed tool payloads with hash/size metadata.
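
To illustrate what the two policies trade away, here is a sketch of the retention logic they describe. The helper names `retained_history` and `store_payload` are hypothetical, and the exact rules inside CheckpointPolicy and PayloadPolicy (for example, which step indices count as "periodic") are assumptions.

```python
import hashlib
import json


def retained_history(step_indices: list[int], every: int) -> list[int]:
    # Keep periodic checkpoints plus the latest one, since the latest
    # state is what a resume needs regardless of the interval.
    if not step_indices:
        return []
    latest = step_indices[-1]
    return [i for i in step_indices if i % every == 0 or i == latest]


def store_payload(payload: dict, max_inline_bytes: int) -> dict:
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    if len(raw) <= max_inline_bytes:
        # Small payloads stay inline and fully inspectable.
        return {"inline": payload}
    # Large payloads are replaced by hash/size metadata only.
    return {"sha256": hashlib.sha256(raw).hexdigest(), "size": len(raw)}


kept = retained_history(list(range(1, 11)), every=4)
small = store_payload({"ok": True}, max_inline_bytes=64)
large = store_payload({"body": "x" * 1000}, max_inline_bytes=64)
print(kept)
print(small)
print(sorted(large))
```

With ten steps and `every=4`, the sketch keeps steps 4, 8, and 10. The large payload can no longer be read back from the store, only verified against its hash, which is the inspectability cost of a smaller database.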

The import package remains loom_agent; the distribution and CLI are named loom-runner because loom-agent is already occupied by an unrelated package on PyPI.

Install

python3.13 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Minimal Shape

from dataclasses import dataclass

from loom_agent import AgentRunner, Complete, Continue, RunContext, SQLiteCheckpointStore


@dataclass(frozen=True)
class State:
    current: int
    target: int


async def step(state: State, ctx: RunContext):
    if state.current >= state.target:
        return Complete({"current": state.current})
    return Continue(State(current=state.current + 1, target=state.target))


runner = AgentRunner(
    step=step,
    store=SQLiteCheckpointStore("runs.sqlite"),
    encode_state=lambda state: {"current": state.current, "target": state.target},
    decode_state=lambda data: State(**data),
    encode_result=lambda result: result,
    decode_result=lambda data: data,
)
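
For intuition about what a runner does with a step function like the one above, here is a self-contained toy driver. It is a sketch, not the AgentRunner implementation: `Next`, `Done`, `drive`, the in-memory `store` dict, and the "paused"/"completed" status names are stand-ins invented for this example, and the target of 8 is arbitrary.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Next:
    """Stand-in for a 'continue with this state' outcome."""
    state: dict


@dataclass(frozen=True)
class Done:
    """Stand-in for a 'run finished with this result' outcome."""
    result: dict


async def step(state: dict):
    if state["current"] >= state["target"]:
        return Done({"current": state["current"]})
    return Next({**state, "current": state["current"] + 1})


async def drive(store: dict, run_id: str, initial: dict, max_steps: int) -> str:
    # Start a new run, or pick up the last checkpointed state if one exists.
    record = store.setdefault(run_id, {"status": "running", "state": initial, "history": []})
    if record["status"] == "completed":
        return record["status"]
    record["status"] = "running"
    for _ in range(max_steps):
        outcome = await step(record["state"])
        if isinstance(outcome, Done):
            record["status"] = "completed"
            record["result"] = outcome.result
            return record["status"]
        # Checkpoint after every transition so a later call can resume here.
        record["state"] = outcome.state
        record["history"].append(outcome.state)
    record["status"] = "paused"  # step budget exhausted before completion
    return record["status"]


store = {}
status_first = asyncio.run(drive(store, "demo", {"current": 0, "target": 8}, max_steps=5))
current_after_pause = store["demo"]["state"]["current"]
status_second = asyncio.run(drive(store, "demo", {"current": 0, "target": 8}, max_steps=100))
print(status_first, current_after_pause, status_second, store["demo"]["result"])
```

The second call ignores its `initial` argument because a checkpoint already exists for "demo"; it continues from `current == 5` rather than starting over.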

Example

loom-runner run examples/counter_agent.py --run-id demo --db runs.sqlite --max-steps 5
loom-runner resume examples/counter_agent.py --run-id demo --db runs.sqlite --max-steps 100
loom-runner list examples/counter_agent.py --db runs.sqlite
loom-runner get examples/counter_agent.py --run-id demo --db runs.sqlite
loom-runner history examples/counter_agent.py --run-id demo --db runs.sqlite
loom-runner attempts examples/counter_agent.py --run-id demo --db runs.sqlite --limit 20
loom-runner tool-calls examples/counter_agent.py --run-id demo --db runs.sqlite --limit 20
loom-runner explain examples/counter_agent.py --run-id demo --db runs.sqlite

Add --trace trace.html to the run or resume command to emit a local flow-xray HTML trace. The runner traces step leaves and keeps the tail-recursive driver as the durable loop boundary.

Or directly:

python3.13 examples/counter_agent.py

Tests

python3.13 -m pytest

Runtime Benchmark

python3.13 scripts/bench_runtime.py --steps 100000
python3.13 scripts/bench_runtime.py --steps 100000 --checkpoint-every 100

The benchmark reports wall time, retained checkpoint rows, attempt rows, DB size, and peak Python memory. It is a local regression tool, not a hosted-scale performance claim.
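
For readers who want comparable numbers from their own code, wall time and peak Python memory can be captured with the standard library alone. This is a sketch of one way to measure them, assuming `tracemalloc` is an acceptable proxy for peak memory; it is not the contents of scripts/bench_runtime.py, and `measure` and `run_steps` are hypothetical helpers.

```python
import time
import tracemalloc


def measure(fn, *args):
    # tracemalloc tracks allocations made by Python code while it is active.
    tracemalloc.start()
    started = time.perf_counter()
    result = fn(*args)
    wall_seconds = time.perf_counter() - started
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, wall_seconds, peak_bytes


def run_steps(n: int) -> int:
    # Placeholder workload: build one row per step, like a checkpoint log.
    rows = []
    for i in range(n):
        rows.append({"step": i})
    return len(rows)


result, wall_seconds, peak_bytes = measure(run_steps, 10_000)
print(f"steps={result} wall={wall_seconds:.4f}s peak={peak_bytes / 1024:.1f} KiB")
```

Tracing allocations slows the workload down, so wall time measured this way is best compared only against other runs measured the same way.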
