Deterministic state-machine guards for AI-agent workflows: enforce which tools an agent may call, in which order, with loop detection, cost budgets and step caps.

statepilot

Deterministic state-machine guards for AI-agent workflows.

Define a state machine, then enforce — at runtime — which tools your agent may call, in which order, with loop detection, a cost budget, and a hard step cap. The agent only gets to do what the state machine allows. Anything else raises.

Zero runtime dependencies in the core. Fully typed. Python 3.10+.

pip install statepilot

The problem

A recurring theme in 2026 agent tooling is "wrap the non-deterministic LLM in deterministic code." A few data points that frame the gap:

Statewright (Rust + MCP) put deterministic state machines for agents on the map — it reached the Hacker News front page and lives at https://github.com/statewright/statewright. The framing clearly resonates.
llm-canary ships policy gates for agent traces — tool order, cost budgets, runaway-loop checks — but as a post-hoc test layer over recorded traces, not as runtime enforcement.
Orchestrators like LangGraph, CrewAI and the OpenAI Agents SDK are excellent at routing, but they don't hand you a small, hard rule that says "tool X is illegal in state Y, full stop."

The missing piece is a Python-native runtime guard: a thin layer you put in front of every tool call that enforces the allowed transitions and trips on loops and budget overruns. That is what statepilot is.

It does not orchestrate, plan, or call your LLM. It is the bouncer at the door.

Quickstart (Python builder)

from statepilot import StateMachine, Pilot

machine = (
    StateMachine.builder()
    .initial("research")
    .transition("research", "research", tool="search")      # looping allowed...
    .transition("research", "draft", tool="write_draft")
    .transition("draft", "review", tool="review")
    .transition("review", "draft", tool="revise")           # send it back
    .transition("review", "published", tool="publish")
    .terminal("published")
    .build()
)

pilot = Pilot(machine, budget=5.0, max_state_visits=4, max_steps=20)

pilot.step("search", cost=0.5)        # ok, still in "research"
pilot.step("write_draft", cost=1.0)   # -> "draft"
pilot.step("review")                  # -> "review"
pilot.step("publish")                 # -> "published" (terminal)

pilot.step("review")                  # raises TransitionError: terminal state

Every accepted step is recorded:

for record in pilot.history:
    print(record.index, record.source, "--", record.tool, "->", record.dest)

The `@guarded` decorator

Bind your actual tool functions to the pilot. The guard runs before the function body, so a violation means the body never executes.

from statepilot import StateMachine, Pilot, guarded, GuardViolation

machine = (
    StateMachine.builder()
    .initial("research")
    .transition("research", "research", tool="search")
    .transition("research", "draft", tool="write_draft")
    .terminal("draft")
    .build()
)
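pilot = Pilot(machine, budget=5.0)

# Placeholder backends so the example runs; swap in your real implementations.
def real_search(query: str) -> list[str]:
    return [f"result for {query}"]

def real_draft(notes: list[str]) -> str:
    return "\n".join(notes)
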
@guarded(pilot, cost=1.0)                 # tool name defaults to the function name
def search(query: str) -> list[str]:
    return real_search(query)

@guarded(pilot, tool="write_draft")       # or name it explicitly
def make_draft(notes: list[str]) -> str:
    return real_draft(notes)

search("agent guardrails")                # advances the machine, charges 1.0
make_draft(["..."])                       # -> "draft"

try:
    make_draft(["..."])                   # already terminal
except GuardViolation as exc:
    print("blocked:", exc)
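
The ordering guarantee described above (guard first, body second) can be illustrated with a standalone sketch. This is not statepilot's source: `TinyGuard`, `Blocked`, and `guarded_by` are hypothetical names invented here, and the only behaviour taken from the text is that the check runs before the wrapped function and that the tool name defaults to the function name.

```python
import functools
from typing import Any, Callable, Optional


class Blocked(Exception):
    """Hypothetical stand-in for a guard violation."""


class TinyGuard:
    """Hypothetical guard: accepts listed tool names, rejects the rest."""

    def __init__(self, allowed: set[str]) -> None:
        self.allowed = allowed
        self.calls: list[str] = []

    def check(self, tool: str) -> None:
        if tool not in self.allowed:
            raise Blocked(f"tool {tool!r} is not allowed")
        self.calls.append(tool)


def guarded_by(guard: TinyGuard, tool: Optional[str] = None) -> Callable:
    """Decorator factory: run guard.check() before the wrapped body."""

    def decorate(fn: Callable[..., Any]) -> Callable[..., Any]:
        name = tool or fn.__name__  # default to the function name

        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            guard.check(name)  # raises before the body can run
            return fn(*args, **kwargs)

        return wrapper

    return decorate


guard = TinyGuard(allowed={"search"})
side_effects: list[str] = []


@guarded_by(guard)
def search(query: str) -> str:
    side_effects.append(query)
    return f"results for {query}"


@guarded_by(guard)
def publish(text: str) -> str:
    side_effects.append(text)
    return "published"


search("agent guardrails")
try:
    publish("draft")  # rejected, so its body never appends anything
except Blocked:
    pass
```

Because the check sits before `fn(...)` inside the wrapper, a rejected call leaves `side_effects` untouched, which is the property that matters when the body is an expensive or irreversible tool call.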

YAML definition

Prefer config over code? Define the machine in YAML and load it. (YAML support is an optional extra: pip install "statepilot[yaml]". The quotes keep shells such as zsh from interpreting the brackets.)

# pipeline.yaml
initial: research
terminal:
  - published
transitions:
  - {from: research, to: research, tool: search}
  - {from: research, to: draft,    tool: write_draft}
  - {from: draft,    to: review,   tool: review}
  - {from: review,   to: published, tool: publish}

from statepilot import StateMachine, Pilot

machine = StateMachine.from_yaml_file("pipeline.yaml")  # from a file
# or: StateMachine.from_yaml(yaml_string)               # from an inline string
pilot = Pilot(machine, budget=5.0)

The states key may be omitted; it is inferred from initial, terminal, and every state named in transitions. StateMachine.from_dict(...) accepts the same shape if you already have a dict.
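
As an illustration of the inference rule just described, the set of states can be recovered from a definition that names none explicitly. `infer_states` is a hypothetical helper written for this sketch, not part of statepilot; the dict mirrors the shape of pipeline.yaml above.

```python
from typing import Any


def infer_states(definition: dict[str, Any]) -> list[str]:
    """Collect every state named by initial, terminal, and transitions.

    A dict is used as an ordered set, so states come back in order of
    first appearance and the result is deterministic.
    """
    seen: dict[str, None] = {}
    seen[definition["initial"]] = None
    for state in definition.get("terminal", []):
        seen[state] = None
    for edge in definition.get("transitions", []):
        seen[edge["from"]] = None
        seen[edge["to"]] = None
    return list(seen)


pipeline = {
    "initial": "research",
    "terminal": ["published"],
    "transitions": [
        {"from": "research", "to": "research", "tool": "search"},
        {"from": "research", "to": "draft", "tool": "write_draft"},
        {"from": "draft", "to": "review", "tool": "review"},
        {"from": "review", "to": "published", "tool": "publish"},
    ],
}

states = infer_states(pipeline)
print(states)  # ['research', 'published', 'draft', 'review']
```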

A realistic agent example

"Research, then draft, then review, then publish. Never publish before review. Allow at most 3 research loops. Stop if cost exceeds $5."

from statepilot import StateMachine, Pilot, guarded, GuardViolation

machine = (
    StateMachine.builder()
    .initial("research")
    .transition("research", "research", tool="search")
    .transition("research", "draft", tool="write_draft")
    .transition("draft", "review", tool="review")
    .transition("review", "draft", tool="revise")
    .transition("review", "published", tool="publish")
    .terminal("published")
    .build()
)

# initial visit counts as 1, so max_state_visits=4 allows 3 extra research loops
pilot = Pilot(machine, budget=5.0, max_state_visits=4, max_steps=25)

@guarded(pilot, cost=0.8)
def search(q: str) -> str: ...

@guarded(pilot, cost=1.2)
def write_draft(notes: str) -> str: ...

@guarded(pilot)
def review(draft: str) -> bool: ...

@guarded(pilot, cost=0.3)
def publish(draft: str) -> str: ...

The agent loop calls these as it sees fit. statepilot makes the illegal paths impossible:

calling publish() while still in research -> TransitionError
a 4th search() loop -> LoopLimitExceeded
cumulative cost over $5 -> BudgetExceeded
more than 25 steps -> StepLimitExceeded

A runnable version is in examples/research_pipeline.py.
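
The visit-count arithmetic behind "max_state_visits=4 allows 3 research loops" can be checked with a standalone sketch. It does not use statepilot; `run_searches` and `VisitCapExceeded` are hypothetical names, and the counting rule (the initial visit counts as 1, the cap is inclusive) is an assumption read off the comment in the example above.

```python
class VisitCapExceeded(Exception):
    """Hypothetical stand-in for a loop-limit violation."""


def run_searches(n_searches: int, max_state_visits: int) -> int:
    """Simulate n self-loops on the initial state; return the visit count.

    Assumption: entering the initial state is visit 1, and each accepted
    step back into the state adds one more visit.
    """
    visits = 1  # the initial visit
    for _ in range(n_searches):
        if visits + 1 > max_state_visits:
            raise VisitCapExceeded(
                f"visit {visits + 1} exceeds cap {max_state_visits}"
            )
        visits += 1
    return visits


three_loops = run_searches(3, max_state_visits=4)  # fits: visits reach 4

try:
    run_searches(4, max_state_visits=4)  # the fourth loop would be visit 5
    fourth_allowed = True
except VisitCapExceeded:
    fourth_allowed = False

print(three_loops, fourth_allowed)  # 4 False
```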

Why deterministic guards

LLMs are probabilistic. Most of the time the model follows the plan; occasionally it calls publish before review, gets stuck re-searching the same thing, or burns the budget. "Most of the time" is not a guarantee, and prompt-only constraints are suggestions, not enforcement.

A state machine turns those soft expectations into a hard contract that lives in code, runs on every tool call, and is trivial to unit-test. You get:

Safety — illegal tool sequences cannot happen; they raise instead.
Cost control — a real budget cap, enforced before the expensive call runs.
Loop protection — runaway repetition trips a clear, typed exception.
Auditability — pilot.history and pilot.to_trace() give you a complete, JSON-serialisable record of what the agent actually did.

It is intentionally small. The whole core is a StateMachine plus a Pilot, and the runtime cost is a dict lookup and a few numeric comparisons per step.
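
To make the per-step work described above concrete, here is a minimal sketch of an enforcer built from a transition table and two caps. `MiniPilot` and `Violation` are hypothetical and deliberately simplified; this is not statepilot's implementation, only an illustration of the validate-then-commit idea.

```python
from dataclasses import dataclass


class Violation(Exception):
    """Hypothetical stand-in for a runtime guard violation."""


@dataclass
class MiniPilot:
    """Hypothetical minimal enforcer: a transition table plus two caps."""

    table: dict[tuple[str, str], str]  # (state, tool) -> next state
    state: str
    budget: float
    max_steps: int
    spent: float = 0.0
    steps: int = 0

    def step(self, tool: str, cost: float = 0.0) -> str:
        # Validate everything first so a failure leaves the pilot untouched.
        dest = self.table.get((self.state, tool))  # the dict lookup
        if dest is None:
            raise Violation(f"{tool!r} not allowed in {self.state!r}")
        if self.steps + 1 > self.max_steps:
            raise Violation("step cap reached")
        if self.spent + cost > self.budget:
            raise Violation("budget exceeded")
        # Commit only after every check has passed.
        self.state = dest
        self.steps += 1
        self.spent += cost
        return dest


pilot = MiniPilot(
    table={
        ("research", "search"): "research",
        ("research", "write_draft"): "draft",
    },
    state="research",
    budget=1.0,
    max_steps=10,
)
pilot.step("search", cost=0.5)
try:
    pilot.step("search", cost=0.75)  # 0.5 + 0.75 > 1.0, so it is rejected
except Violation:
    pass
```

Checking before mutating is what lets a rejected step leave the state, step count, and spend exactly as they were.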

API reference

`StateMachine`

Immutable, validated machine definition. Carries no runtime state.

StateMachine.builder(initial=None) -> StateMachineBuilder — fluent builder.
StateMachine.from_dict(data) -> StateMachine — build from a mapping.
StateMachine.from_yaml(text) -> StateMachine — build from an inline YAML string (needs the yaml extra).
StateMachine.from_yaml_file(path) -> StateMachine — build from a YAML file (needs the yaml extra).
.to_dict() — round-trips with from_dict.
.allowed_tools(state) -> tuple[str, ...]
.resolve(state, tool) -> str | None — destination, or None if disallowed.
.is_terminal(state) -> bool

`StateMachineBuilder`

.initial(state), .state(*states), .transition(src, dest, *, tool), .terminal(*states), .build(). Every mutator returns self.

`Pilot`

Stateful runtime enforcer. Construct with the machine and optional limits:

Pilot(
    machine,
    budget=None,              # cumulative cost cap
    max_steps=None,           # total steps cap
    max_state_visits=None,    # per-state visit cap (initial state counts as 1)
    max_consecutive_tool=None,  # same tool back-to-back cap
)

.step(tool, *, cost=0.0) -> str — validate + apply; returns the new state. Raises on violation; state is unchanged on failure.
.can(tool, *, cost=0.0) -> bool — pure check, never mutates, never raises.
.allowed_tools() -> tuple[str, ...], .state, .done, .steps_taken, .cost_spent, .history.
.to_trace() -> dict — JSON-serialisable run trace.
.reset() — back to the initial state, clears cost/counters/history.
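
A JSON-serialisable run trace might look roughly like the sketch below. The record fields index, source, tool, and dest come from the history loop earlier in this README; the `StepRecord` class, the cost field, zero-based indexing, and the overall trace layout are assumptions made for illustration, not the library's actual schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StepRecord:
    """Hypothetical record of one accepted step."""

    index: int
    source: str
    tool: str
    dest: str
    cost: float = 0.0  # assumed field, not confirmed by the README


# A tuple of frozen records: readers cannot mutate the history.
history = (
    StepRecord(0, "research", "search", "research", cost=0.5),
    StepRecord(1, "research", "write_draft", "draft", cost=1.0),
)

trace = {
    "final_state": history[-1].dest,
    "cost_spent": sum(record.cost for record in history),
    "steps": [asdict(record) for record in history],
}

encoded = json.dumps(trace, indent=2)
print(encoded)
```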

`@guarded(pilot, *, tool=None, cost=0.0)`

Decorator that calls pilot.step(...) before the function body. tool defaults to the function name.

Exceptions

StatepilotError
├── StateMachineError          # invalid machine definition (definition-time)
└── GuardViolation             # runtime rule broken — catch this for "agent misbehaved"
    ├── TransitionError        # tool not allowed in the current state
    ├── LoopLimitExceeded      # state revisited / tool repeated too often
    ├── BudgetExceeded         # cumulative cost over budget
    └── StepLimitExceeded      # too many total steps
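
The practical consequence of this tree is that a single `except GuardViolation` catches every runtime rule break while leaving definition-time errors alone. The sketch below mirrors the hierarchy with locally defined classes so it runs without statepilot installed; `classify` is a hypothetical helper.

```python
# Local mirror of the tree above, for illustration only.
class StatepilotError(Exception): ...
class StateMachineError(StatepilotError): ...
class GuardViolation(StatepilotError): ...
class TransitionError(GuardViolation): ...
class LoopLimitExceeded(GuardViolation): ...
class BudgetExceeded(GuardViolation): ...
class StepLimitExceeded(GuardViolation): ...


def classify(exc: Exception) -> str:
    """Map an exception to a coarse handling decision."""
    if isinstance(exc, GuardViolation):
        return "agent misbehaved"
    if isinstance(exc, StateMachineError):
        return "bad machine definition"
    return "unrelated error"


outcomes = [
    classify(exc)
    for exc in (BudgetExceeded(), StateMachineError(), ValueError())
]
print(outcomes)
# ['agent misbehaved', 'bad machine definition', 'unrelated error']
```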

LangGraph adapter (experimental)

If you orchestrate with LangGraph, statepilot.adapters.guard_node wraps a node so the pilot guards it:

from statepilot import StateMachine, Pilot
from statepilot.adapters import guard_node
# from langgraph.graph import StateGraph

# machine: a StateMachine built as in the Quickstart
pilot = Pilot(machine, budget=5.0)
# graph = StateGraph(MyState)
# graph.add_node("research", guard_node(pilot, research_node, cost=1.0))
# graph.add_node("draft",    guard_node(pilot, draft_node))

It targets LangGraph's stable node contract (a callable that maps state to a partial state dict) and never imports langgraph itself, so it adds no import-time dependency and should keep working as long as that node contract holds. It is deliberately minimal and marked experimental: conditional edges, Send fan-out, and checkpoint/resume are out of scope. For full control, drive the Pilot inside your own node functions; that path is fully supported.

The adapter needs no extra dependency — it works with any callable. Install LangGraph in your own project if you use it.
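
The idea of guarding a plain callable can be sketched without LangGraph or statepilot installed. `guard_node_sketch` and `check` below are hypothetical; in particular the explicit `tool` argument is an assumption of this sketch, and the real guard_node signature may differ.

```python
from typing import Any, Callable

State = dict[str, Any]
Node = Callable[[State], State]


def guard_node_sketch(check: Callable[[str], None], node: Node, tool: str) -> Node:
    """Hypothetical wrapper: call check(tool) before running the node."""

    def wrapped(state: State) -> State:
        check(tool)         # may raise; the node body then never runs
        return node(state)  # the node returns a partial state update

    return wrapped


allowed = {"research"}
log: list[str] = []


def check(tool: str) -> None:
    """Stand-in for a pilot step: reject tools that are not allowed."""
    if tool not in allowed:
        raise PermissionError(f"{tool!r} blocked")
    log.append(tool)


def research_node(state: State) -> State:
    return {"notes": state["topic"] + ": notes"}


guarded_research = guard_node_sketch(check, research_node, tool="research")
update = guarded_research({"topic": "guards"})
print(update)  # {'notes': 'guards: notes'}
```

Because the wrapper only relies on "a callable that takes state and returns a partial update", nothing in it needs to import the orchestrator.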

Concurrency

A Pilot holds the mutable state of one agent run and is not thread-safe — use one pilot per run, don't share it across threads, and call pilot.reset() to reuse it. pilot.history is an immutable snapshot (a tuple), so reading or logging it can never desync the run's guards.
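
The one-pilot-per-run rule amounts to creating the guard inside each run rather than sharing it. The sketch below shows that pattern with a hypothetical `RunCounter` standing in for per-run guard state; it does not use statepilot.

```python
from concurrent.futures import ThreadPoolExecutor


class RunCounter:
    """Hypothetical stand-in for per-run mutable guard state."""

    def __init__(self, max_steps: int) -> None:
        self.max_steps = max_steps
        self.steps = 0

    def step(self) -> None:
        if self.steps >= self.max_steps:
            raise RuntimeError("step cap reached")
        self.steps += 1


def run_agent(n_calls: int) -> int:
    guard = RunCounter(max_steps=5)  # created inside the run, never shared
    for _ in range(n_calls):
        guard.step()
    return guard.steps


# Each worker thread builds its own guard, so the counts cannot interfere.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_agent, [1, 2, 3, 4]))

print(results)  # [1, 2, 3, 4]
```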

Status

Beta (0.1.0). The core API (StateMachine, Pilot, @guarded) is what we intend to keep stable. No benchmarks are claimed — the design goal is correctness and a tiny footprint, not throughput. Issues and PRs welcome.

License

Release history

0.1.0 (this version), released Jun 20, 2026

Download files

Source Distribution

statepilot-0.1.0.tar.gz (23.1 kB)

Uploaded Jun 20, 2026 Source

Built Distribution

statepilot-0.1.0-py3-none-any.whl (18.6 kB)

Uploaded Jun 20, 2026 Python 3

File details

Details for the file statepilot-0.1.0.tar.gz.

File metadata

Download URL: statepilot-0.1.0.tar.gz
Upload date: Jun 20, 2026
Size: 23.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for statepilot-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`fa643842cb0ba66890dd1cabd7477f1449cc27060ce4a9dc1fa46a0d7558a073`
MD5	`8ab2cc1608ac9799e9c1454dbcf547ce`
BLAKE2b-256	`bf21a62d991f28b76916e34c09bca6f975f2520270dad75a67ea2b33a7d57d28`

File details

Details for the file statepilot-0.1.0-py3-none-any.whl.

File metadata

Download URL: statepilot-0.1.0-py3-none-any.whl
Upload date: Jun 20, 2026
Size: 18.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for statepilot-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d411fcdedfb91ed40dba4b1baa5411a3f3ee52c687534270cc881251c3f6fa66`
MD5	`1672b104ebe7391b9519443a0efb944f`
BLAKE2b-256	`fbd84406712d4c583e21f4fed11f45b716e3b81dc6ed4d592be91945771480b8`
