TDD/OKR-driven agentic workflow framework with verifiable state deliverables
fsm-agent-flow
A TDD/OKR-driven workflow framework for LLM-powered applications. Each state declares an objective and key results that get validated before advancing — like running tests after writing code.
Why
Most LLM workflow frameworks either give you too little structure (raw prompt chains) or too much (rigid agent frameworks). fsm-agent-flow sits in the middle:
- States have acceptance criteria — key results are checked before moving on
- Failed states retry with feedback — the validator tells the LLM what went wrong
- The framework doesn't care what happens inside a state — call an LLM, run a script, bridge to CrewAI, or nest another workflow
- No global singletons — tools are scoped per state, contexts are explicit
- No heavy dependencies — zero required runtime deps, bring your own LLM client
Install
pip install fsm-agent-flow
# With LLM adapters
pip install fsm-agent-flow[openai]
pip install fsm-agent-flow[litellm]
pip install fsm-agent-flow[all]
Quick Start
from fsm_agent_flow import Workflow, StateSpec, KeyResult, ExecutionContext
from fsm_agent_flow.llm.openai import OpenAIAdapter

# Tools are just functions
def search(query: str) -> str:
    """Search the web."""
    return f"Results for: {query}"

# States declare what they must accomplish
research = StateSpec(
    name="research",
    objective="Gather information on the topic",
    key_results=[
        KeyResult("has_content", "At least 200 chars", check=lambda o: len(str(o)) > 200),
        KeyResult("has_sources", "Cites sources"),  # LLM-validated (no check function)
    ],
    execute=lambda ctx: ctx.llm.run_with_tools(
        system_prompt="Research the topic using the search tool.",
        user_message=ctx.input,
    ),
    tools=[search],
    max_retries=2,
    is_initial=True,
)

writing = StateSpec(
    name="writing",
    objective="Write a structured report",
    key_results=[
        KeyResult("has_sections", "Has clear sections", check=lambda o: str(o).count("#") >= 2),
    ],
    execute=lambda ctx: ctx.llm.run_with_tools(
        system_prompt="Write a report from this research.",
        user_message=str(ctx.input),
    ),
    is_final=True,
)

# One call to run the whole workflow
llm = OpenAIAdapter(model="gpt-4o")
wf = Workflow(
    objective="Research and report",
    states=[research, writing],
    transitions={"research": "writing"},
    llm=llm,
    validator_llm=llm,
)
result = wf.run("quantum computing")
Core Concepts
States with Objectives and Key Results
Every state has an objective (what it does) and key results (how we verify it succeeded):
StateSpec(
    name="analyze",
    objective="Analyze the dataset and identify trends",
    key_results=[
        # Programmatic check: runs as code
        KeyResult("has_trends", "Identified at least 3 trends",
                  check=lambda o: len(o.get("trends", [])) >= 3),
        # LLM-validated: no check function, validator LLM evaluates
        KeyResult("actionable", "Insights are actionable with recommendations"),
    ],
    execute=my_analyze_function,
    max_retries=3,
)
The TDD Validation Loop
When a state executes, the framework:
- Calls state.execute(ctx) to produce output
- Runs all key result checks (programmatic first, then LLM)
- If any fail: retries with ctx.feedback explaining what went wrong
- If all pass: records the output and advances to the next state
- If retries exhausted: raises MaxRetriesExceeded
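The loop above can be sketched in plain Python without the framework. This is a minimal illustration, not the library's implementation: `Check`, `RetriesExhausted`, and `run_state` are hypothetical names, and counting one initial attempt plus `max_retries` retries is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass
class Check:
    """Hypothetical stand-in for a key result with a programmatic check."""
    name: str
    check: Callable[[Any], bool]


class RetriesExhausted(Exception):
    """Hypothetical stand-in for the framework's MaxRetriesExceeded."""


def run_state(execute: Callable[[Optional[str]], Any],
              checks: List[Check],
              max_retries: int = 2) -> Any:
    feedback: Optional[str] = None
    # Assumption: one initial attempt plus up to max_retries retries.
    for _attempt in range(max_retries + 1):
        output = execute(feedback)
        failed = [c.name for c in checks if not c.check(output)]
        if not failed:
            return output  # every key result passed, so the state is done
        # Tell the next attempt which key results failed.
        feedback = "Failed key results: " + ", ".join(failed)
    raise RetriesExhausted(feedback)


# Usage: the first attempt is too short; the retry sees feedback and passes.
def draft(feedback: Optional[str]) -> str:
    return "short" if feedback is None else "a much longer answer"


result = run_state(draft, [Check("long_enough", lambda o: len(o) > 10)])
```

Here the feedback string plays the role of ctx.feedback: it is empty on the first attempt and names the failed key results on each retry.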
Conditional & Bidirectional Transitions
Transitions aren't limited to simple linear flows. States can branch, loop back, and route conditionally — like a real finite state machine:
# Static (linear): always goes to the same next state
transitions = {"research": "writing", "writing": None}

# Conditional (branching / bidirectional): route based on output
transitions = {
    "check_city": {"need_weather": "get_weather", "ready": "print_result"},
    "get_weather": {"wrong_city": "get_weather", "default": "check_city"},
    "print_result": None,
}

# Dynamic (callable): function decides next state
transitions = {
    # Default the score to 0 so a missing key routes to "reject" instead of raising TypeError
    "decide": lambda output: "approve" if output.get("score", 0) > 0.8 else "reject",
    "approve": None,
    "reject": None,
}
How conditional routing works: When a transition is a dict, the framework resolves the next state by checking the execute function's output:
- If output is a dict with a "_transition" key, its value selects the branch
- If output is a str matching a key in the transition dict, use it
- Otherwise, fall back to the "default" key
def check_city(ctx: ExecutionContext):
    weather = ctx.shared.get("weather")
    if weather and weather["city"] == ctx.shared.get("target_city"):
        return {"_transition": "ready", "report": weather}
    return {"_transition": "need_weather"}

def get_weather(ctx: ExecutionContext):
    city = ctx.shared.get("target_city")
    data = fetch_weather_api(city)
    if data["city"] != city:
        return {"_transition": "wrong_city"}  # Loop back to retry
    ctx.shared.set("weather", data)
    return {"_transition": "default"}  # Return to check_city
This enables bidirectional flows (state A calls state B, B returns to A), retry loops, and decision branching — all without leaving the FSM model.
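To make the three transition forms and the resolution rules concrete, here is a rough sketch of how a next state could be picked. `resolve_next` is a hypothetical helper, not the framework's API, and falling back to "default" when a "_transition" value matches no branch is an assumption.

```python
from typing import Any, Callable, Dict, Optional, Union

Transition = Union[None, str, Dict[str, str], Callable[[Any], Optional[str]]]


def resolve_next(transition: Transition, output: Any) -> Optional[str]:
    # Static: None ends the workflow, a string names the next state.
    if transition is None or isinstance(transition, str):
        return transition
    # Dynamic: a callable decides from the state's output.
    if callable(transition):
        return transition(output)
    # Conditional: a dict of branch name -> next state.
    if isinstance(output, dict) and output.get("_transition") in transition:
        return transition[output["_transition"]]
    if isinstance(output, str) and output in transition:
        return transition[output]
    # Assumption: anything unmatched uses "default" (None if it is absent).
    return transition.get("default")


routes = {"wrong_city": "get_weather", "default": "check_city"}
next_after_mismatch = resolve_next(routes, {"_transition": "wrong_city"})
next_after_success = resolve_next(routes, {"city": "Paris"})
```

With the get_weather routes from the example above, a "wrong_city" output loops back to get_weather, while an output without a matching branch returns to check_city.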
Tools Are Scoped Per State
No global registry. Each state declares its own tools:
research_state = StateSpec(
    name="research",
    tools=[search_web, fetch_paper],  # Only available in this state
    ...
)
writing_state = StateSpec(
    name="writing",
    tools=[save_draft],  # Different tools here
    ...
)
Tools are plain Python functions. The framework auto-generates JSON Schema signatures (OpenAI/Anthropic compatible) from type hints:
def search_web(query: str, max_results: int = 10) -> str:
    """Search the web for information."""
    ...
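As an illustration of what deriving a schema from type hints can look like, the sketch below builds an OpenAI-style tool definition with the standard library only. `tool_schema` and the type mapping are hypothetical; the framework's own generator may cover more types and produce a different layout.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(fn: Callable[..., Any]) -> Dict[str, Any]:
    hints = get_type_hints(fn)
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        # Parameters without a default value are required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def search_web(query: str, max_results: int = 10) -> str:
    """Search the web for information."""
    return f"Results for: {query}"


schema = tool_schema(search_web)
```

The docstring becomes the tool description, and only `query` ends up in the required list because `max_results` has a default.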
Shared Context
States share data through SharedContext (an explicit key-value store, not a flat blob):
def step_one(ctx: ExecutionContext):
    ctx.shared.set("findings", ["a", "b", "c"])
    return "done"

def step_two(ctx: ExecutionContext):
    findings = ctx.shared.get("findings", [])
    return f"Processing {len(findings)} findings"
Execute Functions
A state's execute function receives an ExecutionContext with everything it needs:
def my_state(ctx: ExecutionContext):
    ctx.input        # Output from previous state
    ctx.shared       # SharedContext (read/write)
    ctx.history      # Previous states' outputs (read-only)
    ctx.llm          # BoundLLM with this state's tools
    ctx.retry_count  # Current retry attempt
    ctx.feedback     # Validator feedback from last failed attempt
Inside execute, you can do anything:
# Option A: Use the BoundLLM tool-calling loop
result = ctx.llm.run_with_tools(system_prompt="...", user_message="...")
# Option B: Call the LLM directly (no tool loop)
response = ctx.llm.chat([Message(role="user", content="...")])
# Option C: Bridge to an external agent framework
from crewai import Agent
result = Agent(...).run(ctx.input)
# Option D: Run arbitrary code
result = my_analysis_pipeline(ctx.input)
# Option E: Nest another workflow
inner_wf = Workflow(...)
result = inner_wf.run(ctx.input)
Built-in OODA Agent
For "LLM + tools" without wiring your own agent loop, use the built-in OODA agent:
from fsm_agent_flow import run_ooda

def investigate(ctx: ExecutionContext):
    return run_ooda(ctx, task=f"Investigate: {ctx.input}",
                    tools=[search, analyze], max_cycles=3)
The OODA agent is itself a nested Workflow with 4 states (Observe, Orient, Decide, Act), dogfooding the framework.
Validators
Four options for validation:
# 1. RuleValidator (default) — only runs programmatic checks
from fsm_agent_flow import RuleValidator
wf = Workflow(..., validator=RuleValidator())
# 2. LLMValidator — runs checks + asks LLM for KRs without check functions
from fsm_agent_flow import LLMValidator
wf = Workflow(..., validator=LLMValidator(llm))
# 3. Shorthand — pass validator_llm to auto-create LLMValidator
wf = Workflow(..., validator_llm=cheap_llm)
# 4. Custom — implement the Validator protocol
class MyValidator:
    def validate(self, state, output, context) -> ValidationResult:
        ...
LLM Adapters
The framework ships with OpenAI and LiteLLM adapters:
from fsm_agent_flow.llm.openai import OpenAIAdapter
from fsm_agent_flow.llm.litellm import LiteLLMAdapter
# OpenAI (or any OpenAI-compatible API)
llm = OpenAIAdapter(model="gpt-4o", api_key="sk-...")
llm = OpenAIAdapter(model="deepseek/deepseek-r1", base_url="https://openrouter.ai/api/v1")
# LiteLLM (any provider)
llm = LiteLLMAdapter(model="anthropic/claude-sonnet-4-20250514")
Build your own by implementing the LLMAdapter protocol — see docs/claude/rules/adapters.md or ask Claude Code.
Persistence
WorkflowContext is serializable for save/resume:
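import json

# Save (the with-block closes the file once the checkpoint is written)
data = wf.context.to_dict()
with open("checkpoint.json", "w") as f:
    json.dump(data, f)

# Resume
with open("checkpoint.json") as f:
    data = json.load(f)
ctx = WorkflowContext.from_dict(data)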
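The same save/resume round trip can be tried without the framework. In this sketch `CheckpointState` is a hypothetical stand-in for WorkflowContext and its field names are assumptions; only the to_dict/from_dict pattern mirrors the snippet above.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class CheckpointState:
    """Hypothetical stand-in for WorkflowContext; field names are assumptions."""
    current_state: str
    shared: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "CheckpointState":
        return cls(**data)


original = CheckpointState("writing", {"findings": ["a", "b", "c"]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "checkpoint.json")
    # Save: everything in the dict must be JSON-serializable.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(original.to_dict(), f)
    # Resume: rebuild the object from the stored dict.
    with open(path, encoding="utf-8") as f:
        restored = CheckpointState.from_dict(json.load(f))
```

The round trip only works for JSON-serializable values, so anything a state stores in shared data has to fit that constraint if it should survive a checkpoint written this way.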
Examples
See examples/ for complete working examples:
- research_workflow.py: Research + writing with tool calling and TDD validation
- ooda_example.py: Using the built-in OODA agent inside workflow states
Claude Code Integration
This repo includes a CLAUDE.md and docs/claude/rules/ that teach Claude Code the framework's architecture. When you open this project in Claude Code, it automatically understands how to:
- Define workflows with states, transitions, and key results
- Build custom LLM adapters
- Write validation logic
- Use the OODA agent
- Debug common issues
Using with Claude Code in your own project
If you're using fsm-agent-flow as a dependency in your own project, add the following to your project's CLAUDE.md so Claude Code understands the framework:
# fsm-agent-flow
TDD/OKR-driven agentic workflow framework. See the reference docs:
@https://raw.githubusercontent.com/NewJerseyStyle/FSM-agent-flow/main/CLAUDE.md
@https://NewJerseyStyle.github.io/FSM-agent-flow/claude/rules/adapters.md
@https://NewJerseyStyle.github.io/FSM-agent-flow/claude/rules/workflows.md
@https://NewJerseyStyle.github.io/FSM-agent-flow/claude/rules/validation.md
@https://NewJerseyStyle.github.io/FSM-agent-flow/claude/rules/tools.md
This gives Claude Code full knowledge of the framework's API, patterns, and conventions when working on your codebase.
License
MIT