Build Recursive Language Models as inspectable execution graphs.

recursive-flow

A Python library for building recursive agent graphs on top of Recursive Language Models (RLMs).

As LLMs get better at coding, strict agent harnesses become less important. RLMs let the model decide how to view and manipulate context, when to delegate pieces of it to sub-agents, and how to combine the results, all through the same clean coding interface.

recursive-flow turns that recursive run into a live execution graph. Every query, action, observation, child call, wait, resume, and result is a typed node you can inspect, step, retrace, save, load, fork, and branch into new run directories. It is for people building long-context agents, recursive coding agents, and research loops where the execution trace needs to be as controllable as the final answer is useful. Each start / step returns a fresh Graph snapshot: a recursive structure where graph[agent_id] returns the sub-agent graph for that agent.

RLMs as Graphs

RLMs delegate subtasks to children, those children can delegate to their own children, and results bubble back up. recursive-flow represents the whole run as one recursive type:

  • Graph — one agent snapshot. Carries the agent's run-invariants flat on itself (agent_id, depth, query, system_prompt, config, runtime, model, parent_agent_id, parent_node_id), plus its nodes trajectory and a children: dict[str, Graph] of sub-agents. Cross-agent navigation is graph[other_aid]; subtree views are graph.agents, graph.all_nodes, graph.edges.
  • Node — one immutable node in an agent's trajectory. The trajectory is a strict alternation of observations (inputs the system received) and actions (work the system did). Nine leaf types live under four base classes — see docs/node_model.md:
    • Observations: UserQuery, LLMOutput, ExecOutput, SupervisingOutput, ErrorOutput, DoneOutput.
    • Actions: LLMAction, ExecAction, ResumeAction.
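
To make the recursive shape concrete, here is a minimal sketch of a graph-of-graphs with dotted-id lookup. ToyGraph is a hypothetical stand-in written for this illustration, not the rflow class; keying children by their full agent id and storing nodes as plain strings are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyGraph:
    """Hypothetical stand-in for one agent snapshot (not the rflow Graph)."""

    agent_id: str
    nodes: tuple = ()  # node type names only, for illustration
    children: dict = field(default_factory=dict)  # assumed: full agent id -> ToyGraph

    @property
    def agents(self):
        """Flatten this agent and every descendant into one id -> graph map."""
        found = {self.agent_id: self}
        for child in self.children.values():
            found.update(child.agents)
        return found

    @property
    def all_nodes(self):
        """Every node in the subtree, parent trajectory first."""
        collected = list(self.nodes)
        for child in self.children.values():
            collected.extend(child.all_nodes)
        return collected

    def __getitem__(self, agent_id):
        """Cross-agent navigation by dotted agent id."""
        return self.agents[agent_id]


leaf = ToyGraph("root.search", nodes=("UserQuery", "DoneOutput"))
root = ToyGraph("root", nodes=("UserQuery",), children={"root.search": leaf})
print(root["root.search"].nodes)
```

Because each child is itself a ToyGraph, the same lookup works from any subtree, which mirrors the "one recursive type" idea described above.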

The agent has one delegation call: await launch_subagents([...]). It always takes a list of dict specs and always returns child answers as a list[str] in the same order. A one-child delegation is just a one-item list. An agent that delegates two children and combines their results writes one REPL block like this:

results = await launch_subagents([
    {"name": "search", "query": "Find evidence", "inputs": {"chunk": chunk_a}},
    {"name": "verify", "query": "Check the answer", "inputs": {"chunk": chunk_b}},
])
done(combine(results))

The await is the supervision point: it suspends the parent at a single WaitRequest, the engine runs the children on its pool, then resumes the parent with their results. The REPL supports top-level await and the engine drives the resulting coroutine, roughly:

out = coro.send(None)              # run until the await
# out is a WaitRequest([search, verify]) -> suspend the parent, run children
results = [c.result() for c in children]
coro.send(results)                 # resume; `results` is now the list

The REPL is stateful across blocks, so the next LLM turn can still see results. The launcher must be awaited: a bare call or a top-level yield is an error. launch_subagents(...) is the only delegation entry point agents should use.

See docs/internals.md for the full protocol.
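
The suspend/resume handshake can be reproduced with nothing but standard Python coroutines. The sketch below is an illustration under stated assumptions, not the engine's code: this WaitRequest class, this launch_subagents, run_child, and drive are all hypothetical stand-ins, and children run sequentially instead of on a pool.

```python
class WaitRequest:
    """Hypothetical awaitable that hands the child specs to whoever drives the coroutine."""

    def __init__(self, specs):
        self.specs = specs

    def __await__(self):
        # Yield this request to the driver, then receive the child results via send().
        results = yield self
        return results


async def launch_subagents(specs):
    """Stand-in launcher: suspends the caller at a single WaitRequest."""
    return await WaitRequest(specs)


async def parent_block():
    results = await launch_subagents([
        {"name": "search", "query": "Find evidence"},
        {"name": "verify", "query": "Check the answer"},
    ])
    return " | ".join(results)


async def no_delegation():
    return "inline"


def run_child(spec):
    """Pretend child run; a real engine would execute a full sub-agent here."""
    return f"{spec['name']}: done"


def drive(coro):
    """Run a coroutine, servicing every WaitRequest it yields."""
    try:
        request = coro.send(None)  # run until the first await suspends
    except StopIteration as stop:
        return stop.value  # block finished without delegating
    while True:
        results = [run_child(spec) for spec in request.specs]  # same order as specs
        try:
            request = coro.send(results)  # resume the parent with child answers
        except StopIteration as stop:
            return stop.value


print(drive(parent_block()))
```

The key property is that the parent's local state survives the suspension, so code after the await sees the results as an ordinary list.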

The block above becomes this execution graph (one obs/action pair per step):

UserQuery(root)
  -> LLMAction -> LLMOutput(code="await launch_subagents([search, verify])")
  -> ExecAction -> SupervisingOutput(waiting_on=[root.search, root.verify])
      -> UserQuery(root.search)  -> ... -> DoneOutput(root.search)
      -> UserQuery(root.verify)  -> ... -> DoneOutput(root.verify)
  -> ResumeAction -> ExecOutput(resumed_from=[root.search, root.verify])
  -> LLMAction -> LLMOutput(code="done(combine(...))")
  -> ExecAction -> DoneOutput(root)

Install

pip install recursive-flow               # core
pip install recursive-flow[openai]       # + OpenAI client
pip install recursive-flow[anthropic]    # + Anthropic client
pip install recursive-flow[tinker]       # + Tinker inference client
pip install recursive-flow[dspy]         # + DSPy adapter
pip install recursive-flow[sandbox]      # + Modal, E2B, and Daytona runtimes
pip install recursive-flow[viewer]       # + Gradio viewer (plotly)
pip install recursive-flow[image]        # + static image / GIF export (kaleido)
pip install recursive-flow[all]          # all of the above

From source:

git clone https://github.com/shyamsn97/recursive-flow && cd recursive-flow
pip install -e .

For local development, make install runs cleanup, formatting/lint checks including ruff check ., then installs the package.

Security warning — LocalRuntime is not a sandbox. Agent code runs as full Python in your process: filesystem, network, environment variables, subprocesses — the same privileges as your interpreter. LLM-generated code can be wrong or malicious (prompt injection, model errors, supply-chain risk). Use LocalRuntime only for code you would run yourself. For untrusted agents or anything exposed to the internet, use DockerRuntime or a remote sandbox (ModalRuntime / E2BRuntime / DaytonaRuntime). See docs/security.md. Use at your own risk.

Quick start

This example builds a simple coding agent with file tools in a local working directory. See examples/notebooks/coding_agent.ipynb for the notebook version.

from pathlib import Path

import rflow
from rflow.tools import FILE_TOOLS
from rflow.utils.viewer import open_viewer

workdir = Path("examples/_runs/quickstart")
runtime = rflow.LocalRuntime(working_directory=workdir)
runtime.register_tools(FILE_TOOLS)

# Sandbox agent code inside Docker instead: drop-in replacement, same interface.
# Build the image once with `docker build -t recursive-flow:local .`.
# runtime = rflow.DockerRuntime(
#     "recursive-flow:local",
#     working_directory=workdir,
#     mounts={workdir: "/workspace"},
#     workdir="/workspace",
# )
# runtime.register_tools(FILE_TOOLS)

agent = rflow.Flow(
    rflow.OpenAIClient(model="gpt-5"),
    runtime=runtime,
    max_depth=2,
    max_iters=20,
    child_max_iters=20,
    llm_clients={"fast": rflow.OpenAIClient(model="gpt-5-mini")},
)

query = "Build a Python text-based adventure game with combat and inventory."
graph = agent.start(query)
while not graph.finished:
    graph = agent.step(graph)
    print(graph.tree())

print(graph.result())
graph.save(workdir / "graph")
open_viewer(workdir / "graph")

Flow is configured directly: max_depth, max_iters, child_max_iters, max_concurrency, llm_max_concurrency, max_budget, max_messages, and eager_children are constructor kwargs. Normal agent LLM turns and llm_query_batched(...) share the same LLMChannel, so concurrency and token usage accounting are centralized.

To let each child agent start its next task as soon as it is ready, rather than idling on slower siblings, after a parent reaches its delegation wait (await launch_subagents([...])), enable eager_children:

agent = rflow.Flow(
    rflow.OpenAIClient(model="gpt-5"),
    runtime=runtime,
    max_depth=2,
    child_max_iters=20,
    max_concurrency=8,
    llm_max_concurrency=4,
    eager_children=True,
)

With eager_children=False, a fast child that finishes task_1 waits for the rest of that parallel step before it can start task_2. With eager_children=True, the fast child's task_2 can start while a slow sibling is still running task_1. See examples/control/delegation/eager_children.py for a deterministic timestamped demo.
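
The difference is easy to see with a little arithmetic. The following sketch is a toy timing model, not the engine's scheduler: both function names are hypothetical, task durations are made up, and concurrency limits such as max_concurrency are ignored.

```python
def lockstep_finish_times(durations):
    """Each parallel step ends only when its slowest child is done."""
    clock = 0.0
    finish = {}
    steps = max(len(tasks) for tasks in durations.values())
    for step in range(steps):
        active = {name: tasks[step] for name, tasks in durations.items() if step < len(tasks)}
        for name, cost in active.items():
            finish[name] = clock + cost  # when this child's task actually completes
        clock += max(active.values())  # barrier: next step starts after the slowest
    return finish


def eager_finish_times(durations):
    """Each child runs its own tasks back to back, ignoring siblings."""
    return {name: float(sum(tasks)) for name, tasks in durations.items()}


# Seconds per task for two children, each running task_1 then task_2.
durations = {"fast": [1, 1], "slow": [3, 1]}
print(lockstep_finish_times(durations))
print(eager_finish_times(durations))
```

In this model the slow child finishes at the same time either way; only the fast child benefits, because it no longer waits at the step barrier.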

A saved run is a directory rooted at graph.json plus agents/ logs. Reopen it later with Graph.load(path) or open_viewer(path).

Drop-in LLMClient

Flow implements LLMClient, so it is a drop-in replacement for any raw LLM.

def ask(llm: rflow.LLMClient, q: str) -> str:
    return llm.chat([{"role": "user", "content": q}])

ask(rflow.OpenAIClient(model="gpt-4o-mini"), "2+2?")  # one LLM call
ask(rflow.Flow(rflow.OpenAIClient(model="gpt-4o-mini")), "2+2?")  # full agent loop

Nest agents by passing one Flow as another's llm. See examples/drop_in_llm.py.
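
The drop-in property comes down to structural typing: anything with a compatible chat method can stand in for a model. Here is a minimal sketch of that idea, assuming a chat(messages) -> str interface as in the ask example above; ChatClient, UppercaseClient, and ToyAgent are hypothetical names for illustration and do not call any model.

```python
from typing import Protocol


class ChatClient(Protocol):
    """Assumed minimal interface: a list of messages in, one string out."""

    def chat(self, messages: list) -> str: ...


class UppercaseClient:
    """Stand-in for a raw model: answers by upper-casing the last message."""

    def chat(self, messages):
        return messages[-1]["content"].upper()


class ToyAgent:
    """Stand-in for an agent loop that wraps a client and is itself a client."""

    def __init__(self, llm: ChatClient, max_iters: int = 2):
        self.llm = llm
        self.max_iters = max_iters

    def chat(self, messages):
        answer = messages[-1]["content"]
        for _ in range(self.max_iters):
            # Each "turn" feeds the previous answer back to the wrapped client.
            answer = self.llm.chat([{"role": "user", "content": answer + "!"}])
        return answer


def ask(llm: ChatClient, q: str) -> str:
    return llm.chat([{"role": "user", "content": q}])


print(ask(UppercaseClient(), "hi"))  # one call
print(ask(ToyAgent(UppercaseClient()), "hi"))  # loop of calls
print(ask(ToyAgent(ToyAgent(UppercaseClient(), 1), 1), "hi"))  # nested agents
```

The caller never learns whether it is talking to a single model call or a whole loop, which is what makes nesting possible.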

Prompt sections and skills

The default system prompt is built from named sections. Sections can be static text or callables with the signature section(flow, graph) -> str. That makes project memory or skills ordinary files plus a prompt section that decides when to include them:

from pathlib import Path

import rflow
from rflow.prompts import DEFAULT_BUILDER

skill_path = Path("skills/numpy-linear-algebra/SKILL.md")

def skills_section(flow, graph):
    if not skill_path.exists():
        return ""
    return skill_path.read_text()

flow = rflow.Flow(rflow.OpenAIClient(model="gpt-4o-mini"))
flow.prompt_builder = DEFAULT_BUILDER.section(
    "skills",
    skills_section,
    title="Skills",
    before="tools",
)

See examples/skills.py for a runnable version.
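
For intuition about how named sections with static or callable bodies might compose, here is a small self-contained sketch. ToyPromptBuilder is a hypothetical illustration, not rflow's PromptBuilder: the "## Title" rendering, the rule that empty sections are dropped, and the copy-on-write behavior of section(...) are all assumptions.

```python
class ToyPromptBuilder:
    """Hypothetical builder: ordered (name, title, body) sections."""

    def __init__(self, sections=None):
        self._sections = list(sections or [])

    def section(self, name, body, title=None, before=None):
        """Return a new builder with the section added or replaced."""
        entry = (name, title or name.title(), body)
        sections = [s for s in self._sections if s[0] != name]
        names = [s[0] for s in sections]
        index = names.index(before) if before in names else len(sections)
        sections.insert(index, entry)
        return ToyPromptBuilder(sections)

    def build(self, flow=None, graph=None):
        """Render sections in order; callables are evaluated at build time."""
        parts = []
        for _name, title, body in self._sections:
            text = body(flow, graph) if callable(body) else body
            if text:  # an empty section disappears from the prompt
                parts.append(f"## {title}\n{text}")
        return "\n\n".join(parts)


base = ToyPromptBuilder().section("role", "You are an agent.").section("tools", "done(answer)")
custom = base.section("skills", lambda flow, graph: "Use numpy.", title="Skills", before="tools")
print(custom.build())
```

Evaluating callables at build time is what lets a section read files or inspect the current graph on every turn.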

Step and inspect

step(graph) -> graph' is one atomic graph transition. Every step returns a fresh Graph snapshot, so the live tree is just graph.tree():

graph = agent.start(query)
while not graph.finished:
    graph = agent.step(graph)
    print(graph.tree())

A mid-run snapshot prints something like:

root [supervising] {default}
├── root.scanner_auth [result] {fast} -> Found SQL injection in login.py
├── root.scanner_api  [supervising] {default}
│   ├── root.scanner_api.chunk_0 [result] {fast} -> Clean
│   └── root.scanner_api.chunk_1 [result] {fast} -> Payment flow is safe
└── root.scanner_db   [result] {fast} -> No issues found

Every transition follows the same obs → action → obs shape:

LLMOutput  -> ExecAction -> ExecOutput          (REPL output, normal continuation)
                         -> DoneOutput          (code called done())
                         -> ErrorOutput         (code raised / no code block)
                         -> SupervisingOutput   (awaited a launcher — waiting on children)
SupervisingOutput -> ResumeAction -> ExecOutput / Done / Error / Supervising
                                                (children settled — supervisor unpaused)
ExecOutput -> LLMAction -> LLMOutput            (back to the LLM for the next turn)
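
The transition shapes above can be written down as a lookup table and used to check a trajectory. This is a sketch for illustration only: TRANSITIONS and is_valid_trajectory are hypothetical names, the table encodes only the transitions listed in this README (what follows an ErrorOutput is not covered), and nodes are reduced to their type names.

```python
OBSERVATIONS = {
    "UserQuery", "LLMOutput", "ExecOutput",
    "SupervisingOutput", "ErrorOutput", "DoneOutput",
}
EXEC_RESULTS = {"ExecOutput", "DoneOutput", "ErrorOutput", "SupervisingOutput"}

# observation -> action that may follow it -> observations that action may produce
TRANSITIONS = {
    "UserQuery": {"LLMAction": {"LLMOutput"}},
    "ExecOutput": {"LLMAction": {"LLMOutput"}},
    "LLMOutput": {"ExecAction": EXEC_RESULTS},
    "SupervisingOutput": {"ResumeAction": EXEC_RESULTS},
}


def is_valid_trajectory(trace):
    """Check strict obs -> action -> obs alternation against the table."""
    if not trace or len(trace) % 2 == 0:
        return False  # must start and end on an observation
    if trace[0] not in OBSERVATIONS:
        return False
    for i in range(0, len(trace) - 2, 2):
        obs, action, nxt = trace[i], trace[i + 1], trace[i + 2]
        if nxt not in TRANSITIONS.get(obs, {}).get(action, set()):
            return False
    return True


# The root agent's trajectory from the delegation example earlier in this README.
root_trace = [
    "UserQuery", "LLMAction", "LLMOutput", "ExecAction", "SupervisingOutput",
    "ResumeAction", "ExecOutput", "LLMAction", "LLMOutput", "ExecAction", "DoneOutput",
]
print(is_valid_trajectory(root_trace))
```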

Action nodes carry the work the engine did; observation nodes carry what was returned. Every action is followed by exactly one observation. The graph is queryable in plain Python:

graph.tree()                                  # ASCII render
graph["root.scanner_api"]                     # sub-Graph rooted at that agent
graph.agents["root.scanner_api"].nodes       # node trajectory for one agent
graph.children                                # dict[str, Graph] for child agents
graph.all_nodes.find("n_abc...")                  # bare Node lookup by id
graph.all_nodes.errors()                          # every ErrorOutput across agents
graph.all_nodes.results()                         # every DoneOutput across agents
graph.all_nodes.supervising()                     # every SupervisingOutput across agents
graph.all_nodes.where(type="llm_output", agent_id="root")  # kwargs match Node attrs
graph.all_nodes.where(lambda n: n.type == "exec_output")    # or pass a predicate
graph.to_dict()                               # full JSON-serializable payload
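
A query helper that accepts either keyword filters or a predicate is straightforward to build. The sketch below shows one way it could work; ToyNode and ToyNodeList are hypothetical stand-ins, and the real all_nodes collection may behave differently (for example in how it treats missing attributes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyNode:
    """Hypothetical immutable node with just enough fields to filter on."""

    id: str
    type: str
    agent_id: str


class ToyNodeList(list):
    """Hypothetical list subclass with chainable filters."""

    def where(self, predicate=None, **attrs):
        """Keep nodes that satisfy the predicate and match every keyword."""

        def matches(node):
            if predicate is not None and not predicate(node):
                return False
            return all(getattr(node, key, None) == value for key, value in attrs.items())

        return ToyNodeList(node for node in self if matches(node))

    def find(self, node_id):
        """Return the node with this id, or None."""
        return next((node for node in self if node.id == node_id), None)


nodes = ToyNodeList([
    ToyNode("n_1", "llm_output", "root"),
    ToyNode("n_2", "exec_output", "root"),
    ToyNode("n_3", "llm_output", "root.scanner_api"),
])
print(nodes.where(type="llm_output", agent_id="root"))
print(nodes.where(lambda n: n.type == "exec_output"))
```

Returning the same list type from where(...) keeps filters chainable.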

Inject controller events

Because Graph is the control surface, external controllers can append typed events and commit them through the normal step loop. This is useful for human feedback, budget nudges, and forced finalization without losing traceability:

import rflow

graph = graph.inject(
    target="root.scanner_api",
    node=rflow.ExecOutput(
        output="Injected controller observation: answer with current evidence.",
        content="Injected controller observation: answer with current evidence.",
    ),
)
graph = agent.step(graph)  # persists the observation, then continues

graph = graph.inject(
    target="root.scanner_api",
    node=rflow.ExecAction(code='done("best available answer")'),
)
graph = agent.step(graph)  # executes the action and writes DoneOutput

Injected nodes become ordinary graph nodes with the same shape as organic nodes. See docs/injections.md and examples/control/controller_injection.py.

Save, Load, Rewind, Branch

Graph is the durable run object. Save a run directory with graph.save(...), reopen it with Graph.load(...), and keep step snapshots when you want rewind or live checkpointing:

history = [agent.start(query)]
while not history[-1].finished:
    history.append(agent.step(history[-1]))
    history[-1].save("runs/deep_research")  # overwrites the latest checkpoint

latest = rflow.Graph.load("runs/deep_research")

Branch by copying or loading a saved graph and continuing it with a Flow:

branch = latest.copy(deep=True)
while not branch.finished:
    branch = agent.step(branch)
branch.save("runs/deep_research_repair")

Controller edits use the same graph surface (replace_node, truncate_after, inject, retrace_steps) and then continue through agent.step(graph). See examples/showcase.py, docs/control.md, and docs/injections.md.
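
Rewind and branch both fall out of treating each step as a pure function from one snapshot to the next. Here is a toy model of that pattern with plain dicts and JSON; toy_step and the snapshot layout are hypothetical, and nothing here reflects the actual graph.json format.

```python
import copy
import json
import tempfile
from pathlib import Path


def toy_step(state):
    """Return a NEW snapshot; the input snapshot is never mutated."""
    nxt = copy.deepcopy(state)
    nxt["steps"].append(f"step_{len(state['steps'])}")
    nxt["finished"] = len(nxt["steps"]) >= 3
    return nxt


# Keep every snapshot so any earlier point can be revisited.
history = [{"steps": [], "finished": False}]
while not history[-1]["finished"]:
    history.append(toy_step(history[-1]))

# Rewind: pick an earlier snapshot. Branch: deep-copy it and diverge.
branch = copy.deepcopy(history[1])
branch["steps"].append("repair")

# Save and reload the latest snapshot as JSON.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "graph.json"
    path.write_text(json.dumps(history[-1]))
    reloaded = json.loads(path.read_text())

print(history[-1]["steps"], branch["steps"])
```

Because snapshots are never mutated in place, the branch cannot corrupt the history it was forked from.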

Rich visualization

See the notebook for a full showcase of visualization utilities.

Because the run is a typed graph, every visualization is just a render of that graph. View either a saved run directory, a single Graph, or a list of step snapshots.

Gradio viewer

open_viewer(source) launches a small browser app for inspecting a saved run directory, a graph snapshot, or an in-memory trace:

from rflow.utils.viewer import open_viewer

open_viewer("runs/deep_research")

From the CLI: recursive-flow view runs/deep_research --port 7861.

Live terminal tree

rflow.utils.viz.live(agent, graph) drives the step loop and renders a Rich tree as nodes are produced. The boids run (Create a simple boids simulation in plain HTML and JavaScript, split each component into separate files) settles to:

root [result] {default:gpt-5} -> Boids simulation written to output/boids-simulation with modular JS (boid, simulation, renderer) and index.html entrypoint.
  root.index_html    [result] {fast:gpt-5-mini} -> ok
  root.styles_css    [result] {fast:gpt-5-mini} -> ok
  root.boid_js       [result] {fast:gpt-5-mini} -> ok
  root.simulation_js [result] {fast:gpt-5-mini} -> ok
  root.renderer_js   [result] {fast:gpt-5-mini} -> ok
  root.main_js       [result] {fast:gpt-5-mini} -> ok

The same render is available offline as graph.tree() on any snapshot. Filename-flavored agent ids (index.html → index_html) are sanitized because . is the parent/child delimiter in the agent tree.
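
A sanitizer along these lines keeps the dotted path unambiguous. This is a guess at the general idea, not the library's rule: sanitize_name and child_agent_id are hypothetical, and replacing every character outside [0-9A-Za-z_] is an assumption (the README only confirms that . is replaced).

```python
import re


def sanitize_name(name: str) -> str:
    """Replace anything that is not a letter, digit, or underscore."""
    return re.sub(r"[^0-9A-Za-z_]", "_", name)


def child_agent_id(parent_id: str, name: str) -> str:
    """Join parent and child with '.', the tree delimiter."""
    return f"{parent_id}.{sanitize_name(name)}"


print(child_agent_id("root", "index.html"))
```

Without this step, "root.index.html" would parse as a grandchild named html under a child named index.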

Static renders

recursive-flow render <path> -f F writes a static visualization in any of:

mermaid             # stateDiagram-v2 (default topology)
mermaid-flowchart   # flowchart TD, better for wide trees
mermaid-sequence    # sequenceDiagram of delegate / wait / resume
dot · d2            # Graphviz / D2 topology
tree · ascii-boxes  # text trees
gantt-html          # standalone HTML swimlane
report-md           # full Markdown summary (tree + cost + result + errors)
code-log            # every code block paired with its observation
error-summary       # ErrorOutput counts grouped by kind
tokens              # one-line ASCII sparkline of cumulative tokens
html                # self-contained interactive stepper, one slide per snapshot
image               # single PNG/SVG/PDF of the topology snapshot
steps               # one image per snapshot, written as step_NN.{png,svg,pdf}

recursive-flow render ./myproject -f mermaid-flowchart
recursive-flow render ./myproject -f gantt-html -o run.html
recursive-flow render ./myproject -f report-md  -o run.md
recursive-flow render ./myproject -f tokens

GitHub renders mermaid inline, so the output drops straight into a doc. The example below is the to_mermaid_flowchart(graph) projection of the boids run; it renders reliably across the GitHub-supported mermaid versions:

flowchart TD
    root["root<br/><i>result</i><br/>Boids simulation written to output/boids-simulation..."]:::result
    root --> html["root.index_html<br/><i>result</i><br/>ok"]:::result
    root --> css["root.styles_css<br/><i>result</i><br/>ok"]:::result
    root --> boid["root.boid_js<br/><i>result</i><br/>ok"]:::result
    root --> sim["root.simulation_js<br/><i>result</i><br/>ok"]:::result
    root --> rend["root.renderer_js<br/><i>result</i><br/>ok"]:::result
    root --> main["root.main_js<br/><i>result</i><br/>ok"]:::result
    classDef result fill:#3fb95022,stroke:#3fb950,color:#c9d1d9;
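
Generating a flowchart like this from a tree takes only a short recursive walk. The sketch below is not to_mermaid_flowchart: to_flowchart is a hypothetical function over plain nested dicts, it uses a simplified "id (status)" label, and it omits the classDef styling shown above.

```python
def to_flowchart(tree):
    """Render a nested {'id', 'status', 'children'} dict as Mermaid flowchart text."""
    lines = ["flowchart TD"]

    def node_key(agent_id):
        # Mermaid node ids should not contain dots, so swap them for underscores.
        return agent_id.replace(".", "_")

    def visit(node, parent_key=None):
        key = node_key(node["id"])
        decl = f'{key}["{node["id"]} ({node["status"]})"]'
        if parent_key is None:
            lines.append(f"    {decl}")
        else:
            lines.append(f"    {parent_key} --> {decl}")
        for child in node.get("children", []):
            visit(child, key)

    visit(tree)
    return "\n".join(lines)


tree = {
    "id": "root",
    "status": "result",
    "children": [{"id": "root.boid_js", "status": "result"}],
}
print(to_flowchart(tree))
```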

Programmatic helpers

Everything the CLI does is one function call away:

from rflow.utils.export import to_mermaid, to_mermaid_flowchart, to_mermaid_sequence, to_dot, to_d2
from rflow.utils.viz import (
    ascii_boxes, code_log, error_summary, message_stream, diff_system_prompts,
    gantt, gantt_html, token_sparkline, budget_burndown, bench_table,
    report_md, live, tee, slack_webhook, discord_webhook,
)
from rflow.utils.tracing import json_logs

print(token_sparkline(graphs))          # ▁▂▅█▂   15820 tok over 7 steps
print(error_summary(graph))             # ErrorOutput counts grouped by kind
print(message_stream("root.boid_js", graph))     # rendered transcript for one agent
print(report_md(graphs, title="run"))   # full Markdown report
gantt_html(graphs, "run.html")          # standalone HTML swimlane
json_logs(graph, "run.jsonl")           # one node per line

Image, GIF, and HTML exports

For blog posts, PR comments, papers, and CI artifacts, render the graph straight to a PNG/SVG/PDF, an animated GIF, or a single self-contained HTML stepper. Four public functions live in rflow.utils, plus matching CLI verbs:

| Function | CLI verb | Output | Use case |
| --- | --- | --- | --- |
| save_image(graph, path) | -f image | one PNG/SVG/PDF | hero image of a finished run |
| save_steps(graphs, dir/) | -f steps | step_NN.png per snapshot | blog slideshow, paper figure series |
| save_gif(graphs, path) | (no verb yet) | animated GIF | quick preview / social posts |
| save_html(graphs, path) | -f html | self-contained stepper (Plotly + CSS) | shareable URL-less artifact, PR comment |

Quick start:

import rflow
from rflow.utils import save_image, save_steps, save_html, save_gif

graph = rflow.Graph.load("runs/deep_research")

save_image(graph, "run_final.png")
save_html(graph, "viewer.html", title="run")

# If you kept an in-memory history list, playback exports still work:
save_steps(graphs, "frames/")                    # one PNG per step
save_gif(graphs, "trace.gif", duration=400)      # animated GIF (~2.5 fps)

Or use the graph shorthand (same defaults):

graph.save_image("run_final.png")
graph.save_html("viewer.html")

Why the scaling knobs exist

The Plotly viewer, static image export, GIF export, and HTML stepper now share the same default element scale (element_mult=1.0), so a saved PNG looks much closer to the Gradio/Jupyter view. Dense graphs still adaptively cap marker and label sizes to avoid turning large runs into solid blobs.

Use these knobs only when a target medium needs a different balance:

| Knob | Default | Effect |
| --- | --- | --- |
| element_mult | 1.0 | Uniform multiplier on markers and fonts. The simplest "make it bigger" knob. |
| marker_mult | (inherits) | Override just marker size and outline width. Useful when dots need more visual weight. |
| text_mult | (inherits) | Override just label font size. Smaller text means fewer label collisions. |
| normalize_labels | True | Force every label to bottom center so adjacent depths can't share a vertical band. |

Pass marker_mult and/or text_mult to break the symmetry when labels are colliding or nodes are too subtle for a specific export.
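
One way to read "(inherits)" is that an unset knob falls back to element_mult, while an explicit value replaces it outright. The sketch below encodes that reading; resolve_scale is a hypothetical helper, and whether the library replaces or multiplies is an assumption here.

```python
def resolve_scale(element_mult=1.0, marker_mult=None, text_mult=None):
    """Return the effective marker and text multipliers (assumed semantics)."""
    return {
        "marker": element_mult if marker_mult is None else marker_mult,
        "text": element_mult if text_mult is None else text_mult,
    }


print(resolve_scale())  # both inherit the default
print(resolve_scale(element_mult=2.0, text_mult=0.8))  # text overridden, marker inherits
```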

Recipes

Hero PNG of a finished run — defaults are tuned for this:

graph.save_image("hero.png")
# == save_image(graph, "hero.png", width=1800, height=1350,
#               scale=2.0, element_mult=1.0, normalize_labels=True)

Blog slideshow with dense subtrees — fat markers, small labels, square-ish canvas (the recipe behind docs/blog.md):

save_steps(
    graphs,
    "blog/frames/",
    width=1600, height=1200, scale=2.0,
    marker_mult=3.5,        # fat node dots + edges
    text_mult=2.2,          # scale labels up less than markers so they collide less
    normalize_labels=True,  # already the default — explicit for the reader
)

Standalone interactive stepper — drop into a PR comment or GitHub gist:

save_html(workspace, "viewer.html", title="needle haystack run")

The HTML output embeds Plotly from CDN, includes per-slide transcripts, and ships keyboard navigation (← / →) plus dot-style slide indicators. Open it in any browser, attach it to an email, upload it as a CI artifact — it works offline once the CDN script is cached.

Animated GIF — needs pip install recursive-flow[image] pillow:

save_gif(
    graphs,
    "trace.gif",
    duration=600,          # ms per frame; lower = faster
    loop=0,                # 0 = forever; 1 = play once
    width=1200, height=900,
)

From the CLI

Every knob above maps 1:1 to a CLI flag:

# blog slideshow recipe (matches the dense-tree recipe above)
recursive-flow render ./myproject \
  -f steps -o blog/frames/ \
  --width 1600 --height 1200 --scale 2.0 \
  --marker-mult 3.5 --text-mult 2.2

# self-contained interactive stepper
recursive-flow render ./myproject \
  -f html  -o stepper.html --title "boids walkthrough"

# single hero PNG with default scaling
recursive-flow render ./myproject \
  -f image -o hero.png

# opt out of label normalization (matches Gradio viewer defaults)
recursive-flow render ./myproject \
  -f html  -o stepper.html --no-normalize-labels

The CLI uses element_mult=1.0 by default for html, image, steps, and gif so static exports stay visually consistent with the interactive viewer. Node sizes are uniform; token counts stay in hover/details, not marker size. Override with --element-mult, --marker-mult, or --text-mult for a specific medium.

Dependencies

  • save_image / save_steps need kaleido. Install with pip install recursive-flow[image] or just pip install kaleido.
  • save_gif additionally needs Pillow (pip install recursive-flow[image] pillow).
  • save_html and render_html have no static-image dependency — they emit a single HTML file that embeds Plotly from CDN.

DSPy Adapter

RecursiveFlowLM lets DSPy use a Flow agent anywhere it expects a language model:

import dspy
import rflow
from rflow.integrations.dspy import RecursiveFlowLM

flow = rflow.Flow(
    rflow.OpenAIClient(model="gpt-4o-mini"),
    max_depth=1,
    max_iters=5,
)

dspy.configure(lm=RecursiveFlowLM(flow, model="recursive-flow/gpt-4o-mini"))
qa = dspy.ChainOfThought("question -> answer")
print(qa(question="What is 17 * 23?").answer)

Install it with pip install recursive-flow[openai,dspy]. See examples/providers/dspy_drop_in.py for the runnable version.

Examples

Run the offline smoke suite with python examples/run_examples.py. Add --include-optional, --include-live, --include-sandbox, or --include-manual as needed. Most live examples share flags like --no-viz, --docker-image recursive-flow:local, --max-depth, and --max-iters; see examples/README.md.

| Example | What it shows |
| --- | --- |
| showcase.py | Functional stepping, snapshots, save/load, and live terminal visualization. |
| structured_output.py | Root and child results validated with JSON Schema / Pydantic. |
| drop_in_llm.py | Flow as an LLMClient, including nested flows. |
| skills.py | On-disk skill files loaded through a dynamic prompt section. |
| dspy_drop_in.py | Use a Flow agent as the LM behind a DSPy program. |
| mcp_weather.py | Start a local MCP weather server, delegate city forecasts, and combine advice. |
| tinker_agent.py | Run the live terminal graph view with TinkerClient inference. |
| sandboxes/ | Build a small web app while Python code runs inside Modal, E2B, or Daytona. |
| coding/agent.py | Interactive coding agent that writes and edits files in a working directory. |
| needle/haystack.py | Needle-in-a-haystack over a massive in-memory INPUTS["haystack"]. |
| needle/filesystem.py | Needle-in-a-haystack across many files with FILE_TOOLS and runtime working directories. |
| summarizer.py | Recursive map-reduce summarization over a long document. |
| eager_children.py | eager_children=True vs False: how child scheduling overlaps. |
| control/injection/ | Generate a baseline run, edit copies with graph injection/replacement, and continue variants. |
| fork_repair.py | Fork graph/workdir snapshots into independent repair branches and compare results. |
| best_of_n.py | Run N independent branches and pick the best result. |
| autoresearch/ | Karpathy-style hill-climbing research loop with custom @tools and delegation. |
| graph/ | Offline tour of the Graph API: query, navigate, mutate, save/load, timeline retrace, fork, render. |
| run_examples.py | Manifest-driven smoke runner for offline, optional, live, sandbox, and manual examples. |
| view_demo.py | Build synthetic Graph snapshots and launch the Gradio viewer. |
| notebooks/coding_agent.ipynb | Build the agent, run the boids task end-to-end, and inspect the saved run/viewer. |
| notebooks/viz_walkthrough.ipynb | Visualization helpers against a saved fixture. |
| notebooks/node_basics.ipynb | Graph query API tour. |

Benchmarks

The shared eval harness lives under benchmarks/eval/. It uses a task/runner registry, writes results.jsonl + summary.json, records rflow graph-shape metrics, shows tqdm progress bars, and can log per-row metrics to W&B. Real runs can compare vanilla, rflow, and the upstream official RLM runner ported from avilum/minrlm/eval. It also writes model-oriented reports under eval-runs/<model>/<benchmark>/, including per-question JSON files with prompt, inputs, expected answer, and each runner's solution.

python -m benchmarks.eval \
  --provider fake \
  --model fake \
  --tasks sniah \
  --runners fake vanilla rflow \
  --seeds 0:3

To run the full RLM-Bench-style table sweep with W&B logging:

make eval-benchmark EVAL_MODEL=gpt-5-mini

See benchmarks/eval/README.md for task/runner extension points and W&B usage.

CLI

recursive-flow view ./myproject
recursive-flow render ./myproject -f mermaid
recursive-flow render ./myproject -f gantt-html -o run1.html
recursive-flow render ./myproject -f html       -o stepper.html
recursive-flow render ./myproject -f steps      -o frames/  --marker-mult 3.5 --text-mult 2.2
recursive-flow render ./myproject -f image      -o graph.png
recursive-flow version

view and render accept a workspace directory. render -f accepts: mermaid, mermaid-flowchart, mermaid-sequence, dot, d2, tree, ascii-boxes, gantt-html, report-md, code-log, error-summary, tokens, html, image, steps — see the Static renders table and Image, GIF, and HTML exports for what each produces and the scaling / label-normalization flags (--marker-mult, --text-mult, --normalize-labels / --no-normalize-labels).

Roadmap

  • OOLONG long-context aggregation harness (standard / rlm / rlm_tips)
  • LocalRuntime + DockerRuntime — battle-tested
  • [~] ModalRuntime / E2BRuntime / DaytonaRuntime — full support: native SDK file transfer, real-sandbox CI, depth>1 delegation, heavier example
  • [~] OOLONG, LongBench-v2, CodeQA, SWE-bench, etc. benchmarks
  • REPL security (local)
  • RAO library module: rflow.rao rollout collection, per-node rewards, leave-one-out advantages, depth weighting, trainer export
  • DeLM-style coordination: shared task queue, verified shared context, multi-worker coordinator over Flow graphs

Docs

The top-level docs are short, user-facing guides. The deep dive lives in docs/internals.md. Research notes live under docs/research/.

  • Internals: deep reference — engine architecture, step lifecycle, REPL await protocol, runtime backends, graph persistence, and extension seams. This document is being refreshed after the Flow/Graph rewrite.
  • Blog post: long-form pitch — why recursive language models, why graphs over flat traces, full needle-in-a-haystack walkthrough with the same exports the CLI ships.
  • Positioning: when to use recursive-flow vs rlm-minimal, ypi, LangGraph, CrewAI, AutoGen, SWE-agent, Aider.
  • Control: step loop, save/load resume, rewind, forks, INPUTS, launch_subagents, inline-first strategy, custom tools.
  • Node injection: append typed controller events to a running graph and commit them through agent.step(graph).
  • Observability: querying the Graph, run layout, export helpers, live tree, gantt, topology exports, Gradio viewer, CLI.
  • Runtimes: Runtime protocol, shipped runtimes (Local / Docker / Modal / E2B / Daytona), writing your own.
  • Prompt customization: PromptBuilder sections, callable dynamic sections, workspace-backed skills/memory, deriving from the default prompt, full replacement.
  • Security: trust model, Docker isolation knobs, engine-level caps, proxied tools, approval gates.
  • Changelog: release-by-release changes.

References

  • Recursive Language Models: the original RLM paper and implementation.
  • rlm-minimal: the single-file reference recursive-flow grew from.
  • Scaling Managed Agents: Decoupling the brain from the hands: Anthropic's writeup on separating harness, session, and sandbox interfaces for long-horizon agents.
  • ypi: recursive coding agent built on Pi. Our session layout and much of the default prompt (size-up → delegate → combine, guardrails, aggressive delegation) come from ypi's SYSTEM_PROMPT.md.

License

See LICENSE.

Citation

@misc{sudhakaran2025recursive-flow,
  author = {Sudhakaran, Shyam},
  title = {recursive-flow},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/shyamsn97/recursive-flow}},
}
