
A state-machine framework for recursive LLM agents in ~1,200 lines of Python.


rlmkit

A minimal state-machine library for Recursive Language Model agents. Every agent — root and descendants — advances one step at a time, and the entire computation tree is a single immutable, serializable object at every step boundary.


Install

pip install rlmkit               # core
pip install rlmkit[openai]       # + OpenAI client
pip install rlmkit[anthropic]    # + Anthropic client
pip install rlmkit[viewer]       # + Gradio viewer
pip install rlmkit[all]          # all of the above

From source:

git clone https://github.com/shyamsn97/rlmkit && cd rlmkit
pip install -e .

Quick start

from rlmkit import RLM, RLMConfig, OpenAIClient
from rlmkit.runtime.local import LocalRuntime
from rlmkit.tools import FILE_TOOLS
from rlmkit.utils.trace import save_trace
from rlmkit.utils.viewer import open_viewer

workspace = "./myproject"
runtime = LocalRuntime(workspace=workspace)

# Sandbox agent code inside Docker instead — drop-in replacement,
# same interface.  Build the image once with `docker build -t rlmkit:local .`
# from the repo root; see docs/runtimes.md and docs/security.md.
#
# from rlmkit.runtime.docker import DockerRuntime
# runtime = DockerRuntime(
#     "rlmkit:local",
#     workspace=workspace,
#     mounts={workspace: "/workspace"},
#     workdir="/workspace",
# )

runtime.register_tools(FILE_TOOLS)

agent = RLM(
    llm_client=OpenAIClient("gpt-5"),
    runtime=runtime,
    config=RLMConfig(max_depth=3, max_iterations=15, session="context"),
)

query = "Build a python text-based adventure game with combat and inventory."
states = [agent.start(query)]
while not states[-1].finished:
    states.append(agent.step(states[-1]))
    print(states[-1].tree())

save_trace(states, "traces/run1")
open_viewer(states)

Examples

All examples share the same CLI flags (--no-viz, --docker-image rlmkit:local, --max-depth, etc.). See examples/README.md.

Example                  What it shows
showcase.py              Checkpoint, fork, session persistence, time travel, intervention: the full API tour.
drop_in_llm.py           RLM as an LLMClient. Nested agents.
coding-agent/agent.py    Interactive coding agent that writes and edits files.
needle_haystack.py       Needle-in-a-haystack across 500 files with custom tools and runtime_factory.
summarizer.py            Recursive map-reduce over a 10k-line document.
view_demo.py             Launch the Gradio viewer on a saved trace.

CLI

rlmkit view   traces/run1/                          # Gradio viewer
rlmkit render checkpoint.json -f mermaid            # stdout
rlmkit render traces/run1/   -f gantt-html -o run1.html
rlmkit version

view and render each accept a trace directory, a trace.json file, or a single state checkpoint. Formats: mermaid, dot, tree, gantt-html.
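To give a feel for what a mermaid rendering of an agent tree might look like, here is a small sketch that turns dotted agent names into a Mermaid flowchart. It is not the output of rlmkit render; the to_mermaid helper and the rule that a parent is found by dropping the last dotted segment are assumptions made for illustration.

```python
def to_mermaid(names: list[str]) -> str:
    """Build a Mermaid flowchart from dotted agent names (hypothetical helper)."""

    def node_id(name: str) -> str:
        # Use underscores in node ids and keep the dotted name as the label.
        return name.replace(".", "_")

    lines = ["graph TD"]
    for name in names:
        if "." not in name:
            continue  # the root has no parent edge
        parent = name.rsplit(".", 1)[0]
        lines.append(
            f'  {node_id(parent)}["{parent}"] --> {node_id(name)}["{name}"]'
        )
    return "\n".join(lines)


diagram = to_mermaid(["root", "root.scanner_db", "root.scanner_api"])
print(diagram)
```

Pasting the printed text into any Mermaid renderer draws root with two children.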

Overview

RLM implements LLMClient — it's a drop-in replacement for any LLM. Call chat(messages) or run(query) and it runs a full recursive agent loop underneath. Swap your LLM for an RLM and get delegation, parallel sub-agents, and a code REPL for free.

def ask(llm: LLMClient, q: str) -> str:
    return llm.chat([{"role": "user", "content": q}])

ask(OpenAIClient("gpt-4o-mini"), "2+2?")             # one LLM call
ask(RLM(llm_client=..., runtime=...), "2+2?")        # full agent, same return type

Nest agents by passing one RLM as another's llm_client.
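The drop-in idea can be illustrated without any API keys. The sketch below uses toy stand-ins: the LLMClient protocol, EchoClient, and ToyAgent here are hypothetical classes written for this example, not rlmkit's own. It only shows why anything exposing chat(messages) can be passed where a client is expected, including another agent.

```python
from typing import Protocol


class LLMClient(Protocol):
    """Hypothetical stand-in for an LLM client: anything with chat()."""

    def chat(self, messages: list[dict]) -> str: ...


class EchoClient:
    """Toy 'model' that returns the last message in upper case."""

    def chat(self, messages: list[dict]) -> str:
        return messages[-1]["content"].upper()


class ToyAgent:
    """Toy agent that wraps a client and exposes the same chat() method."""

    def __init__(self, llm_client: LLMClient, name: str) -> None:
        self.llm_client = llm_client
        self.name = name

    def chat(self, messages: list[dict]) -> str:
        # A real agent would loop over many steps; this one makes a single
        # inner call and tags the answer so the nesting is visible.
        inner = self.llm_client.chat(messages)
        return f"{self.name}({inner})"


def ask(llm: LLMClient, q: str) -> str:
    return llm.chat([{"role": "user", "content": q}])


plain = ask(EchoClient(), "two plus two?")
nested = ask(ToyAgent(ToyAgent(EchoClient(), "inner"), "outer"), "two plus two?")
print(plain)
print(nested)
```

Because ask only depends on chat(), the caller does not need to know whether it received a bare client or a stack of agents.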

The tree is a state machine. Every agent advances one step at a time, and the full computation is one object you can inspect, checkpoint, fork, or serialize at any step boundary:

root [supervising] iter 5
├── root.scanner_auth [finished] iter 3 → "Found SQL injection in login.py"
│   ├── root.scanner_auth.chunk_0 [finished] iter 2 → "No issues"
│   ├── root.scanner_auth.chunk_1 [finished] iter 2 → "SQL injection on line 42"
│   └── root.scanner_auth.chunk_2 [finished] iter 2 → "No issues"
├── root.scanner_api [supervising] iter 3
│   ├── root.scanner_api.chunk_0 [ready] iter 1
│   ├── root.scanner_api.chunk_1 [finished] iter 2 → "Clean"
│   │   └── root.scanner_api.chunk_1.deep_scan [finished] iter 2 → "Payment flow is safe"
│   └── root.scanner_api.chunk_2 [finished] iter 2 → "Clean"
└── root.scanner_db [finished] iter 2 → "No issues found"
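One way to picture "one object you can checkpoint, fork, or serialize" is a tree of frozen dataclasses. The sketch below is a toy model under that assumption: Node, render, and from_dict are hypothetical names, and rlmkit's real state class and trace format may look quite different.

```python
import json
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical agent node; not rlmkit's actual state class."""

    name: str
    status: str
    iteration: int
    result: Optional[str] = None
    children: tuple["Node", ...] = ()


def render(node: Node, indent: int = 0) -> str:
    """Indented text view of the tree, one agent per line."""
    line = "  " * indent + f"{node.name} [{node.status}] iter {node.iteration}"
    if node.result is not None:
        line += f' -> "{node.result}"'
    return "\n".join([line] + [render(c, indent + 1) for c in node.children])


def from_dict(d: dict) -> Node:
    """Rebuild a Node tree from the plain dict produced by asdict()."""
    kids = tuple(from_dict(c) for c in d["children"])
    return Node(d["name"], d["status"], d["iteration"], d["result"], kids)


tree = Node("root", "supervising", 5, children=(
    Node("root.scanner_db", "finished", 2, "No issues found"),
    Node("root.scanner_api", "supervising", 3, children=(
        Node("root.scanner_api.chunk_0", "ready", 1),
    )),
))

checkpoint = json.dumps(asdict(tree))          # serialize at a step boundary
restored = from_dict(json.loads(checkpoint))   # load it back later
forked = replace(tree, iteration=tree.iteration + 1)  # new tree, original untouched

print(render(restored))
```

Since nodes are frozen, a fork is just another value; the earlier tree stays valid for rewinding or comparison.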

Each step(state) -> state' is one atomic transition:

        step_llm()              step_exec()
READY ─────────────> EXECUTING ─────────────> SUPERVISING
  ^                      |                        |
  |                   done()                step_children()
  |                      |                   (one batch)
  |                      v                        |
  |                  FINISHED <── resume_exec() ──┤
  |                      ^                        |
  +── yields again ──────┘                  children not done
  • READY — queued for the next LLM call.
  • EXECUTING — LLM returned a ```repl ``` block; the engine runs it.
  • SUPERVISING — code called delegate() + yield wait(); children are running. Each step() advances children by one batch.
  • FINISHED — code called done(result).
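
The four states can be read as a small transition table. The sketch below encodes one simplified reading of the diagram, covering only a subset of its edges (the "yields again" path is left out); the event names reuse the labels in the diagram, while Status, TRANSITIONS, advance, and run are hypothetical names for this example and not rlmkit's API.

```python
from enum import Enum


class Status(Enum):
    READY = "ready"
    EXECUTING = "executing"
    SUPERVISING = "supervising"
    FINISHED = "finished"


# (current state, event) -> next state. A simplified reading, not the full engine.
TRANSITIONS = {
    (Status.READY, "step_llm"): Status.EXECUTING,
    (Status.EXECUTING, "step_exec"): Status.SUPERVISING,       # code delegated and waited
    (Status.EXECUTING, "done"): Status.FINISHED,
    (Status.SUPERVISING, "step_children"): Status.SUPERVISING,  # children not done yet
    (Status.SUPERVISING, "resume_exec"): Status.FINISHED,       # children done, code finishes
}


def advance(status: Status, event: str) -> Status:
    """Apply one event, rejecting anything the table does not allow."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {status.name}") from None


def run(events: list[str], start: Status = Status.READY) -> Status:
    status = start
    for event in events:
        status = advance(status, event)
    return status


final = run(["step_llm", "step_exec", "step_children", "resume_exec"])
print(final.name)
```

Keeping the transitions in a table makes illegal moves, such as stepping an agent that has already finished, fail loudly instead of silently.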

Delegation:

h1 = delegate("searcher", "Find all TODOs in src/")
h2 = delegate("searcher", "Find all FIXMEs in src/")   # auto-suffixed
results = yield wait(h1, h2)
done(f"Found {len(results)} batches")
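
The yield wait(...) pattern can be understood as a generator that pauses until an engine sends back the children's results. The sketch below is a toy driver under that assumption: ToyEngine, Handle, and Wait are hypothetical, children run as plain function calls rather than agents, and the "_1" suffix is a guess at what auto-suffixing might produce.

```python
from dataclasses import dataclass


@dataclass
class Handle:
    name: str
    task: str


@dataclass
class Wait:
    handles: tuple


class ToyEngine:
    """Toy driver for generator-style agent code (not rlmkit's engine)."""

    def __init__(self, worker):
        self.worker = worker  # callable(task) -> result, stands in for a child agent
        self.names = {}       # how many times each base name was used
        self.result = None

    def delegate(self, name, task):
        count = self.names.get(name, 0)
        self.names[name] = count + 1
        unique = name if count == 0 else f"{name}_{count}"  # assumed suffix style
        return Handle(unique, task)

    def wait(self, *handles):
        return Wait(handles)

    def done(self, result):
        self.result = result

    def run(self, agent_fn):
        gen = agent_fn(self)
        try:
            request = next(gen)  # run until the first yield wait(...)
            while True:
                results = [self.worker(h.task) for h in request.handles]
                request = gen.send(results)  # resume the code with child results
        except StopIteration:
            pass
        return self.result


def agent(engine):
    h1 = engine.delegate("searcher", "Find all TODOs in src/")
    h2 = engine.delegate("searcher", "Find all FIXMEs in src/")
    results = yield engine.wait(h1, h2)
    engine.done(f"Found {len(results)} batches from {h1.name}, {h2.name}")


engine = ToyEngine(worker=lambda task: f"result of: {task}")
final = engine.run(agent)
print(final)
```

The pause at yield is what gives a step-based engine a natural boundary: the parent's code is suspended while the children advance, then resumed with their results.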

Docs

  • Positioning: when to use rlmkit, when not to.
  • Observability: RLMState, events, traces, sessions, the viewer.
  • Control: step loop, checkpoint, fork, rewind, intervene, custom prompts / state / tools.
  • Runtimes: Runtime protocol, Local / Subprocess / Docker / Modal, writing your own.
  • Security: trust model, Docker isolation knobs, approval gates.

References

  • Recursive Language Models — the original RLM paper and implementation.
  • rlm-minimal — the single-file reference rlmkit grew from.
  • ypi — recursive coding agent built on Pi. Our session layout and much of the default prompt (size-up → delegate → combine, guardrails, aggressive delegation) come from ypi's SYSTEM_PROMPT.md.

License

See LICENSE.

Citation

@misc{sudhakaran2025rlmkit,
  author = {Sudhakaran, Shyam},
  title = {rlmkit},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/shyamsn97/rlmkit}},
}

