A state-machine framework for recursive LLM agents in ~1,200 lines of Python.

rlmkit

A minimal state-machine library for Recursive Language Model agents. Every agent — root and descendants — advances one step at a time, and the entire computation tree is a single immutable, serializable object at every step boundary.

Install

pip install rlmkit               # core
pip install rlmkit[openai]       # + OpenAI client
pip install rlmkit[anthropic]    # + Anthropic client
pip install rlmkit[viewer]       # + Gradio viewer
pip install rlmkit[all]          # all of the above

From source:

git clone https://github.com/shyamsn97/rlmkit && cd rlmkit
pip install -e .

Quick start

from rlmkit import RLM, RLMConfig, OpenAIClient
from rlmkit.runtime.local import LocalRuntime
from rlmkit.tools import FILE_TOOLS
from rlmkit.utils.viewer import save_trace, open_viewer

runtime = LocalRuntime(workspace="./myproject")
runtime.register_tools(FILE_TOOLS)

agent = RLM(
    llm_client=OpenAIClient("gpt-5"),
    runtime=runtime,
    config=RLMConfig(max_depth=3, max_iterations=15, session="context"),
)

query = "Build a python text-based adventure game with combat and inventory."
states = [agent.start(query)]
while not states[-1].finished:
    states.append(agent.step(states[-1]))
    print(states[-1].tree())

save_trace(states, "traces/run1", query=query)
open_viewer(states, query=query)

Examples

All examples share the same CLI flags — --no-viz, --docker-image rlmkit:local, --max-depth, etc. See examples/README.md.

Example                  What it shows
showcase.py              Checkpoint, fork, session persistence, time travel, intervention: the full API tour.
drop_in_llm.py           RLM as an LLMClient. Nested agents.
coding-agent/agent.py    Interactive coding agent that writes and edits files.
needle_haystack.py       Needle-in-a-haystack across 500 files with custom tools and runtime_factory.
summarizer.py            Recursive map-reduce over a 10k-line document.
view_demo.py             Launch the Gradio viewer on a saved trace.

Overview

RLM implements LLMClient — it's a drop-in replacement for any LLM. Call chat(messages) or run(query) and it runs a full recursive agent loop underneath. Swap your LLM for an RLM and get delegation, parallel sub-agents, and a code REPL for free.

def ask(llm: LLMClient, q: str) -> str:
    return llm.chat([{"role": "user", "content": q}])

ask(OpenAIClient("gpt-4o-mini"), "2+2?")             # one LLM call
ask(RLM(llm_client=..., runtime=...), "2+2?")        # full agent, same return type

Nest agents by passing one RLM as another's llm_client.
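
To make the drop-in idea concrete, here is a minimal, self-contained sketch of the pattern. It does not use rlmkit itself: `ChatClient`, `EchoClient`, and `ToyAgent` are hypothetical stand-ins, and the only assumption carried over from the text above is that a client exposes `chat(messages)` and returns a string.

```python
from typing import Protocol


class ChatClient(Protocol):
    """Hypothetical stand-in for an LLMClient-style interface."""

    def chat(self, messages: list[dict]) -> str: ...


class EchoClient:
    """Toy 'model' that returns the last message in upper case."""

    def chat(self, messages: list[dict]) -> str:
        return messages[-1]["content"].upper()


class ToyAgent:
    """Toy agent that wraps a client and exposes the same chat() method."""

    def __init__(self, llm_client: ChatClient, name: str) -> None:
        self.llm_client = llm_client
        self.name = name

    def chat(self, messages: list[dict]) -> str:
        # A real agent would loop, run code, and delegate. This one only
        # forwards the call and tags the reply so the nesting is visible.
        return f"{self.name}({self.llm_client.chat(messages)})"


def ask(llm: ChatClient, q: str) -> str:
    return llm.chat([{"role": "user", "content": q}])


plain = ask(EchoClient(), "hi")                  # plain client
inner = ToyAgent(EchoClient(), "inner")          # agent wrapping a client
nested = ask(ToyAgent(inner, "outer"), "hi")     # agent wrapping an agent
```

Because `ask` depends only on the `chat` method, the caller cannot tell a plain client from an agent or from an agent wrapping another agent, which is what makes nesting possible.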

The tree is a state machine. Every agent advances one step at a time, and the full computation is one object you can inspect, checkpoint, fork, or serialize at any step boundary:

root [supervising] iter 5
├── root.scanner_auth [finished] iter 3 → "Found SQL injection in login.py"
│   ├── root.scanner_auth.chunk_0 [finished] iter 2 → "No issues"
│   ├── root.scanner_auth.chunk_1 [finished] iter 2 → "SQL injection on line 42"
│   └── root.scanner_auth.chunk_2 [finished] iter 2 → "No issues"
├── root.scanner_api [supervising] iter 3
│   ├── root.scanner_api.chunk_0 [ready] iter 1
│   ├── root.scanner_api.chunk_1 [finished] iter 2 → "Clean"
│   │   └── root.scanner_api.chunk_1.deep_scan [finished] iter 2 → "Payment flow is safe"
│   └── root.scanner_api.chunk_2 [finished] iter 2 → "Clean"
└── root.scanner_db [finished] iter 2 → "No issues found"
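
The sketch below illustrates what "one immutable, serializable object" can look like in plain Python. It is not rlmkit's state type: `Node`, `render`, and `from_dict` are hypothetical names, and the field layout is assumed from the tree printed above.

```python
import json
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical agent node; the library's real state type will differ."""

    agent_id: str
    status: str
    iteration: int
    result: Optional[str] = None
    children: tuple["Node", ...] = ()


def render(node: Node, prefix: str = "", branch: str = "") -> list[str]:
    """Return the tree as a list of lines in the layout shown above."""
    line = f"{prefix}{branch}{node.agent_id} [{node.status}] iter {node.iteration}"
    if node.result is not None:
        line += f' → "{node.result}"'
    lines = [line]
    if branch == "":
        child_prefix = prefix              # root: children start at column 0
    elif branch == "└── ":
        child_prefix = prefix + "    "     # last child: no vertical rule below
    else:
        child_prefix = prefix + "│   "     # more siblings follow
    for i, child in enumerate(node.children):
        last = i == len(node.children) - 1
        lines.extend(render(child, child_prefix, "└── " if last else "├── "))
    return lines


def from_dict(data: dict) -> Node:
    """Rebuild a Node tree from the dict form produced by asdict()."""
    children = tuple(from_dict(c) for c in data["children"])
    return Node(**{**data, "children": children})


root = Node("root", "supervising", 5, children=(
    Node("root.a", "finished", 2, "ok"),
    Node("root.b", "ready", 1),
))

payload = json.dumps(asdict(root))        # checkpoint: the whole tree as JSON
restored = from_dict(json.loads(payload))
forked = replace(root, iteration=6)       # fork: a new object, root untouched
```

Since nodes are frozen, a checkpoint is just a reference to an earlier object, and a fork is a modified copy that leaves the original intact.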

Each step(state) -> state' is one atomic transition:

        step_llm()              step_exec()
READY ─────────────> EXECUTING ─────────────> SUPERVISING
  ^                      |                        |
  |                   done()                step_children()
  |                      |                   (one batch)
  |                      v                        |
  |                  FINISHED <── resume_exec() ──┤
  |                      ^                        |
  +── yields again ──────┘                  children not done
  • READY — queued for the next LLM call.
  • EXECUTING — LLM returned a ```repl ``` block; the engine runs it.
  • SUPERVISING — code called delegate() + yield wait(); children are running. Each step() advances children by one batch.
  • FINISHED — code called done(result).
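
As a rough illustration, the status changes can be written as a pure function. This is one plausible reading of the diagram, not the engine's actual logic: `Status`, `next_status`, and the `outcome` values ("done", "wait", "continue") are hypothetical, and returning to READY when the code neither finishes nor waits is an assumption.

```python
from enum import Enum


class Status(Enum):
    READY = "ready"
    EXECUTING = "executing"
    SUPERVISING = "supervising"
    FINISHED = "finished"


def _after_exec(outcome: str) -> Status:
    """Map what the executed code did to the agent's next status."""
    if outcome == "done":
        return Status.FINISHED        # code called done(result)
    if outcome == "wait":
        return Status.SUPERVISING     # code delegated and yielded wait()
    return Status.READY               # assumption: queue another LLM call


def next_status(status: Status, outcome: str = "continue",
                children_done: bool = False) -> Status:
    """Return the status after one step, given what happened in that step."""
    if status is Status.READY:
        return Status.EXECUTING       # the LLM returned a repl block
    if status is Status.EXECUTING:
        return _after_exec(outcome)
    if status is Status.SUPERVISING:
        if not children_done:
            return Status.SUPERVISING  # children advance one more batch
        return _after_exec(outcome)    # resume the paused code
    return Status.FINISHED             # terminal: nothing left to do
```

Keeping the transition a pure function of the current status and the step's outcome is what lets a driver loop call it repeatedly and stop at any step boundary.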

Delegation:

h1 = delegate("searcher", "Find all TODOs in src/")
h2 = delegate("searcher", "Find all FIXMEs in src/")   # auto-suffixed
results = yield wait(h1, h2)
done(f"Found {len(results)} batches")
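
The `yield wait(...)` form suggests the agent body runs as a Python generator that pauses until its children report back. The sketch below shows that general mechanism with standard generators only. It is not rlmkit's implementation: `Handle`, `run`, and `solve` are hypothetical, the toy `delegate` and `wait` only mimic the names used above, children are resolved inline rather than stepped in batches, and a plain `return` stands in for `done()`.

```python
from dataclasses import dataclass
from typing import Callable, Generator


@dataclass(frozen=True)
class Handle:
    """Hypothetical reference to a delegated task."""

    name: str
    task: str


def delegate(name: str, task: str) -> Handle:
    return Handle(name, task)


def wait(*handles: Handle) -> tuple[Handle, ...]:
    """Package the handles so the driver knows what to resolve."""
    return tuple(handles)


def body() -> Generator[tuple[Handle, ...], list[str], str]:
    h1 = delegate("searcher", "Find all TODOs in src/")
    h2 = delegate("searcher", "Find all FIXMEs in src/")
    results = yield wait(h1, h2)          # pause here until children finish
    return f"Found {len(results)} batches"


def run(agent_body: Callable[[], Generator], solve: Callable[[str], str]) -> str:
    """Drive a generator-style body to completion."""
    gen = agent_body()
    try:
        pending = next(gen)               # run up to the first yield
        while True:
            results = [solve(h.task) for h in pending]  # stand-in for children
            pending = gen.send(results)   # resume the body with the results
    except StopIteration as stop:
        return stop.value                 # the generator's return value


answer = run(body, solve=lambda task: f"result for: {task}")
```

The point of the generator form is that the body is suspended at the `yield`, so a driver can do any amount of work between pausing and resuming it.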

Docs

  • Positioning: when to use rlmkit, when not to.
  • Observability: RLMState, events, traces, sessions, the viewer.
  • Control: step loop, checkpoint, fork, rewind, intervene, custom prompts / state / tools.
  • Runtimes: Runtime protocol, Local / Subprocess / Docker / Modal, writing your own.
  • Security: trust model, Docker isolation knobs, approval gates.
  • Changelog.

References

  • Recursive Language Models — the original RLM paper and implementation.
  • rlm-minimal — the single-file reference rlmkit grew from.
  • ypi — recursive coding agent built on Pi. Our session layout and much of the default prompt (size-up → delegate → combine, guardrails, aggressive delegation) come from ypi's SYSTEM_PROMPT.md.

License

See LICENSE.

Citation

@misc{sudhakaran2025rlmkit,
  author = {Sudhakaran, Shyam},
  title = {rlmkit},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/shyamsn97/rlmkit}},
}
