RLM - Recursive Language Model
A modern Python 3.11+ implementation of the Recursive Language Model paradigm from MIT CSAIL research (arXiv:2512.24601).
What is RLM?
Unlike traditional RAG (Retrieval-Augmented Generation), RLM treats document context as an external variable in a Python REPL environment. The LLM doesn't see the full document - instead, it writes Python code to:
- Inspect the document (`len(CONTEXT)`, `CONTEXT[:1000]`)
- Search it (`re.findall(r'pattern', CONTEXT)`)
- Chunk it and process recursively (`llm_query(f"Summarize: {chunk}")`)
- Synthesize results and return a final answer (`FINAL(answer)`)
This approach enables processing of documents far exceeding typical context windows while maintaining adaptive, task-specific exploration strategies.
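To make this concrete, here is the kind of code the root LLM might emit across a few REPL iterations (an illustrative sketch, not captured model output; `CONTEXT`, `llm_query`, and `FINAL` are the REPL built-ins described under Architecture below):

```python
# Iteration 1: orient before committing to a strategy
print(len(CONTEXT))   # how big is the document?
print(CONTEXT[:500])  # skim the opening

# Iteration 2: search instead of reading everything
import re
mentions = re.findall(r"(?i).{0,80}revenue.{0,80}", CONTEXT)
print(len(mentions), mentions[:5])

# Iteration 3: delegate a focused sub-question, then finish
summary = llm_query(f"Summarize these excerpts:\n{mentions[:20]}")
FINAL(summary)
```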
Architecture
```
┌─────────────────────────────────────────────────────────┐
│                       User Query                        │
└─────────────────────┬───────────────────────────────────┘
                      ▼
┌─────────────────────────────────────────────────────────┐
│                    RLM Orchestrator                     │
│  • Manages iteration loop                               │
│  • Parses code blocks from LLM response                 │
│  • Detects FINAL() answers                              │
└─────────────────────┬───────────────────────────────────┘
                      ▼
┌─────────────────────────────────────────────────────────┐
│                        Root LLM                         │
│  • Receives query + system prompt (NOT the context!)    │
│  • Generates Python code to explore CONTEXT             │
│  • Calls llm_query() for sub-processing                 │
└─────────────────────┬───────────────────────────────────┘
                      ▼
┌─────────────────────────────────────────────────────────┐
│                    REPL Environment                     │
│  • CONTEXT variable (holds the massive text)            │
│  • llm_query(prompt) function for recursive calls       │
│  • FINAL(answer) / FINAL_VAR(var) for returning results │
│  • Isolation delegated to container runtime             │
└─────────────────────────────────────────────────────────┘
```
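The control flow above boils down to a small loop. Here is a minimal sketch of that loop with hypothetical names (the real orchestrator lives in `rlm/rlm.py` and its internal interfaces may differ):

```python
import re

CODE_BLOCK = re.compile(r"```python\n(.*?)```", re.DOTALL)

def run_rlm(backend, repl, query: str, max_iterations: int = 10) -> str:
    """Drive the root LLM until it calls FINAL() or iterations run out."""
    messages = [{"role": "user", "content": query}]
    for _ in range(max_iterations):
        reply = backend.complete(messages)  # hypothetical backend call
        messages.append({"role": "assistant", "content": reply})
        for code in CODE_BLOCK.findall(reply):
            output = repl.execute(code)  # run inside the REPL environment
            if repl.final_answer is not None:  # FINAL()/FINAL_VAR() was called
                return repl.final_answer
            # Feed captured stdout back so the LLM can plan its next step
            messages.append({"role": "user", "content": f"Output:\n{output}"})
    raise RuntimeError("No FINAL() answer within max_iterations")
```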
Installation
```bash
pip install -r requirements.txt
```
Requirements:
- Python 3.11+ (tested with 3.13+)
- `anthropic>=0.39.0` (for the Anthropic API backend)
- `openai>=1.0.0` (optional, for local models via Ollama/vLLM)
Quick Start
Using Anthropic API
```python
from rlm import RLM
from rlm.backends import AnthropicBackend

# Initialize with API key (or set ANTHROPIC_API_KEY env var)
backend = AnthropicBackend()
rlm = RLM(
    backend,
    model="claude-sonnet-4-20250514",
    recursive_model="claude-haiku-3-20250813",  # Cheaper model for sub-calls
    verbose=True,
)

# Process a large document
with open("large_document.txt") as f:
    context = f.read()

result = rlm.completion(
    context=context,
    query="What are the main themes discussed in this document?",
)
print(result.answer)
print(rlm.cost_summary())
```
Using OpenAI

```bash
OPENAI_API_KEY=... uvx --with openai rlm --backend openai --context-file doc.txt --query "Summarize"
```

Using OpenRouter

```bash
OPENROUTER_API_KEY=... uvx --with openai rlm --backend openrouter --context-file doc.txt --query "Summarize"
```

Using Hugging Face

```bash
HF_TOKEN=... uvx --with openai rlm --backend huggingface --context-file doc.txt --query "Summarize"
```
Using Local Models (Ollama)
```python
from rlm import RLM
from rlm.backends import OpenAICompatibleBackend

backend = OpenAICompatibleBackend(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)
rlm = RLM(backend, model="llama3.2", verbose=True)

with open("large_document.txt") as f:
    doc = f.read()

result = rlm.completion(context=doc, query="Summarize this document")
```
Demo
```bash
# With Anthropic API (set ANTHROPIC_API_KEY)
python demo.py --verbose

# With Ollama
python demo.py --backend ollama --model llama3.2 --verbose

# Custom context file
python demo.py --context-file /path/to/document.txt --query "Your question here"
```
API Reference
RLM Class
```python
RLM(
    backend: LLMBackend,                      # Backend instance
    model: str = "claude-sonnet-4-20250514",  # Root LLM model
    recursive_model: str = None,              # Model for llm_query (defaults to model)
    max_iterations: int = 10,                 # Max REPL iterations
    max_depth: int = 3,                       # Max recursion depth for llm_query
    verbose: bool = False,                    # Print debug output
    compact_prompt: bool = False,             # Use shorter system prompt
)
```
Methods:
- `completion(context: str, query: str) -> RLMResult` - Sync completion
- `acompletion(context: str, query: str) -> RLMResult` - Async completion
- `cost_summary() -> dict` - Get usage statistics
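For example, the async variant can be driven with `asyncio` (a sketch reusing the Quick Start setup):

```python
import asyncio

from rlm import RLM
from rlm.backends import AnthropicBackend

async def main() -> None:
    rlm = RLM(AnthropicBackend())
    with open("large_document.txt") as f:
        context = f.read()
    # acompletion mirrors completion, awaiting the underlying LLM calls
    result = await rlm.acompletion(context=context, query="List the key findings.")
    print(result.answer)

asyncio.run(main())
```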
RLMResult
```python
@dataclass
class RLMResult:
    answer: str          # Final answer
    stats: RLMStats      # Statistics
    history: list[dict]  # Iteration history
    success: bool        # Whether completion succeeded
    error: str | None    # Error message if failed
```
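A typical way to consume the result, continuing from the Quick Start example (a sketch; the exact keys inside `history` entries are implementation-defined, so inspect them before relying on specific fields):

```python
result = rlm.completion(context=context, query="Who authored the report?")

if result.success:
    print(result.answer)
else:
    print(f"RLM failed: {result.error}")

# Each history entry records one REPL iteration
for i, step in enumerate(result.history):
    print(i, step)
```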
Backends
| Backend | Use Case | Default Model |
|---|---|---|
| `AnthropicBackend` | Direct Anthropic API | `claude-sonnet-4-20250514` |
| `OpenAICompatibleBackend` | OpenAI, OpenRouter, Hugging Face, Ollama, vLLM, etc. | (varies by preset) |
CLI Backend Presets
| `--backend` | Provider | API Key Env Var | Default Model |
|---|---|---|---|
| `anthropic` | Anthropic | `ANTHROPIC_API_KEY` | `claude-sonnet-4-20250514` |
| `openai` | OpenAI | `OPENAI_API_KEY` | `gpt-4o` |
| `openrouter` | OpenRouter | `OPENROUTER_API_KEY` | `anthropic/claude-sonnet-4` |
| `huggingface` | Hugging Face | `HF_TOKEN` | `Qwen/Qwen2.5-Coder-32B-Instruct` |
| `ollama` | Ollama (local) | (none) | `qwen3-coder:32b` |
| `claude` | Claude CLI | (none) | `claude-sonnet-4-20250514` |
REPL Environment
The REPL provides these to the LLM:
| Name | Type | Description |
|---|---|---|
| `CONTEXT` | str | The full document (never print directly!) |
| `llm_query(prompt)` | function | Call sub-LLM, returns string |
| `FINAL(answer)` | function | Set final answer and complete |
| `FINAL_VAR(var_name)` | function | Set variable as final answer |
| `re`, `json`, `math`, `collections`, `itertools` | modules | Pre-imported |
Isolation: The REPL is designed to run inside a rootless container. No in-process sandboxing is applied.
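Put together, these primitives let the root LLM run a chunked map-reduce over a document far larger than any context window (illustrative REPL code; the 8,000-character chunk size is an arbitrary choice):

```python
# Map: summarize each ~8k-character chunk with a recursive sub-call
chunks = [CONTEXT[i:i + 8000] for i in range(0, len(CONTEXT), 8000)]
partials = [llm_query(f"Summarize this excerpt:\n{c}") for c in chunks]

# Reduce: synthesize the partial summaries into a single answer
synthesis = llm_query("Merge these summaries into one:\n" + "\n".join(partials))

# Return the variable's value as the final answer
FINAL_VAR("synthesis")
```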
Project Structure
```
spike-claude-code-rlm/
├── rlm/
│   ├── __init__.py     # Package exports
│   ├── repl.py         # REPL environment (REPLEnv, REPLResult)
│   ├── backends.py     # LLM backends (Anthropic, OpenAI-compatible)
│   ├── prompts.py      # System prompts for root LLM
│   └── rlm.py          # Core orchestrator (RLM, RLMResult, RLMStats)
├── demo.py             # CLI demo with multiple backends
├── sample_data/
│   └── large_document.txt
├── requirements.txt
├── pyproject.toml
├── README.md
└── LICENSE
```
Features
✅ Python 3.11+ Modern Implementation
- Type hints with modern syntax (compatible with 3.13+)
- Dataclasses for clean data structures
- Async support (acompletion)
✅ Multiple LLM Backends
- Anthropic (Claude)
- OpenAI-compatible (Ollama, vLLM, etc.)
- Claude CLI backend
✅ Container-Isolated Execution
- Designed for rootless container runtimes
- No in-process sandbox overhead
- Full Python stdlib available to LLM-generated code
✅ Recursive Processing
- Configurable recursion depth
- Separate models for root/recursive calls
- Cost tracking and statistics
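For instance, recursion and spend can be bounded at construction time (a sketch using the constructor parameters documented above; the keys returned by `cost_summary()` are not specified here):

```python
from rlm import RLM
from rlm.backends import AnthropicBackend

rlm = RLM(
    AnthropicBackend(),
    model="claude-sonnet-4-20250514",           # strong root model
    recursive_model="claude-haiku-3-20250813",  # cheaper model for sub-calls
    max_iterations=5,  # cap root REPL turns
    max_depth=2,       # cap nested llm_query() recursion
)
with open("large_document.txt") as f:
    context = f.read()
result = rlm.completion(context=context, query="Extract all dates mentioned.")
print(rlm.cost_summary())  # aggregate usage statistics
```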
References
- arXiv Paper: Recursive Language Models
- Alex Zhang's Blog Post
- Official Implementation
- MIT CSAIL OASYS Lab
License
BSD 2-Clause License - See LICENSE file for details.
Based on research by Alex L. Zhang, Tim Kraska, and Omar Khattab (MIT CSAIL).
Download files
File details
Details for the file rlm_loop-0.1.0.tar.gz.
File metadata
- Download URL: rlm_loop-0.1.0.tar.gz
- Upload date:
- Size: 157.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `ca137847b6893da71c971b436f2ce1cc6db93d5debb022978059e90e99f753a8` |
| MD5 | `05b23f30d9cb6a6eea1c3307126c3f38` |
| BLAKE2b-256 | `8f59e1524af421abe44634c809b7add0f34331194b076e3ffdc0f33161061d32` |
Provenance
The following attestation bundles were made for rlm_loop-0.1.0.tar.gz:
Publisher: `pypi-publish.yaml` on ondrasek/spike-claude-code-rlm

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rlm_loop-0.1.0.tar.gz
- Subject digest: ca137847b6893da71c971b436f2ce1cc6db93d5debb022978059e90e99f753a8
- Sigstore transparency entry: 976442716
- Sigstore integration time:
- Permalink: ondrasek/spike-claude-code-rlm@6c98f1930812245074ffcfe10a84702b7bfc544b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ondrasek
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yaml@6c98f1930812245074ffcfe10a84702b7bfc544b
- Trigger Event: workflow_dispatch
File details
Details for the file rlm_loop-0.1.0-py3-none-any.whl.
File metadata
- Download URL: rlm_loop-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `a40a34aa6a9592c3b29c5efe9466417fd983170d1f999c665db314a4f645ca01` |
| MD5 | `a193182d68e21fe4248659d61102023d` |
| BLAKE2b-256 | `3819d0832b6b8ae85f0222a3330556d3cb766a437e00b28e53ef4648924b1119` |
Provenance
The following attestation bundles were made for rlm_loop-0.1.0-py3-none-any.whl:
Publisher: `pypi-publish.yaml` on ondrasek/spike-claude-code-rlm

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rlm_loop-0.1.0-py3-none-any.whl
- Subject digest: a40a34aa6a9592c3b29c5efe9466417fd983170d1f999c665db314a4f645ca01
- Sigstore transparency entry: 976442718
- Sigstore integration time:
- Permalink: ondrasek/spike-claude-code-rlm@6c98f1930812245074ffcfe10a84702b7bfc544b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ondrasek
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yaml@6c98f1930812245074ffcfe10a84702b7bfc544b
- Trigger Event: workflow_dispatch