Recursive Language Model - Python 3.11+ implementation

RLM - Recursive Language Model

A modern Python 3.11+ implementation of the Recursive Language Model paradigm from MIT CSAIL research (arXiv:2512.24601).

What is RLM?

Unlike traditional RAG (Retrieval-Augmented Generation), RLM treats the document context as an external variable in a Python REPL environment. The LLM never sees the full document; instead, it writes Python code to:

  1. Inspect the document (len(CONTEXT), CONTEXT[:1000])
  2. Search it (re.findall(r'pattern', CONTEXT))
  3. Chunk it and process recursively (llm_query(f"Summarize: {chunk}"))
  4. Synthesize results and return a final answer (FINAL(answer))

This approach enables processing of documents far exceeding typical context windows while maintaining adaptive, task-specific exploration strategies.
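Concretely, a single turn of LLM-emitted code might look like the sketch below. Here `CONTEXT`, `llm_query`, and `FINAL` are stand-in stubs for illustration only; in the real REPL they are injected by the environment.

```python
import re

# Stubs for the names the REPL injects; for illustration only.
CONTEXT = "Chapter 1: Storage. ... Chapter 2: Indexing. ..."

def llm_query(prompt: str) -> str:
    # Real version calls the (cheaper) sub-LLM and returns its reply.
    return f"summary of {len(prompt)} chars"

def FINAL(answer: str) -> None:
    # Real version stops the iteration loop and records the answer.
    print(answer)

# 1. Inspect: how big is the document?
doc_len = len(CONTEXT)

# 2. Search: find chapter headings without reading everything
headings = re.findall(r"Chapter \d+: \w+", CONTEXT)

# 3. Chunk and recurse: hand each chapter to the sub-model
chunks = [c for c in re.split(r"(?=Chapter \d+:)", CONTEXT) if c.strip()]
summaries = [llm_query(f"Summarize: {c}") for c in chunks]

# 4. Synthesize and return
FINAL("Main themes: " + "; ".join(headings))
```

The point is that only the small artifacts the code extracts (lengths, matches, sub-summaries) ever enter the model's context window, not the document itself.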

Architecture

┌─────────────────────────────────────────────────────────┐
│                      User Query                          │
└─────────────────────┬───────────────────────────────────┘
                      ▼
┌─────────────────────────────────────────────────────────┐
│                    RLM Orchestrator                      │
│  • Manages iteration loop                                │
│  • Parses code blocks from LLM response                  │
│  • Detects FINAL() answers                               │
└─────────────────────┬───────────────────────────────────┘
                      ▼
┌─────────────────────────────────────────────────────────┐
│                    Root LLM                              │
│  • Receives query + system prompt (NOT the context!)     │
│  • Generates Python code to explore CONTEXT              │
│  • Calls llm_query() for sub-processing                  │
└─────────────────────┬───────────────────────────────────┘
                      ▼
┌─────────────────────────────────────────────────────────┐
│                    REPL Environment                      │
│  • CONTEXT variable (holds the massive text)             │
│  • llm_query(prompt) function for recursive calls        │
│  • FINAL(answer) / FINAL_VAR(var) for returning results  │
│  • Isolation delegated to container runtime               │
└─────────────────────────────────────────────────────────┘

Installation

pip install -r requirements.txt

Requirements:

  • Python 3.11+ (tested with 3.13+)
  • anthropic>=0.39.0 (for Anthropic API backend)
  • openai>=1.0.0 (optional, for local models via Ollama/vLLM)

Quick Start

Using Anthropic API

from rlm import RLM
from rlm.backends import AnthropicBackend

# Initialize with API key (or set ANTHROPIC_API_KEY env var)
backend = AnthropicBackend()
rlm = RLM(
    backend,
    model="claude-sonnet-4-20250514",
    recursive_model="claude-haiku-3-20250813",  # Cheaper model for sub-calls
    verbose=True,
)

# Process a large document
with open("large_document.txt") as f:
    context = f.read()

result = rlm.completion(
    context=context,
    query="What are the main themes discussed in this document?"
)

print(result.answer)
print(rlm.cost_summary())

Using OpenAI

OPENAI_API_KEY=... uvx --with openai rlm --backend openai --context-file doc.txt --query "Summarize"

Using OpenRouter

OPENROUTER_API_KEY=... uvx --with openai rlm --backend openrouter --context-file doc.txt --query "Summarize"

Using Hugging Face

HF_TOKEN=... uvx --with openai rlm --backend huggingface --context-file doc.txt --query "Summarize"

Using Local Models (Ollama)

from rlm import RLM
from rlm.backends import OpenAICompatibleBackend

backend = OpenAICompatibleBackend(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)
rlm = RLM(backend, model="llama3.2", verbose=True)

result = rlm.completion(context=doc, query="Summarize this document")

Demo

# With Anthropic API (set ANTHROPIC_API_KEY)
python demo.py --verbose

# With Ollama
python demo.py --backend ollama --model llama3.2 --verbose

# Custom context file
python demo.py --context-file /path/to/document.txt --query "Your question here"

API Reference

RLM Class

RLM(
    backend: LLMBackend,           # Backend instance
    model: str = "claude-sonnet-4-20250514",  # Root LLM model
    recursive_model: str | None = None,  # Model for llm_query (defaults to model)
    max_iterations: int = 10,      # Max REPL iterations
    max_depth: int = 3,            # Max recursion depth for llm_query
    verbose: bool = False,         # Print debug output
    compact_prompt: bool = False,  # Use shorter system prompt
)

Methods:

  • completion(context: str, query: str) -> RLMResult - Sync completion
  • acompletion(context: str, query: str) -> RLMResult - Async completion
  • cost_summary() -> dict - Get usage statistics

RLMResult

@dataclass
class RLMResult:
    answer: str              # Final answer
    stats: RLMStats          # Statistics
    history: list[dict]      # Iteration history
    success: bool            # Whether completion succeeded
    error: str | None        # Error message if failed

Backends

Backend                  Use Case                                              Default Model
AnthropicBackend         Direct Anthropic API                                  claude-sonnet-4-20250514
OpenAICompatibleBackend  OpenAI, OpenRouter, Hugging Face, Ollama, vLLM, etc.  (varies by preset)

CLI Backend Presets

--backend    Provider        API Key Env Var     Default Model
anthropic    Anthropic       ANTHROPIC_API_KEY   claude-sonnet-4-20250514
openai       OpenAI          OPENAI_API_KEY      gpt-4o
openrouter   OpenRouter      OPENROUTER_API_KEY  anthropic/claude-sonnet-4
huggingface  Hugging Face    HF_TOKEN            Qwen/Qwen2.5-Coder-32B-Instruct
ollama       Ollama (local)  (none)              qwen3-coder:32b
claude       Claude CLI      (none)              claude-sonnet-4-20250514

REPL Environment

The REPL provides these to the LLM:

Name                                    Type      Description
CONTEXT                                 str       The full document (never print it directly!)
llm_query(prompt)                       function  Call a sub-LLM; returns a string
FINAL(answer)                           function  Set the final answer and finish
FINAL_VAR(var_name)                     function  Return a REPL variable's value as the final answer
re, json, math, collections, itertools  modules   Pre-imported standard-library modules

Isolation: The REPL is designed to run inside a rootless container. No in-process sandboxing is applied.
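To make the difference between FINAL and FINAL_VAR concrete, here is one way a REPL namespace could resolve them. This is a sketch, not the project's actual repl.py; `make_env` is a hypothetical helper.

```python
def make_env(context: str):
    # Build a REPL-style namespace with FINAL / FINAL_VAR resolution (sketch).
    result: dict[str, str] = {}
    env: dict = {"CONTEXT": context}
    # FINAL records a literal answer; the first call wins.
    env["FINAL"] = lambda answer: result.setdefault("answer", str(answer))
    # FINAL_VAR looks the variable up in the namespace the LLM's code ran in,
    # so a large value never has to pass through the model's own output.
    env["FINAL_VAR"] = lambda name: result.setdefault("answer", str(env[name]))
    return env, result

env, result = make_env("big doc")
exec("summary = CONTEXT.upper()\nFINAL_VAR('summary')", env)
# result["answer"] is now "BIG DOC"
```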

Project Structure

spike-claude-code-rlm/
├── rlm/
│   ├── __init__.py      # Package exports
│   ├── repl.py          # REPL environment (REPLEnv, REPLResult)
│   ├── backends.py      # LLM backends (Anthropic, OpenAI-compatible)
│   ├── prompts.py       # System prompts for root LLM
│   └── rlm.py           # Core orchestrator (RLM, RLMResult, RLMStats)
├── demo.py              # CLI demo with multiple backends
├── sample_data/
│   └── large_document.txt
├── requirements.txt
├── pyproject.toml
├── README.md
└── LICENSE

Features

Python 3.11+ Modern Implementation

  • Type hints with modern syntax (compatible with 3.13+)
  • Dataclasses for clean data structures
  • Async support (acompletion)

Multiple LLM Backends

  • Anthropic (Claude)
  • OpenAI-compatible (Ollama, vLLM, etc.)
  • Claude CLI backend

Container-Isolated Execution

  • Designed for rootless container runtimes
  • No in-process sandbox overhead
  • Full Python stdlib available to LLM-generated code

Recursive Processing

  • Configurable recursion depth
  • Separate models for root/recursive calls
  • Cost tracking and statistics

License

BSD 2-Clause License - See LICENSE file for details.

Based on research by Alex L. Zhang, Tim Kraska, and Omar Khattab (MIT CSAIL).
