RLM Runtime
Recursive Language Model runtime with sandboxed REPL execution.
RLM Runtime enables LLMs to recursively decompose tasks, execute real code in isolated environments, and retrieve context on demand. Instead of simulating computation in tokens, the model executes actual code—cheaper, more reliable, and auditable.
Features
- Recursive Completion - LLMs can spawn sub-calls, execute code, and aggregate results
- Sandboxed REPL - Local (RestrictedPython) or Docker isolation
- Multi-Provider - OpenAI, Anthropic, and 100+ providers via LiteLLM
- Streaming - Real-time token streaming for simple completions
- Trajectory Logging - Full execution traces in JSONL format
- Trajectory Visualizer - Interactive Streamlit dashboard for debugging
- MCP Server - Claude Desktop/Code integration with multi-project support
- Plugin System - Extend with custom tools
- Snipara Integration - Optional context optimization (recommended)
Installation
# Basic install
pip install rlm-runtime
# With Docker support (recommended)
pip install rlm-runtime[docker]
# With MCP server (for Claude Desktop/Code)
pip install rlm-runtime[mcp]
# With Snipara context optimization
pip install rlm-runtime[snipara]
# Full install
pip install rlm-runtime[all]
Claude Desktop / Claude Code Integration
RLM Runtime includes an MCP server that provides sandboxed Python execution to Claude. Zero API keys required - designed to work within Claude Code's billing.
Architecture
Claude Code (LLM + billing included)
│
├── rlm-runtime-mcp (code sandbox - no API keys)
│ ├── execute_python
│ ├── get_repl_context
│ ├── set_repl_context
│ └── clear_repl_context
│
└── snipara-mcp (optional, OAuth auth)
└── search_context
Setup
1. Install with MCP support:

   pip install rlm-runtime[mcp]

2. Add to your Claude configuration:

   Claude Desktop (~/.claude/claude_desktop_config.json):

   {
     "mcpServers": {
       "rlm": { "command": "rlm", "args": ["mcp-serve"] }
     }
   }

   Claude Code (via MCP settings):

   {
     "mcpServers": {
       "rlm": { "command": "rlm", "args": ["mcp-serve"] }
     }
   }

3. Restart Claude Desktop or reload Claude Code.
Available MCP Tools
| Tool | Description |
|---|---|
| execute_python | Run Python code in a sandboxed environment |
| get_repl_context | Get current REPL context variables |
| set_repl_context | Set a variable in the REPL context |
| clear_repl_context | Clear all REPL context |
With Snipara (Optional)
For context retrieval, add snipara-mcp alongside rlm-runtime:
{
"mcpServers": {
"rlm": {
"command": "rlm",
"args": ["mcp-serve"]
},
"snipara": {
"command": "snipara-mcp-server"
}
}
}
Authenticate with OAuth (no API key copying):
pip install snipara-mcp
snipara-mcp-login # Opens browser for authentication
snipara-mcp-status # Check auth status
See MCP Integration Guide for details.
Example Usage in Claude
Once configured, Claude can execute Python in a secure sandbox:
User: Calculate the fibonacci sequence up to n=20
Claude: I'll use the execute_python tool to calculate this.
[execute_python]
def fib(n):
    a, b = 0, 1
    result = []
    while a <= n:
        result.append(a)
        a, b = b, a + b
    return result

result = fib(20)
print(result)
Output: [0, 1, 1, 2, 3, 5, 8, 13]
Quick Start
CLI
# Initialize config
rlm init
# Run a completion
rlm run "Count the lines in data.csv and show the top 5 rows"
# Run with Docker isolation
rlm run --env docker "Parse the JSON files and extract all emails"
# Verbose mode (shows trajectory)
rlm run -v "Explain the authentication flow in this codebase"
Python API
import asyncio
from rlm import RLM

async def main():
    # Basic usage
    rlm = RLM(model="gpt-4o-mini", environment="local")
    result = await rlm.completion("Analyze data.csv and find outliers")
    print(result.response)
    print(f"Calls: {result.total_calls}, Tokens: {result.total_tokens}")

asyncio.run(main())
With Snipara Context Optimization
from rlm import RLM

rlm = RLM(
    model="gpt-4o-mini",
    environment="docker",
    snipara_api_key="rlm_...",
    snipara_project_slug="my-project",
)

# Snipara tools auto-registered: context_query, sections, search, etc.
result = await rlm.completion("Explain how authentication works in this project")
Configuration
Create rlm.toml in your project:
[rlm]
backend = "litellm"
model = "gpt-4o-mini"
environment = "docker" # "local" or "docker"
max_depth = 4
max_subcalls = 12
token_budget = 8000
verbose = false
# Docker settings
docker_image = "python:3.11-slim"
docker_cpus = 1.0
docker_memory = "512m"
# Snipara (optional but recommended)
snipara_api_key = "rlm_..."
snipara_project_slug = "your-project"
Or use environment variables:
export RLM_MODEL=gpt-4o-mini
export RLM_ENVIRONMENT=docker
export SNIPARA_API_KEY=rlm_...
export SNIPARA_PROJECT_SLUG=my-project
Environments
Local REPL
- Fastest iteration
- Uses RestrictedPython for sandboxing
- Limited isolation (no network/filesystem by default)
- Best for trusted inputs in development
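RLM's local sandbox is built on RestrictedPython, which rewrites the AST and guards attribute access; the stdlib-only sketch below is not that implementation, just the core idea it shares: executing code against a curated builtins table so that names like open are simply undefined.

```python
# Illustrative only: rlm-runtime's local REPL uses RestrictedPython,
# which does far more (AST rewriting, guarded attribute access).
# This sketch shows the basic idea of a curated builtins table.
SAFE_BUILTINS = {"print": print, "len": len, "range": range, "sum": sum}

def run_sandboxed(source: str) -> dict:
    scope = {"__builtins__": SAFE_BUILTINS}
    exec(compile(source, "<sandbox>", "exec"), scope)
    return scope

scope = run_sandboxed("total = sum(range(10))")
print(scope["total"])  # 45

try:
    run_sandboxed("open('/etc/passwd')")
except NameError as e:
    print("blocked:", e)  # 'open' is not in the builtins table
```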
Docker REPL
- Strong isolation in containers
- Configurable resource limits (CPU, memory)
- Network disabled by default
- Recommended for production and untrusted inputs
# Start with Docker
rlm run --env docker "Process untrusted user input..."
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ RLM Orchestrator │
│ • Manages recursion depth and token budgets │
│ • Coordinates LLM calls and tool execution │
├─────────────────────────────────────────────────────────────────┤
│ LLM Backends │ REPL Environments │
│ • LiteLLM (default) │ • Local (RestrictedPython) │
│ • OpenAI │ • Docker (isolated) │
│ • Anthropic │ │
├─────────────────────────────────────────────────────────────────┤
│ Tool Registry │
│ • Builtin: file_read, execute_code │
│ • Snipara: context_query, sections, search (optional) │
│ • Custom: your own tools │
└─────────────────────────────────────────────────────────────────┘
Custom Tools
from rlm import RLM
from rlm.tools import Tool

async def fetch_weather(city: str) -> dict:
    # Your implementation
    return {"city": city, "temp": 72}

weather_tool = Tool(
    name="get_weather",
    description="Get current weather for a city",
    parameters={
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"}
        },
        "required": ["city"]
    },
    handler=fetch_weather,
)

rlm = RLM(tools=[weather_tool])
Trajectory Logging
All completions emit JSONL trajectory logs:
# View recent logs
rlm logs
# View specific trajectory
rlm logs abc123-def456
# Logs location
ls ./logs/
Each event includes:
- trajectory_id - Unique ID for the request
- call_id - ID for this specific call
- parent_call_id - Parent call (for recursion)
- depth - Recursion depth
- prompt, response - Input/output
- tool_calls, tool_results - Tool usage
- token_usage, duration_ms - Metrics
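Because events are plain JSON objects, one per line, trajectory analysis needs only the standard library. A sketch using the field names above (the sample values are made up):

```python
import json

# Sample events using the documented field names; real log events
# contain more fields (prompt, response, tool_calls, ...).
log_lines = [
    '{"trajectory_id": "abc123", "call_id": "c1", "parent_call_id": null, "depth": 0, "token_usage": 1200}',
    '{"trajectory_id": "abc123", "call_id": "c2", "parent_call_id": "c1", "depth": 1, "token_usage": 300}',
    '{"trajectory_id": "abc123", "call_id": "c3", "parent_call_id": "c1", "depth": 1, "token_usage": 450}',
]

events = [json.loads(line) for line in log_lines]
total_tokens = sum(e["token_usage"] for e in events)
max_depth = max(e["depth"] for e in events)
children = [e["call_id"] for e in events if e["parent_call_id"] == "c1"]

print(total_tokens)  # 1950
print(max_depth)     # 1
print(children)      # ['c2', 'c3']
```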
Trajectory Visualizer
Debug and analyze execution trajectories with the interactive web UI:
# Install visualizer dependencies
pip install rlm-runtime[visualizer]
# Launch the dashboard
rlm visualize
# Custom log directory and port
rlm visualize --dir ./logs --port 8502
The visualizer provides:
- Execution Tree - Visual graph of recursive calls
- Token Charts - Input/output token usage per call
- Duration Analysis - Timing breakdown across calls
- Tool Distribution - Pie chart of tool call frequency
- Event Inspector - Detailed view of each call with prompts/responses
API Reference
RLM Class
from rlm import RLM

rlm = RLM(
    # Required
    model="gpt-4o-mini",             # LLM model identifier

    # Backend
    backend="litellm",               # "litellm", "openai", or "anthropic"
    api_key=None,                    # Provider API key (or use env vars)

    # Execution Environment
    environment="local",             # "local" or "docker"

    # Recursion Limits
    max_depth=4,                     # Max recursive depth
    max_subcalls=12,                 # Max total tool calls
    token_budget=8000,               # Token limit per completion

    # Docker Settings (when environment="docker")
    docker_image="python:3.11-slim",
    docker_cpus=1.0,
    docker_memory="512m",
    docker_network_disabled=True,

    # Tools
    tools=None,                      # List of custom Tool objects

    # Snipara Integration
    snipara_api_key=None,            # Snipara API key
    snipara_project_slug=None,       # Snipara project slug
    snipara_api_url=None,            # Custom API URL

    # Logging
    verbose=False,                   # Print execution details
    log_dir="./logs",                # Trajectory log directory
)
CompletionResult
result = await rlm.completion("Your prompt")
result.response # Final LLM response text
result.trajectory_id # Unique ID for this execution
result.total_calls # Total LLM calls made
result.total_tokens # Total tokens used
result.depth_reached # Max recursion depth reached
result.tool_calls # List of tool calls made
result.duration_ms # Total execution time
Tool Class
from rlm.backends.base import Tool

tool = Tool(
    name="tool_name",            # Unique tool identifier
    description="What the tool does",
    parameters={                 # JSON Schema for parameters
        "type": "object",
        "properties": {
            "param1": {"type": "string", "description": "..."},
        },
        "required": ["param1"]
    },
    handler=async_function,      # Async function to execute
)
Error Handling
from rlm import RLM
from rlm.core.exceptions import (
    RLMError,              # Base exception
    MaxDepthExceeded,      # Recursion limit hit
    TokenBudgetExhausted,  # Token limit hit
    REPLExecutionError,    # Code execution failed
    ToolNotFoundError,     # Unknown tool called
)

try:
    result = await rlm.completion("Complex task...")
except MaxDepthExceeded as e:
    print(f"Hit max depth at {e.depth}")
except TokenBudgetExhausted as e:
    print(f"Used {e.tokens_used} tokens, budget was {e.budget}")
except REPLExecutionError as e:
    print(f"Code failed: {e.stderr}")
Advanced Examples
Recursive Data Analysis
from rlm import RLM

rlm = RLM(
    model="claude-sonnet-4-20250514",
    environment="docker",
    max_depth=6,
)

# The LLM will recursively:
# 1. List CSV files
# 2. Read and analyze each one
# 3. Aggregate findings
# 4. Generate report
result = await rlm.completion("""
Analyze all CSV files in ./data/:
1. Find common columns across files
2. Calculate summary statistics for numeric columns
3. Identify any data quality issues
4. Generate a markdown report
""")
Code Generation with Context
from rlm import RLM

rlm = RLM(
    model="gpt-4o",
    snipara_api_key="rlm_...",
    snipara_project_slug="my-app",
)

# The LLM will:
# 1. Query Snipara for auth patterns
# 2. Execute code to explore existing structure
# 3. Generate new code following conventions
result = await rlm.completion("""
Add a password reset endpoint:
- Follow our existing auth patterns
- Use the same error handling conventions
- Add tests following our test patterns
""")

# Access the code that was written
print(result.response)
Streaming Output
from rlm import RLM

rlm = RLM(model="gpt-4o-mini")

async for chunk in rlm.stream("Explain quantum computing"):
    print(chunk, end="", flush=True)
Batch Processing
from rlm import RLM
import asyncio

rlm = RLM(model="gpt-4o-mini", environment="docker")

prompts = [
    "Analyze report_q1.csv",
    "Analyze report_q2.csv",
    "Analyze report_q3.csv",
]

# Run in parallel
results = await asyncio.gather(*[
    rlm.completion(p) for p in prompts
])
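asyncio.gather fires all completions at once; to cap in-flight requests (for example, to respect provider rate limits), an asyncio.Semaphore can gate them. A sketch with a stub coroutine standing in for rlm.completion:

```python
import asyncio

async def fake_completion(prompt: str) -> str:
    # Stand-in for rlm.completion; swap in the real call.
    await asyncio.sleep(0.01)
    return f"analyzed {prompt}"

async def bounded_gather(prompts, limit: int = 2):
    sem = asyncio.Semaphore(limit)  # at most `limit` calls in flight

    async def run_one(p):
        async with sem:
            return await fake_completion(p)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(run_one(p) for p in prompts))

prompts = ["report_q1.csv", "report_q2.csv", "report_q3.csv"]
results = asyncio.run(bounded_gather(prompts))
print(results)
```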
Why Snipara?
Without Snipara, RLM can only read files directly. With Snipara:
| Feature | Without Snipara | With Snipara |
|---|---|---|
| File reading | Basic read | Semantic search |
| Token usage | All content (~500K tokens) | Relevant only (~5K tokens) |
| Search | Regex only | Hybrid (keyword + embeddings) |
| Best practices | None | Shared team context |
| Summaries | None | Cached summaries |
Get your API key at snipara.com/dashboard
Development
# Clone
git clone https://github.com/alopez3006/rlm-runtime
cd rlm-runtime
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Lint
ruff check src/
mypy src/
# Build
python -m build
License
Apache 2.0 - See LICENSE
Documentation
- Quickstart Guide - Get started in 5 minutes
- Architecture Guide - System design and components
- MCP Integration - Claude Desktop/Code setup
- Configuration - All configuration options
- Tool Development - Building custom tools
Links
- Snipara - Context optimization service
- GitHub Issues
File details
Details for the file rlm_runtime-2.0.0.tar.gz.
File metadata
- Download URL: rlm_runtime-2.0.0.tar.gz
- Upload date:
- Size: 305.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 30fb40e4bd416d082d116e4cf5ed3094e7ec01b2c155f55a79d8a58c6ec7a712 |
| MD5 | b74335e4aae06dedabf5bca424009e10 |
| BLAKE2b-256 | 866fe3c834fc400e1c8964e17862096e0a94d430568452d93afb1d8491de7ed6 |
File details
Details for the file rlm_runtime-2.0.0-py3-none-any.whl.
File metadata
- Download URL: rlm_runtime-2.0.0-py3-none-any.whl
- Upload date:
- Size: 63.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a532f3727c04e2d030e590cf82360d73e5cc3be24ec98332cde0a68f937d1e5b |
| MD5 | 8f8a0067cd321c9c23e5501c8d992ada |
| BLAKE2b-256 | da4742b502d12476f115a09221da0c77c39c1f9c7d8f8d19758c395013142cec |