
RLM Runtime

Recursive Language Model runtime with sandboxed REPL execution.

RLM Runtime enables LLMs to recursively decompose tasks, execute real code in isolated environments, and retrieve context on demand. Instead of simulating computation in tokens, the model executes actual code—cheaper, more reliable, and auditable.

Features

  • Recursive Completion - LLMs can spawn sub-calls, execute code, and aggregate results
  • Sandboxed REPL - Local (RestrictedPython) or Docker isolation
  • Multi-Provider - OpenAI, Anthropic, and 100+ providers via LiteLLM
  • Streaming - Real-time token streaming for simple completions
  • Trajectory Logging - Full execution traces in JSONL format
  • Trajectory Visualizer - Interactive Streamlit dashboard for debugging
  • MCP Server - Claude Desktop/Code integration with multi-project support
  • Plugin System - Extend with custom tools
  • Snipara Integration - Optional context optimization (recommended)

Installation

# Basic install
pip install rlm-runtime

# With Docker support (recommended)
pip install "rlm-runtime[docker]"

# With MCP server (for Claude Desktop/Code)
pip install "rlm-runtime[mcp]"

# With Snipara context optimization
pip install "rlm-runtime[snipara]"

# Full install
pip install "rlm-runtime[all]"

Claude Desktop / Claude Code Integration

RLM Runtime includes an MCP server that provides sandboxed Python execution to Claude. No API keys are required: the server is designed to work within Claude Code's existing billing.

Architecture

Claude Code (LLM + billing included)
    │
    ├── rlm-runtime-mcp (code sandbox - no API keys)
    │   ├── execute_python
    │   ├── get_repl_context
    │   ├── set_repl_context
    │   └── clear_repl_context
    │
    └── snipara-mcp (optional, OAuth auth)
        └── search_context

Setup

  1. Install with MCP support:

    pip install "rlm-runtime[mcp]"
    
  2. Add to your Claude configuration:

    Claude Desktop (~/.claude/claude_desktop_config.json):

    {
      "mcpServers": {
        "rlm": {
          "command": "rlm",
          "args": ["mcp-serve"]
        }
      }
    }
    

    Claude Code (via MCP settings):

    {
      "mcpServers": {
        "rlm": {
          "command": "rlm",
          "args": ["mcp-serve"]
        }
      }
    }
    
  3. Restart Claude Desktop or reload Claude Code.

Available MCP Tools

Tool                 Description
execute_python       Run Python code in a sandboxed environment
get_repl_context     Get current REPL context variables
set_repl_context     Set a variable in REPL context
clear_repl_context   Clear all REPL context
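
Under the hood, MCP tools are invoked as JSON-RPC tools/call requests. As an illustration of the request shape (the "code" argument name is an assumption for this sketch, not taken from the rlm-runtime source), a call to execute_python might look like:

```python
import json

# Hypothetical MCP tools/call request for the execute_python tool.
# The "code" argument name is assumed for illustration only.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "execute_python",
        "arguments": {"code": "print(sum(range(10)))"},
    },
}

print(json.dumps(request, indent=2))
```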

With Snipara (Optional)

For context retrieval, add snipara-mcp alongside rlm-runtime:

{
  "mcpServers": {
    "rlm": {
      "command": "rlm",
      "args": ["mcp-serve"]
    },
    "snipara": {
      "command": "snipara-mcp-server"
    }
  }
}

Authenticate with OAuth (no API key copying):

pip install snipara-mcp
snipara-mcp-login      # Opens browser for authentication
snipara-mcp-status     # Check auth status

See MCP Integration Guide for details.

Example Usage in Claude

Once configured, Claude can execute Python in a secure sandbox:

User: Calculate the Fibonacci sequence up to n=20

Claude: I'll use the execute_python tool to calculate this.

[execute_python]
def fib(n):
    a, b = 0, 1
    result = []
    while a <= n:
        result.append(a)
        a, b = b, a + b
    return result

result = fib(20)
print(result)

Output: [0, 1, 1, 2, 3, 5, 8, 13]

Quick Start

CLI

# Initialize config
rlm init

# Run a completion
rlm run "Count the lines in data.csv and show the top 5 rows"

# Run with Docker isolation
rlm run --env docker "Parse the JSON files and extract all emails"

# Verbose mode (shows trajectory)
rlm run -v "Explain the authentication flow in this codebase"

Python API

import asyncio
from rlm import RLM

async def main():
    # Basic usage
    rlm = RLM(model="gpt-4o-mini", environment="local")

    result = await rlm.completion("Analyze data.csv and find outliers")
    print(result.response)
    print(f"Calls: {result.total_calls}, Tokens: {result.total_tokens}")

asyncio.run(main())

With Snipara Context Optimization

from rlm import RLM

rlm = RLM(
    model="gpt-4o-mini",
    environment="docker",
    snipara_api_key="rlm_...",
    snipara_project_slug="my-project",
)

# Snipara tools auto-registered: context_query, sections, search, etc.
# (call from within an async function, as in the Python API example above)
result = await rlm.completion("Explain how authentication works in this project")

Configuration

Create rlm.toml in your project:

[rlm]
backend = "litellm"
model = "gpt-4o-mini"
environment = "docker"  # "local" or "docker"
max_depth = 4
max_subcalls = 12
token_budget = 8000
verbose = false

# Docker settings
docker_image = "python:3.11-slim"
docker_cpus = 1.0
docker_memory = "512m"

# Snipara (optional but recommended)
snipara_api_key = "rlm_..."
snipara_project_slug = "your-project"

Or use environment variables:

export RLM_MODEL=gpt-4o-mini
export RLM_ENVIRONMENT=docker
export SNIPARA_API_KEY=rlm_...
export SNIPARA_PROJECT_SLUG=my-project

Environments

Local REPL

  • Fastest iteration
  • Uses RestrictedPython for sandboxing
  • Limited isolation (no network/filesystem by default)
  • Best for trusted inputs in development

Docker REPL

  • Strong isolation in containers
  • Configurable resource limits (CPU, memory)
  • Network disabled by default
  • Recommended for production and untrusted inputs

# Start with Docker
rlm run --env docker "Process untrusted user input..."

Architecture

┌─────────────────────────────────────────────────────────────────┐
│  RLM Orchestrator                                               │
│  • Manages recursion depth and token budgets                    │
│  • Coordinates LLM calls and tool execution                     │
├─────────────────────────────────────────────────────────────────┤
│  LLM Backends              │  REPL Environments                 │
│  • LiteLLM (default)       │  • Local (RestrictedPython)        │
│  • OpenAI                  │  • Docker (isolated)               │
│  • Anthropic               │                                    │
├─────────────────────────────────────────────────────────────────┤
│  Tool Registry                                                  │
│  • Builtin: file_read, execute_code                            │
│  • Snipara: context_query, sections, search (optional)         │
│  • Custom: your own tools                                       │
└─────────────────────────────────────────────────────────────────┘

Custom Tools

from rlm import RLM
from rlm.tools import Tool

async def fetch_weather(city: str) -> dict:
    # Your implementation
    return {"city": city, "temp": 72}

weather_tool = Tool(
    name="get_weather",
    description="Get current weather for a city",
    parameters={
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"}
        },
        "required": ["city"]
    },
    handler=fetch_weather,
)

rlm = RLM(tools=[weather_tool])

Trajectory Logging

All completions emit JSONL trajectory logs:

# View recent logs
rlm logs

# View specific trajectory
rlm logs abc123-def456

# Logs location
ls ./logs/

Each event includes:

  • trajectory_id - Unique ID for the request
  • call_id - ID for this specific call
  • parent_call_id - Parent call (for recursion)
  • depth - Recursion depth
  • prompt, response - Input/output
  • tool_calls, tool_results - Tool usage
  • token_usage, duration_ms - Metrics
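
Because each line is a standalone JSON object, the logs are easy to post-process. A minimal sketch (field names taken from the list above; token_usage is shown as a flat integer for simplicity, and the aggregation logic is illustrative, not part of rlm-runtime):

```python
import json
from collections import defaultdict

# Two example trajectory events, inlined for illustration.
log_lines = [
    '{"trajectory_id": "abc123", "call_id": "c1", "parent_call_id": null, '
    '"depth": 0, "token_usage": 120, "duration_ms": 840}',
    '{"trajectory_id": "abc123", "call_id": "c2", "parent_call_id": "c1", '
    '"depth": 1, "token_usage": 60, "duration_ms": 310}',
]

# Sum token usage per recursion depth.
tokens_by_depth: dict[int, int] = defaultdict(int)
for line in log_lines:
    event = json.loads(line)
    tokens_by_depth[event["depth"]] += event["token_usage"]

print(dict(tokens_by_depth))  # {0: 120, 1: 60}
```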

Trajectory Visualizer

Debug and analyze execution trajectories with the interactive web UI:

# Install visualizer dependencies
pip install "rlm-runtime[visualizer]"

# Launch the dashboard
rlm visualize

# Custom log directory and port
rlm visualize --dir ./logs --port 8502

The visualizer provides:

  • Execution Tree - Visual graph of recursive calls
  • Token Charts - Input/output token usage per call
  • Duration Analysis - Timing breakdown across calls
  • Tool Distribution - Pie chart of tool call frequency
  • Event Inspector - Detailed view of each call with prompts/responses

API Reference

RLM Class

from rlm import RLM

rlm = RLM(
    # Required
    model="gpt-4o-mini",           # LLM model identifier

    # Backend
    backend="litellm",             # "litellm", "openai", or "anthropic"
    api_key=None,                  # Provider API key (or use env vars)

    # Execution Environment
    environment="local",           # "local" or "docker"

    # Recursion Limits
    max_depth=4,                   # Max recursive depth
    max_subcalls=12,               # Max total tool calls
    token_budget=8000,             # Token limit per completion

    # Docker Settings (when environment="docker")
    docker_image="python:3.11-slim",
    docker_cpus=1.0,
    docker_memory="512m",
    docker_network_disabled=True,

    # Tools
    tools=None,                    # List of custom Tool objects

    # Snipara Integration
    snipara_api_key=None,          # Snipara API key
    snipara_project_slug=None,     # Snipara project slug
    snipara_api_url=None,          # Custom API URL

    # Logging
    verbose=False,                 # Print execution details
    log_dir="./logs",              # Trajectory log directory
)

CompletionResult

result = await rlm.completion("Your prompt")

result.response          # Final LLM response text
result.trajectory_id     # Unique ID for this execution
result.total_calls       # Total LLM calls made
result.total_tokens      # Total tokens used
result.depth_reached     # Max recursion depth reached
result.tool_calls        # List of tool calls made
result.duration_ms       # Total execution time

Tool Class

from rlm.backends.base import Tool

tool = Tool(
    name="tool_name",              # Unique tool identifier
    description="What the tool does",
    parameters={                   # JSON Schema for parameters
        "type": "object",
        "properties": {
            "param1": {"type": "string", "description": "..."},
        },
        "required": ["param1"]
    },
    handler=async_function,        # Async function to execute
)

Error Handling

from rlm import RLM
from rlm.core.exceptions import (
    RLMError,              # Base exception
    MaxDepthExceeded,      # Recursion limit hit
    TokenBudgetExhausted,  # Token limit hit
    REPLExecutionError,    # Code execution failed
    ToolNotFoundError,     # Unknown tool called
)

try:
    result = await rlm.completion("Complex task...")
except MaxDepthExceeded as e:
    print(f"Hit max depth at {e.depth}")
except TokenBudgetExhausted as e:
    print(f"Used {e.tokens_used} tokens, budget was {e.budget}")
except REPLExecutionError as e:
    print(f"Code failed: {e.stderr}")

Advanced Examples

Recursive Data Analysis

from rlm import RLM

rlm = RLM(
    model="claude-sonnet-4-20250514",
    environment="docker",
    max_depth=6,
)

# The LLM will recursively:
# 1. List CSV files
# 2. Read and analyze each one
# 3. Aggregate findings
# 4. Generate report
result = await rlm.completion("""
    Analyze all CSV files in ./data/:
    1. Find common columns across files
    2. Calculate summary statistics for numeric columns
    3. Identify any data quality issues
    4. Generate a markdown report
""")

Code Generation with Context

from rlm import RLM

rlm = RLM(
    model="gpt-4o",
    snipara_api_key="rlm_...",
    snipara_project_slug="my-app",
)

# The LLM will:
# 1. Query Snipara for auth patterns
# 2. Execute code to explore existing structure
# 3. Generate new code following conventions
result = await rlm.completion("""
    Add a password reset endpoint:
    - Follow our existing auth patterns
    - Use the same error handling conventions
    - Add tests following our test patterns
""")

# Access the code that was written
print(result.response)

Streaming Output

from rlm import RLM

rlm = RLM(model="gpt-4o-mini")

async for chunk in rlm.stream("Explain quantum computing"):
    print(chunk, end="", flush=True)

Batch Processing

from rlm import RLM
import asyncio

rlm = RLM(model="gpt-4o-mini", environment="docker")

prompts = [
    "Analyze report_q1.csv",
    "Analyze report_q2.csv",
    "Analyze report_q3.csv",
]

# Run in parallel
results = await asyncio.gather(*[
    rlm.completion(p) for p in prompts
])

Why Snipara?

Without Snipara, RLM can only read files directly. With Snipara:

Feature          Without Snipara      With Snipara
File reading     Basic read           Semantic search
Token usage      All content (500K)   Relevant only (5K)
Search           Regex only           Hybrid (keyword + embeddings)
Best practices   None                 Shared team context
Summaries        None                 Cached summaries

Get your API key at snipara.com/dashboard

Development

# Clone
git clone https://github.com/alopez3006/rlm-runtime
cd rlm-runtime

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Lint
ruff check src/
mypy src/

# Build
python -m build

License

Apache 2.0 - See LICENSE
