RLM (Recursive Language Model) plugin for elizaOS - enables processing of arbitrarily long contexts

plugin-rlm

RLM (Recursive Language Model) plugin for elizaOS.

Overview

This plugin integrates Recursive Language Models (RLMs) into elizaOS, enabling LLMs to process arbitrarily long contexts through recursive self-calls in a REPL environment.

According to the original RLM paper, RLMs:

  • Handle inputs up to two orders of magnitude longer than the underlying model's context window
  • Outperform vanilla GPT-5 on long-context benchmarks while using a smaller model (GPT-5-mini; see Performance)
  • Use strategies such as peeking, grepping, partition + map, and summarization

Reference

Installation

Python

cd plugins/plugin-rlm/python
pip install -e .

# Optional: Install RLM backend
pip install git+https://github.com/alexzhang13/rlm.git

TypeScript

cd plugins/plugin-rlm/typescript
npm install
npm run build

Rust

cd plugins/plugin-rlm/rust
cargo build

Configuration

The plugin reads configuration from environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| ELIZA_RLM_BACKEND | gemini | LLM backend (openai, anthropic, gemini, groq, openrouter) |
| ELIZA_RLM_ENV | local | Execution environment (local, docker, modal, prime) |
| ELIZA_RLM_MAX_ITERATIONS | 4 | Maximum REPL iterations |
| ELIZA_RLM_MAX_DEPTH | 1 | Maximum recursion depth |
| ELIZA_RLM_VERBOSE | false | Enable verbose logging |
| ELIZA_RLM_PYTHON_PATH | python | Python executable path (TypeScript/Rust IPC) |
| ELIZA_RLM_MAX_RETRIES | 3 | Maximum retry attempts for transient failures |
| ELIZA_RLM_RETRY_DELAY | 1000 | Base delay (ms) between retries |
| ELIZA_RLM_RETRY_MAX_DELAY | 30000 | Maximum delay (ms) between retries |
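As an illustration of how these variables and defaults fit together, here is a minimal sketch of a settings loader. The `RLMSettings` class and `load_settings` function are hypothetical names invented for this example, not part of the plugin's API, and the rule for parsing `ELIZA_RLM_VERBOSE` as a boolean is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RLMSettings:
    """Hypothetical container; defaults are copied from the table above."""
    backend: str = "gemini"
    env: str = "local"
    max_iterations: int = 4
    max_depth: int = 1
    verbose: bool = False
    python_path: str = "python"
    max_retries: int = 3
    retry_delay_ms: int = 1000
    retry_max_delay_ms: int = 30000


def load_settings(environ=None) -> RLMSettings:
    """Read ELIZA_RLM_* variables, falling back to the documented defaults."""
    env = os.environ if environ is None else environ
    d = RLMSettings()
    # Assumption: "1", "true", and "yes" (case-insensitive) enable verbose mode.
    verbose = str(env.get("ELIZA_RLM_VERBOSE", "false")).lower() in ("1", "true", "yes")
    return RLMSettings(
        backend=env.get("ELIZA_RLM_BACKEND", d.backend),
        env=env.get("ELIZA_RLM_ENV", d.env),
        max_iterations=int(env.get("ELIZA_RLM_MAX_ITERATIONS", d.max_iterations)),
        max_depth=int(env.get("ELIZA_RLM_MAX_DEPTH", d.max_depth)),
        verbose=verbose,
        python_path=env.get("ELIZA_RLM_PYTHON_PATH", d.python_path),
        max_retries=int(env.get("ELIZA_RLM_MAX_RETRIES", d.max_retries)),
        retry_delay_ms=int(env.get("ELIZA_RLM_RETRY_DELAY", d.retry_delay_ms)),
        retry_max_delay_ms=int(env.get("ELIZA_RLM_RETRY_MAX_DELAY", d.retry_max_delay_ms)),
    )
```

Passing a plain dict instead of relying on `os.environ` keeps the loader easy to exercise in tests.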

Retry Logic

The plugin includes automatic retry with exponential backoff for transient failures. Retries are triggered for:

  • Connection timeouts
  • Rate limit errors (429)
  • Temporary server errors (503)
  • Network connection failures

Assumptions and Limitations:

  • Retry detection uses substring matching on error messages (e.g., "timeout", "connection", "rate limit")
  • Different LLM backends may use different error message formats
  • Non-transient errors (e.g., authentication, validation) are NOT retried
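The retry behavior described above can be sketched roughly as follows. This is not the plugin's actual implementation: the function names are hypothetical, the doubling factor is an assumption (the source only says "exponential backoff"), and the "429" and "503" markers are added here for illustration alongside the three substrings the source mentions.

```python
import time

# "timeout", "connection", and "rate limit" come from the notes above;
# "429" and "503" are assumed additions for this sketch.
TRANSIENT_MARKERS = ("timeout", "connection", "rate limit", "429", "503")


def is_transient(error: Exception) -> bool:
    """Classify an error as transient by substring matching on its message."""
    message = str(error).lower()
    return any(marker in message for marker in TRANSIENT_MARKERS)


def backoff_delay_ms(attempt: int, base_ms: int = 1000, max_ms: int = 30000) -> int:
    """Delay before retry number `attempt` (0-based): base * 2^attempt, capped."""
    return min(base_ms * (2 ** attempt), max_ms)


def call_with_retry(fn, max_retries=3, base_ms=1000, max_ms=30000, sleep=time.sleep):
    """Call fn(), retrying transient failures up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as error:
            # Give up on the last attempt or on non-transient errors.
            if attempt == max_retries or not is_transient(error):
                raise
            sleep(backoff_delay_ms(attempt, base_ms, max_ms) / 1000.0)
```

Because detection relies on message substrings, an authentication error whose text happens to contain "connection" would be retried under this scheme, which is one reason the limitation above matters.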

Metrics (TypeScript)

The TypeScript client provides metrics for monitoring:

import { RLMClient } from "@elizaos/plugin-rlm";

const client = new RLMClient();

// Get a snapshot of the current metrics
const metrics = client.getMetrics();
console.log(metrics.totalRequests, metrics.successfulRequests);

// Register a callback for real-time metrics
client.onMetrics((metrics) => {
  // Forward to Prometheus, Datadog, etc.
  // `exporter` is a placeholder for your own metrics exporter.
  exporter.recordGauge('rlm_latency_p95', metrics.p95LatencyMs);
});

Available metrics:

  • totalRequests, successfulRequests, failedRequests, stubResponses
  • totalRetries - total retry attempts across all requests
  • averageLatencyMs, p95LatencyMs - mean and 95th-percentile latency over a rolling window of the last 1000 samples
  • lastRequestTimestamp, lastErrorTimestamp, lastError
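To make the rolling-window latency metrics concrete, here is a small Python sketch of how a mean and a 95th percentile could be computed over the most recent samples. The `LatencyWindow` class is hypothetical, and the nearest-rank percentile method is an assumption; the TypeScript client may compute its values differently.

```python
from collections import deque


class LatencyWindow:
    """Keeps the most recent latency samples and derives summary metrics."""

    def __init__(self, max_samples: int = 1000):
        # deque with maxlen silently drops the oldest sample when full.
        self._samples = deque(maxlen=max_samples)

    def record(self, latency_ms: float) -> None:
        self._samples.append(latency_ms)

    def average(self) -> float:
        if not self._samples:
            return 0.0
        return sum(self._samples) / len(self._samples)

    def p95(self) -> float:
        """95th percentile by the nearest-rank method (an assumption)."""
        if not self._samples:
            return 0.0
        ordered = sorted(self._samples)
        # Integer form of ceil(0.95 * n), avoiding float rounding.
        rank = (95 * len(ordered) + 99) // 100
        return ordered[rank - 1]
```

A bounded window keeps memory use constant and lets the percentile reflect recent behavior rather than the whole process lifetime.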

Backend API Keys

Depending on your chosen backend, set the appropriate API key:

# For OpenAI backend
export OPENAI_API_KEY=sk-...

# For Anthropic backend
export ANTHROPIC_API_KEY=sk-ant-...

# For Google Gemini backend
export GEMINI_API_KEY=...
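A quick pre-flight check can catch a missing key before the first request. The sketch below is illustrative only: `missing_api_key` is a hypothetical helper, and it covers just the three backends whose key variables are listed above, since the variables for groq and openrouter are not documented here.

```python
import os

# Only the key variables shown above; other backends are intentionally omitted.
BACKEND_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def missing_api_key(environ=None):
    """Return the name of the unset key variable, or None if nothing is missing
    (or if the backend's key variable is not known to this sketch)."""
    env = os.environ if environ is None else environ
    backend = env.get("ELIZA_RLM_BACKEND", "gemini")
    key_var = BACKEND_KEY_VARS.get(backend)
    if key_var is None:
        return None
    return None if env.get(key_var) else key_var
```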

Usage

Python

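import asyncio

from elizaos_plugin_rlm import plugin, RLMClient

# The plugin is auto-loaded by the elizaOS runtime.
# Alternatively, use the client directly:
async def main():
    client = RLMClient()
    # infer() is a coroutine, so it must be awaited inside an async function
    result = await client.infer("Process this very long text...")
    print(result.text)

asyncio.run(main())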

TypeScript

import { rlmPlugin, RLMClient } from "@elizaos/plugin-rlm";

// Plugin is auto-registered when loaded
// Or use client directly:
const client = new RLMClient();
const result = await client.infer("Process this very long text...");
console.log(result.text);

Rust

use elizaos_plugin_rlm::{RLMClient, RLMConfig};

let client = RLMClient::from_env()?;
let result = client.infer("Process this very long text...".into(), None).await;
println!("{}", result.text);

Model Types

The plugin registers handlers for the following model types:

| Model Type | Description |
| --- | --- |
| TEXT_SMALL | Small text generation |
| TEXT_LARGE | Large text generation |
| TEXT_REASONING_SMALL | Small reasoning model |
| TEXT_REASONING_LARGE | Large reasoning model |
| TEXT_COMPLETION | Text completion |
| TEXT_RLM_LARGE | Explicit RLM mode |
| TEXT_RLM_REASONING | Explicit RLM with reasoning |

Architecture

plugins/plugin-rlm/
├── README.md                    # This file
├── python/
│   ├── pyproject.toml           # Python package config
│   ├── elizaos_plugin_rlm/
│   │   ├── __init__.py          # Package exports
│   │   ├── client.py            # RLMClient wrapper
│   │   ├── plugin.py            # Plugin definition
│   │   └── server.py            # IPC server for TS/Rust
│   └── tests/
│       ├── test_plugin.py       # Unit tests
│       └── test_integration.py  # Integration tests
├── typescript/
│   ├── package.json
│   ├── types.ts                 # TypeScript types
│   ├── client.ts                # RLMClient (IPC to Python)
│   ├── index.ts                 # Plugin definition
│   └── __tests__/
│       ├── plugin.test.ts       # Unit tests
│       └── integration.test.ts  # Integration tests
└── rust/
    ├── Cargo.toml
    ├── src/
    │   ├── lib.rs               # Plugin + exports
    │   ├── client.rs            # RLMClient (IPC)
    │   ├── types.rs             # Rust types
    │   └── error.rs             # Error handling
    └── tests/
        └── integration_tests.rs # Rust tests

Cross-Language Design

Because the official RLM library is Python-only, each language integrates with it differently:

  1. Python: Direct integration with the rlm library
  2. TypeScript: IPC via Python subprocess with JSON-RPC protocol
  3. Rust: IPC via Python subprocess with JSON-RPC protocol

All implementations fall back to stub mode when the RLM backend is unavailable, returning safe placeholder responses without throwing errors.
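For readers unfamiliar with JSON-RPC over a subprocess, the sketch below shows what message framing could look like. The `jsonrpc`, `id`, `method`, `params`, `result`, and `error` fields come from the JSON-RPC 2.0 specification; everything else is an assumption made for illustration, including the newline-delimited framing, the `"infer"` method name, and the parameter shape. The plugin's real wire format lives in `server.py` and may differ.

```python
import json


def encode_request(request_id: int, method: str, params: dict) -> str:
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.dumps(message) + "\n"


def decode_response(line: str):
    """Parse one response line; return its result or raise on a JSON-RPC error."""
    message = json.loads(line)
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")
    return message["result"]


# Hypothetical usage: the method name and params are illustrative only.
request_line = encode_request(1, "infer", {"prompt": "Process this very long text..."})
```

A client would write `request_line` to the subprocess's stdin and pass each line read from its stdout to `decode_response`.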

Stub Mode

When the RLM backend is not installed or unavailable, the plugin operates in stub mode:

# Stub response format
{
    "text": "[RLM STUB] RLM backend not available",
    "metadata": {
        "stub": true
    }
}

This allows the plugin to be loaded and used without the RLM dependency for testing and development purposes.
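Since stub responses do not raise errors, callers that need real output may want to detect them explicitly. A minimal sketch, assuming the response is available as a dict with the shape shown above (`is_stub_response` is a hypothetical helper, not a plugin function):

```python
def is_stub_response(response: dict) -> bool:
    """Return True when the response carries the stub marker in its metadata."""
    metadata = response.get("metadata") or {}
    return metadata.get("stub") is True


stub = {"text": "[RLM STUB] RLM backend not available", "metadata": {"stub": True}}
if is_stub_response(stub):
    print("RLM backend unavailable; result is a placeholder")
```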

Testing

Python

cd plugins/plugin-rlm/python
pip install -e ".[dev]"
pytest tests/

TypeScript

cd plugins/plugin-rlm/typescript
npm install
npm test

Rust

cd plugins/plugin-rlm/rust
cargo test

How RLM Works

RLMs treat long prompts as external objects stored in a Python REPL environment. Instead of feeding the entire input to the model, the LM:

  1. Peeks at portions of the context
  2. Greps for relevant information using regex
  3. Partitions the context into manageable chunks
  4. Maps recursive LM calls over chunks
  5. Summarizes and aggregates results

This approach addresses "context rot" - the degradation of model performance as prompt length increases.
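The strategies listed above can be sketched as plain Python helpers. This is a deliberately simplified, fixed pipeline: in an actual RLM the model itself decides which operations to run inside the REPL, whereas here the control flow is hard-coded. All function names are hypothetical, and `lm` stands in for any callable that takes a prompt and a text and returns a string.

```python
import re
from typing import Callable, List


def peek(context: str, n: int = 200) -> str:
    """Look at the first n characters of the context."""
    return context[:n]


def grep(context: str, pattern: str) -> List[str]:
    """Return the lines of the context that match a regex."""
    return [line for line in context.splitlines() if re.search(pattern, line)]


def partition(context: str, chunk_size: int) -> List[str]:
    """Split the context into fixed-size chunks."""
    return [context[i:i + chunk_size] for i in range(0, len(context), chunk_size)]


def recursive_answer(context: str, query: str, lm: Callable[[str, str], str],
                     chunk_size: int = 2000, max_depth: int = 1, depth: int = 0) -> str:
    """Map lm over chunks recursively, then aggregate the partial answers."""
    if len(context) <= chunk_size or depth >= max_depth:
        # Base case: answer directly on what fits (truncated if necessary).
        return lm(query, context[:chunk_size])
    partials = [
        recursive_answer(chunk, query, lm, chunk_size, max_depth, depth + 1)
        for chunk in partition(context, chunk_size)
    ]
    # Aggregation step: one more call over the joined partial answers.
    return lm(f"Combine these partial answers to: {query}", "\n".join(partials))
```

The `max_depth` parameter plays the same role as `ELIZA_RLM_MAX_DEPTH`: it bounds how many levels of recursive calls are made before the model must answer directly.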

Example: Long Context Processing

# Traditional approach (fails with long context)
response = llm.completion(very_long_context + query)

# RLM approach (handles arbitrarily long context)
response = rlm.completion(very_long_context + query)
# RLM internally:
# 1. Stores context in REPL environment
# 2. Peeks at structure
# 3. Greps for relevant portions
# 4. Recursively processes chunks
# 5. Aggregates final answer

Performance

Based on the original paper's benchmarks:

| Benchmark | GPT-5 | RLM (GPT-5-mini) | Improvement |
| --- | --- | --- | --- |
| OOLONG (132k) | 30% | 64% | +114% |
| OOLONG (263k) | 31% | 46% | +49% |
| BrowseComp-Plus | ~60% | 100% | +67% |

RLM using GPT-5-mini outperforms vanilla GPT-5 while being cheaper per query.

Limitations

  • Latency: RLM calls take longer due to recursive processing
  • Cost: Multiple LLM calls may increase API costs
  • Python Dependency: TypeScript and Rust rely on Python IPC
  • No Streaming: Streaming is not yet supported in RLM

Contributing

Contributions are welcome! Please see the main elizaOS repository for contribution guidelines.

License

MIT
