RLM (Recursive Language Model) plugin for elizaOS - enables processing of arbitrarily long contexts
plugin-rlm
RLM (Recursive Language Model) plugin for elizaOS.
Overview
This plugin integrates Recursive Language Models (RLMs) into elizaOS, enabling LLMs to process arbitrarily long contexts through recursive self-calls in a REPL environment.
According to the paper, RLMs:
- Handle inputs up to two orders of magnitude beyond the model's context window
- Outperform vanilla GPT-5 on long-context benchmarks while using a smaller model
- Use strategies such as Peeking, Grepping, Partition+Map, and Summarization
Reference
- Paper: Recursive Language Models (arXiv:2512.24601)
- Authors: Alex L. Zhang, Tim Kraska, Omar Khattab (MIT CSAIL)
- Official Implementation: github.com/alexzhang13/rlm
Installation
Python
cd plugins/plugin-rlm/python
pip install -e .
# Optional: Install RLM backend
pip install git+https://github.com/alexzhang13/rlm.git
TypeScript
cd plugins/plugin-rlm/typescript
npm install
npm run build
Rust
cd plugins/plugin-rlm/rust
cargo build
Configuration
The plugin reads configuration from environment variables:
| Variable | Default | Description |
|---|---|---|
| `ELIZA_RLM_BACKEND` | `gemini` | LLM backend (`openai`, `anthropic`, `gemini`, `groq`, `openrouter`) |
| `ELIZA_RLM_ENV` | `local` | Execution environment (`local`, `docker`, `modal`, `prime`) |
| `ELIZA_RLM_MAX_ITERATIONS` | `4` | Maximum REPL iterations |
| `ELIZA_RLM_MAX_DEPTH` | `1` | Maximum recursion depth |
| `ELIZA_RLM_VERBOSE` | `false` | Enable verbose logging |
| `ELIZA_RLM_PYTHON_PATH` | `python` | Python executable path (TypeScript/Rust IPC) |
| `ELIZA_RLM_MAX_RETRIES` | `3` | Maximum retry attempts for transient failures |
| `ELIZA_RLM_RETRY_DELAY` | `1000` | Base delay (ms) between retries |
| `ELIZA_RLM_RETRY_MAX_DELAY` | `30000` | Maximum delay (ms) between retries |
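As a rough illustration of how these variables and their defaults fit together, the sketch below reads them into a dataclass. `RLMSettings` and `load_settings` are hypothetical names made up for this example and are not the plugin's API; only the variable names and default values come from the table above, and the boolean parsing rule is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RLMSettings:
    """Hypothetical container for the documented environment variables."""
    backend: str
    environment: str
    max_iterations: int
    max_depth: int
    verbose: bool
    python_path: str
    max_retries: int
    retry_delay_ms: int
    retry_max_delay_ms: int


def load_settings(environ=None) -> RLMSettings:
    """Read ELIZA_RLM_* variables, falling back to the documented defaults."""
    source = os.environ if environ is None else environ

    def get(name: str, default: str) -> str:
        return source.get(f"ELIZA_RLM_{name}", default)

    return RLMSettings(
        backend=get("BACKEND", "gemini"),
        environment=get("ENV", "local"),
        max_iterations=int(get("MAX_ITERATIONS", "4")),
        max_depth=int(get("MAX_DEPTH", "1")),
        # Assumption: "1", "true", and "yes" are truthy; the plugin may parse differently.
        verbose=get("VERBOSE", "false").strip().lower() in {"1", "true", "yes"},
        python_path=get("PYTHON_PATH", "python"),
        max_retries=int(get("MAX_RETRIES", "3")),
        retry_delay_ms=int(get("RETRY_DELAY", "1000")),
        retry_max_delay_ms=int(get("RETRY_MAX_DELAY", "30000")),
    )
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check in a test without touching the real environment.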
Retry Logic
The plugin includes automatic retry with exponential backoff for transient failures. Retries are triggered for:
- Connection timeouts
- Rate limit errors (429)
- Temporary server errors (503)
- Network connection failures
Assumptions and Limitations:
- Retry detection uses substring matching on error messages (e.g., "timeout", "connection", "rate limit")
- Different LLM backends may use different error message formats
- Non-transient errors (e.g., authentication, validation) are NOT retried
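The behaviour described above can be sketched as follows. This is a minimal illustration, not the plugin's code: the marker list, the doubling factor, and the names `is_transient`, `backoff_delay_ms`, and `call_with_retry` are all assumptions. Only the substring-matching idea and the default values (3 retries, 1000 ms base, 30000 ms cap) come from this README.

```python
import time

# Assumption: these substrings mirror the examples in the text; the real list may differ.
TRANSIENT_MARKERS = ("timeout", "connection", "rate limit", "429", "503")


def is_transient(error: Exception) -> bool:
    """Classify an error as retryable by substring-matching its message."""
    message = str(error).lower()
    return any(marker in message for marker in TRANSIENT_MARKERS)


def backoff_delay_ms(attempt: int, base_ms: int = 1000, max_ms: int = 30000) -> int:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped at max_ms."""
    return min(base_ms * (2 ** attempt), max_ms)


def call_with_retry(fn, max_retries=3, base_ms=1000, max_ms=30000, sleep=time.sleep):
    """Call fn(); on a transient error, wait and retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as error:
            # Give up on the last attempt or on errors that do not look transient.
            if attempt == max_retries or not is_transient(error):
                raise
            sleep(backoff_delay_ms(attempt, base_ms, max_ms) / 1000.0)
```

Because classification depends on message text, an error such as "invalid API key" is re-raised immediately, while a backend that words its rate-limit error differently from the markers would not be retried.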
Metrics (TypeScript)
The TypeScript client provides metrics for monitoring:
const client = new RLMClient();
// Get current metrics
const metrics = client.getMetrics();
console.log(metrics.totalRequests, metrics.successfulRequests);
// Register callback for real-time metrics
client.onMetrics((metrics) => {
// Send to Prometheus, Datadog, etc.
exporter.recordGauge('rlm_latency_p95', metrics.p95LatencyMs);
});
Available metrics:
- `totalRequests`, `successfulRequests`, `failedRequests`, `stubResponses`
- `totalRetries` - total retry attempts across all requests
- `averageLatencyMs`, `p95LatencyMs` - latency percentiles (rolling 1000 samples)
- `lastRequestTimestamp`, `lastErrorTimestamp`, `lastError`
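To make the "rolling 1000 samples" idea concrete, here is a small Python sketch of a bounded latency window. `LatencyWindow` is a hypothetical class written for this example; the TypeScript client's internals are not shown in this README, and the nearest-rank percentile definition used here is an assumption.

```python
from collections import deque


class LatencyWindow:
    """Hypothetical rolling window over the most recent latency samples."""

    def __init__(self, max_samples: int = 1000):
        # A bounded deque silently discards the oldest sample once it is full.
        self._samples = deque(maxlen=max_samples)

    def record(self, latency_ms: float) -> None:
        self._samples.append(latency_ms)

    def average(self) -> float:
        return sum(self._samples) / len(self._samples) if self._samples else 0.0

    def p95(self) -> float:
        """Nearest-rank 95th percentile (an assumption; other definitions exist)."""
        if not self._samples:
            return 0.0
        ordered = sorted(self._samples)
        rank = max(1, -(-95 * len(ordered) // 100))  # integer ceil(0.95 * n)
        return ordered[rank - 1]
```

A bounded window keeps memory constant and lets the reported percentiles follow recent behaviour rather than the whole process lifetime.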
Backend API Keys
Depending on your chosen backend, set the appropriate API key:
# For OpenAI backend
export OPENAI_API_KEY=sk-...
# For Anthropic backend
export ANTHROPIC_API_KEY=sk-ant-...
# For Google Gemini backend
export GEMINI_API_KEY=...
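A quick pre-flight check can catch a missing key before the first request. The helper below is a hypothetical sketch, not part of the plugin. It covers only the three variables listed above, because the key names for the `groq` and `openrouter` backends are not given here.

```python
import os

# Only the three variables named above; other backends are deliberately left out.
API_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def missing_api_key(backend: str, environ=None):
    """Return the name of the unset key variable for `backend`, or None."""
    source = os.environ if environ is None else environ
    var = API_KEY_VARS.get(backend)
    if var is None:
        return None  # backend unknown to this sketch; nothing to check
    return None if source.get(var) else var
```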
Usage
Python
import asyncio
from elizaos_plugin_rlm import plugin, RLMClient
# The plugin is auto-loaded by the elizaOS runtime.
# Or use the client directly (infer is awaited, so run it inside an event loop):
async def main():
    client = RLMClient()
    result = await client.infer("Process this very long text...")
    print(result.text)
asyncio.run(main())
TypeScript
import { rlmPlugin, RLMClient } from "@elizaos/plugin-rlm";
// Plugin is auto-registered when loaded
// Or use client directly:
const client = new RLMClient();
const result = await client.infer("Process this very long text...");
console.log(result.text);
Rust
use elizaos_plugin_rlm::{RLMClient, RLMConfig};
let client = RLMClient::from_env()?;
let result = client.infer("Process this very long text...".into(), None).await;
println!("{}", result.text);
Model Types
The plugin registers handlers for the following model types:
| Model Type | Description |
|---|---|
| `TEXT_SMALL` | Small text generation |
| `TEXT_LARGE` | Large text generation |
| `TEXT_REASONING_SMALL` | Small reasoning model |
| `TEXT_REASONING_LARGE` | Large reasoning model |
| `TEXT_COMPLETION` | Text completion |
| `TEXT_RLM_LARGE` | Explicit RLM mode |
| `TEXT_RLM_REASONING` | Explicit RLM with reasoning |
Architecture
plugins/plugin-rlm/
├── README.md                    # This file
├── python/
│   ├── pyproject.toml           # Python package config
│   ├── elizaos_plugin_rlm/
│   │   ├── __init__.py          # Package exports
│   │   ├── client.py            # RLMClient wrapper
│   │   ├── plugin.py            # Plugin definition
│   │   └── server.py            # IPC server for TS/Rust
│   └── tests/
│       ├── test_plugin.py       # Unit tests
│       └── test_integration.py  # Integration tests
├── typescript/
│   ├── package.json
│   ├── types.ts                 # TypeScript types
│   ├── client.ts                # RLMClient (IPC to Python)
│   ├── index.ts                 # Plugin definition
│   └── __tests__/
│       ├── plugin.test.ts       # Unit tests
│       └── integration.test.ts  # Integration tests
└── rust/
    ├── Cargo.toml
    ├── src/
    │   ├── lib.rs               # Plugin + exports
    │   ├── client.rs            # RLMClient (IPC)
    │   ├── types.rs             # Rust types
    │   └── error.rs             # Error handling
    └── tests/
        └── integration_tests.rs # Rust tests
Cross-Language Design
Since the official RLM library is Python-only:
- Python: Direct integration with the `rlm` library
- TypeScript: IPC via Python subprocess with JSON-RPC protocol
- Rust: IPC via Python subprocess with JSON-RPC protocol
All implementations fall back to stub mode when the RLM backend is unavailable, returning safe placeholder responses without throwing errors.
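The subprocess-plus-JSON-RPC pattern can be sketched in a few lines. This is an illustration under stated assumptions, not the plugin's wire protocol: the `infer` method name, the `prompt` parameter, and newline-delimited framing over stdin/stdout are guesses, and the inline child script merely stands in for `server.py`. The parent here is Python for brevity; in the plugin that role is played by the TypeScript and Rust clients.

```python
import json
import subprocess
import sys

# Stand-in for server.py: answers each JSON-RPC request line on stdout.
# The "infer" method name and newline-delimited framing are assumptions.
CHILD_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    if request.get("method") == "infer":
        result = {"text": "[RLM STUB] RLM backend not available",
                  "metadata": {"stub": True}}
        reply = {"jsonrpc": "2.0", "id": request["id"], "result": result}
    else:
        reply = {"jsonrpc": "2.0", "id": request.get("id"),
                 "error": {"code": -32601, "message": "Method not found"}}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


def rpc_call(process, request_id, method, params):
    """Send one JSON-RPC 2.0 request over stdin and read one reply line."""
    request = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    process.stdin.write(json.dumps(request) + "\n")
    process.stdin.flush()
    return json.loads(process.stdout.readline())


def demo():
    """Spawn the child, make a single call, and shut the child down."""
    process = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )
    try:
        return rpc_call(process, 1, "infer", {"prompt": "Process this very long text..."})
    finally:
        process.stdin.close()  # EOF ends the child's read loop
        process.wait()
```

Matching replies to requests by `id` is what lets a long-lived child process serve many calls over one pair of pipes.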
Stub Mode
When the RLM backend is not installed or unavailable, the plugin operates in stub mode:
# Stub response format
{
"text": "[RLM STUB] RLM backend not available",
"metadata": {
"stub": true
}
}
This allows the plugin to be loaded and used without the RLM dependency for testing and development purposes.
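One way such a fallback could be implemented is to test whether the backend module imports and return the placeholder otherwise. The sketch below is an assumption about the approach, not the plugin's code; `backend_available` and `infer_or_stub` are hypothetical names, and `rlm` is assumed to be the importable module name of the backend.

```python
import importlib


def stub_response() -> dict:
    """Placeholder in the documented stub format."""
    return {"text": "[RLM STUB] RLM backend not available", "metadata": {"stub": True}}


def backend_available(module_name: str = "rlm") -> bool:
    """True if the backend module can be imported (module name is an assumption)."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


def infer_or_stub(prompt: str, run_backend, module_name: str = "rlm") -> dict:
    """Call run_backend(prompt) when the backend imports; otherwise return the stub."""
    if not backend_available(module_name):
        return stub_response()
    return run_backend(prompt)
```

Callers can branch on `metadata["stub"]` to tell a placeholder from a real answer.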
Testing
Python
cd plugins/plugin-rlm/python
pip install -e ".[dev]"
pytest tests/
TypeScript
cd plugins/plugin-rlm/typescript
npm install
npm test
Rust
cd plugins/plugin-rlm/rust
cargo test
How RLM Works
RLMs treat long prompts as external objects stored in a Python REPL environment. Instead of feeding the entire input to the model, the LM:
- Peeks at portions of the context
- Greps for relevant information using regex
- Partitions the context into manageable chunks
- Maps recursive LM calls over chunks
- Summarizes and aggregates results
This approach addresses "context rot" - the degradation of model performance as prompt length increases.
Example: Long Context Processing
# Traditional approach (fails with long context)
response = llm.completion(very_long_context + query)
# RLM approach (handles arbitrarily long context)
response = rlm.completion(very_long_context + query)
# RLM internally:
# 1. Stores context in REPL environment
# 2. Peeks at structure
# 3. Greps for relevant portions
# 4. Recursively processes chunks
# 5. Aggregates final answer
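The internal steps listed above can be imitated with a toy pipeline. This is a deliberately simplified sketch: in an actual RLM the model itself decides which operations to run by writing code in the REPL, whereas here the order is hard-coded. The `llm` argument stands for any callable that maps text to text, and all function names are hypothetical.

```python
import re


def peek(context: str, n: int = 200) -> str:
    """Look at the first n characters to learn the context's structure."""
    return context[:n]


def grep(context: str, pattern: str) -> list:
    """Return the lines that match a regex."""
    return [line for line in context.splitlines() if re.search(pattern, line)]


def partition(context: str, chunk_chars: int) -> list:
    """Split the context into fixed-size character chunks."""
    return [context[i:i + chunk_chars] for i in range(0, len(context), chunk_chars)]


def toy_rlm(context: str, query_pattern: str, llm, chunk_chars: int = 1000) -> str:
    """Map `llm` over chunks that contain a match, then aggregate the partial answers."""
    partials = []
    for chunk in partition(context, chunk_chars):
        hits = grep(chunk, query_pattern)
        if hits:  # skip chunks with nothing relevant
            partials.append(llm("\n".join(hits)))  # a recursive sub-call in a real RLM
    return llm("\n".join(partials)) if partials else ""
```

Each `llm` call sees only a filtered slice of the input, so no single call has to fit the full context. Fixed-size character chunks can split a line in two; a more careful partition would cut on line or document boundaries.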
Performance
Based on the original paper's benchmarks:
| Benchmark | GPT-5 | RLM(GPT-5-mini) | Improvement |
|---|---|---|---|
| OOLONG (132k) | 30% | 64% | +114% |
| OOLONG (263k) | 31% | 46% | +49% |
| BrowseComp-Plus | ~60% | 100% | +67% |
RLM using GPT-5-mini outperforms vanilla GPT-5 while being cheaper per query.
Limitations
- Latency: RLM calls take longer due to recursive processing
- Cost: Multiple LLM calls may increase API costs
- Python Dependency: TypeScript and Rust rely on Python IPC
- No Streaming: Streaming is not yet supported in RLM
Contributing
Contributions are welcome! Please see the main elizaOS repository for contribution guidelines.
License
MIT