RLM (Recursive Language Model) plugin for elizaOS - enables processing of arbitrarily long contexts

plugin-rlm

RLM (Recursive Language Model) plugin for elizaOS.

Overview

This plugin integrates Recursive Language Models (RLMs) into elizaOS, enabling LLMs to process arbitrarily long contexts through recursive self-calls in a REPL environment.

The referenced paper reports that RLMs:

Handle inputs up to two orders of magnitude beyond standard model context windows
Outperform vanilla GPT-5 on long-context benchmarks while using smaller models
Use strategies such as Peeking, Grepping, Partition+Map, and Summarization

Reference

Paper: Recursive Language Models (arXiv:2512.24601)
Authors: Alex L. Zhang, Tim Kraska, Omar Khattab (MIT CSAIL)
Official Implementation: github.com/alexzhang13/rlm

Installation

Python

cd plugins/plugin-rlm/python
pip install -e .

# Optional: Install RLM backend
pip install git+https://github.com/alexzhang13/rlm.git

TypeScript

cd plugins/plugin-rlm/typescript
npm install
npm run build

Rust

cd plugins/plugin-rlm/rust
cargo build

Configuration

The plugin reads configuration from environment variables:

Variable	Default	Description
`ELIZA_RLM_BACKEND`	`gemini`	LLM backend (`openai`, `anthropic`, `gemini`, `groq`, `openrouter`)
`ELIZA_RLM_ENV`	`local`	Execution environment (`local`, `docker`, `modal`, `prime`)
`ELIZA_RLM_MAX_ITERATIONS`	`4`	Maximum REPL iterations
`ELIZA_RLM_MAX_DEPTH`	`1`	Maximum recursion depth
`ELIZA_RLM_VERBOSE`	`false`	Enable verbose logging
`ELIZA_RLM_PYTHON_PATH`	`python`	Python executable path (TypeScript/Rust IPC)
`ELIZA_RLM_MAX_RETRIES`	`3`	Maximum retry attempts for transient failures
`ELIZA_RLM_RETRY_DELAY`	`1000`	Base delay (ms) between retries
`ELIZA_RLM_RETRY_MAX_DELAY`	`30000`	Maximum delay (ms) between retries
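
As a rough illustration of how these variables and defaults fit together, the sketch below reads them into a small settings object. `RLMSettings` and `load_settings` are hypothetical names used only for this example, not the plugin's actual configuration API, and the set of strings accepted as "true" for `ELIZA_RLM_VERBOSE` is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RLMSettings:
    """Hypothetical container; not the plugin's actual config class."""
    backend: str
    env: str
    max_iterations: int
    max_depth: int
    verbose: bool
    python_path: str
    max_retries: int
    retry_delay_ms: int
    retry_max_delay_ms: int


def load_settings(environ: Mapping[str, str] = os.environ) -> RLMSettings:
    """Read the documented variables, falling back to the documented defaults."""
    get = environ.get
    return RLMSettings(
        backend=get("ELIZA_RLM_BACKEND", "gemini"),
        env=get("ELIZA_RLM_ENV", "local"),
        max_iterations=int(get("ELIZA_RLM_MAX_ITERATIONS", "4")),
        max_depth=int(get("ELIZA_RLM_MAX_DEPTH", "1")),
        # Assumption: "1", "true", or "yes" (case-insensitive) enable verbose logging.
        verbose=get("ELIZA_RLM_VERBOSE", "false").strip().lower() in {"1", "true", "yes"},
        python_path=get("ELIZA_RLM_PYTHON_PATH", "python"),
        max_retries=int(get("ELIZA_RLM_MAX_RETRIES", "3")),
        retry_delay_ms=int(get("ELIZA_RLM_RETRY_DELAY", "1000")),
        retry_max_delay_ms=int(get("ELIZA_RLM_RETRY_MAX_DELAY", "30000")),
    )
```

Passing a plain dict instead of `os.environ` (for example `load_settings({"ELIZA_RLM_BACKEND": "openai"})`) makes the defaults easy to check in a test.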

Retry Logic

The plugin includes automatic retry with exponential backoff for transient failures. Retries are triggered for:

Connection timeouts
Rate limit errors (429)
Temporary server errors (503)
Network connection failures

Assumptions and Limitations:

Retry detection uses substring matching on error messages (e.g., "timeout", "connection", "rate limit")
Different LLM backends may use different error message formats
Non-transient errors (e.g., authentication, validation) are NOT retried
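
The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: the marker list, the exact doubling schedule, and the names `is_transient`, `backoff_delay_ms`, and `call_with_retry` are all assumptions made for the example. The defaults mirror `ELIZA_RLM_MAX_RETRIES`, `ELIZA_RLM_RETRY_DELAY`, and `ELIZA_RLM_RETRY_MAX_DELAY`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: illustrative markers only; the plugin's actual list may differ.
TRANSIENT_MARKERS = ("timeout", "timed out", "connection", "rate limit", "429", "503")


def is_transient(message: str) -> bool:
    """Substring match on the error message, case-insensitive."""
    lowered = message.lower()
    return any(marker in lowered for marker in TRANSIENT_MARKERS)


def backoff_delay_ms(attempt: int, base_ms: int = 1000, max_ms: int = 30000) -> int:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped at max_ms."""
    return min(base_ms * (2 ** attempt), max_ms)


def call_with_retry(fn: Callable[[], T], max_retries: int = 3,
                    base_ms: int = 1000, max_ms: int = 30000,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn(), retrying transient failures up to max_retries times."""
    attempt = 0
    while True:
        try:
            return fn()
        except Exception as exc:
            # Non-transient errors and exhausted retries propagate unchanged.
            if attempt >= max_retries or not is_transient(str(exc)):
                raise
            sleep(backoff_delay_ms(attempt, base_ms, max_ms) / 1000.0)
            attempt += 1
```

Because detection is a substring match, a message such as "invalid connection string" would also be treated as transient here, which is the kind of false positive the limitations above warn about.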

Metrics (TypeScript)

The TypeScript client provides metrics for monitoring:

import { RLMClient } from "@elizaos/plugin-rlm";

const client = new RLMClient();

// Get current metrics
const metrics = client.getMetrics();
console.log(metrics.totalRequests, metrics.successfulRequests);

// Register a callback for real-time metrics
client.onMetrics((metrics) => {
  // Send to Prometheus, Datadog, etc.
  // `exporter` is a placeholder for your own metrics exporter.
  exporter.recordGauge('rlm_latency_p95', metrics.p95LatencyMs);
});

Available metrics:

totalRequests, successfulRequests, failedRequests, stubResponses
totalRetries - total retry attempts across all requests
averageLatencyMs, p95LatencyMs - mean and 95th-percentile latency over a rolling window of the last 1000 samples
lastRequestTimestamp, lastErrorTimestamp, lastError
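
To make the rolling-window latency figures concrete, here is a small Python sketch of one way such statistics can be computed. `LatencyWindow` is a hypothetical helper written for this example, and the nearest-rank percentile method is an assumption; the TypeScript client may compute its values differently.

```python
from collections import deque


class LatencyWindow:
    """Hypothetical helper: keeps the most recent `size` latency samples."""

    def __init__(self, size: int = 1000) -> None:
        self._samples = deque(maxlen=size)

    def record(self, latency_ms: float) -> None:
        # When the window is full, the oldest sample is dropped automatically.
        self._samples.append(latency_ms)

    def average(self) -> float:
        return sum(self._samples) / len(self._samples) if self._samples else 0.0

    def p95(self) -> float:
        if not self._samples:
            return 0.0
        ordered = sorted(self._samples)
        # Nearest-rank method: rank = ceil(0.95 * n), computed with integers
        # to avoid floating-point rounding surprises.
        rank = (95 * len(ordered) + 99) // 100
        return ordered[rank - 1]
```

With a window of 1000 samples, both figures describe recent behavior only; a latency spike stops influencing them once 1000 newer requests have been recorded.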

Backend API Keys

Depending on your chosen backend, set the appropriate API key:

# For OpenAI backend
export OPENAI_API_KEY=sk-...

# For Anthropic backend
export ANTHROPIC_API_KEY=sk-ant-...

# For Google Gemini backend
export GEMINI_API_KEY=...
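
A quick pre-flight check can catch a missing key before the first request. The sketch below covers only the three variables shown above; `missing_api_key` is a hypothetical helper, and the variable names for the `groq` and `openrouter` backends are deliberately left out because they are not given here.

```python
import os
from typing import Mapping, Optional

# Only the three variables documented above; other backends are not covered.
API_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def missing_api_key(backend: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the unset variable for `backend`, or None if none is known to be missing."""
    var = API_KEY_VARS.get(backend)
    if var is None or environ.get(var):
        return None
    return var
```

For example, `missing_api_key(os.environ.get("ELIZA_RLM_BACKEND", "gemini"))` returns `"GEMINI_API_KEY"` when the default backend is selected and that variable is unset or empty.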

Usage

Python

import asyncio

from elizaos_plugin_rlm import plugin, RLMClient

# The plugin is auto-loaded by the elizaOS runtime.
# Alternatively, use the client directly:
async def main() -> None:
    client = RLMClient()
    result = await client.infer("Process this very long text...")
    print(result.text)

asyncio.run(main())

TypeScript

import { rlmPlugin, RLMClient } from "@elizaos/plugin-rlm";

// Plugin is auto-registered when loaded
// Or use client directly:
const client = new RLMClient();
const result = await client.infer("Process this very long text...");
console.log(result.text);

Rust

use elizaos_plugin_rlm::{RLMClient, RLMConfig};

// Run this inside an async function that returns a Result,
// so that `?` and `.await` are valid.
let client = RLMClient::from_env()?;
let result = client.infer("Process this very long text...".into(), None).await;
println!("{}", result.text);

Model Types

The plugin registers handlers for the following model types:

Model Type	Description
`TEXT_SMALL`	Small text generation
`TEXT_LARGE`	Large text generation
`TEXT_REASONING_SMALL`	Small reasoning model
`TEXT_REASONING_LARGE`	Large reasoning model
`TEXT_COMPLETION`	Text completion
`TEXT_RLM_LARGE`	Explicit RLM mode
`TEXT_RLM_REASONING`	Explicit RLM with reasoning

Architecture

plugins/plugin-rlm/
├── README.md                    # This file
├── python/
│   ├── pyproject.toml           # Python package config
│   ├── elizaos_plugin_rlm/
│   │   ├── __init__.py          # Package exports
│   │   ├── client.py            # RLMClient wrapper
│   │   ├── plugin.py            # Plugin definition
│   │   └── server.py            # IPC server for TS/Rust
│   └── tests/
│       ├── test_plugin.py       # Unit tests
│       └── test_integration.py  # Integration tests
├── typescript/
│   ├── package.json
│   ├── types.ts                 # TypeScript types
│   ├── client.ts                # RLMClient (IPC to Python)
│   ├── index.ts                 # Plugin definition
│   └── __tests__/
│       ├── plugin.test.ts       # Unit tests
│       └── integration.test.ts  # Integration tests
└── rust/
    ├── Cargo.toml
    ├── src/
    │   ├── lib.rs               # Plugin + exports
    │   ├── client.rs            # RLMClient (IPC)
    │   ├── types.rs             # Rust types
    │   └── error.rs             # Error handling
    └── tests/
        └── integration_tests.rs # Rust tests

Cross-Language Design

Because the official RLM library is Python-only, each language integrates with it differently:

Python: Direct integration with the rlm library
TypeScript: IPC via Python subprocess with JSON-RPC protocol
Rust: IPC via Python subprocess with JSON-RPC protocol

All implementations fall back to stub mode when the RLM backend is unavailable, returning safe placeholder responses without throwing errors.
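
To illustrate what JSON-RPC framing between a client and the Python subprocess might look like, here is a minimal sketch of building a request and reading a response. The newline-delimited framing, the method name `infer`, and the parameter shape are assumptions made for this example; the plugin's actual wire format is not documented here.

```python
import json
from typing import Any, Dict


def build_request(request_id: int, method: str, params: Dict[str, Any]) -> str:
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.dumps(message) + "\n"


def parse_response(line: str, expected_id: int) -> Any:
    """Return the `result` of a JSON-RPC 2.0 response, raising on error or id mismatch."""
    message = json.loads(line)
    if message.get("id") != expected_id:
        raise ValueError(f"unexpected response id: {message.get('id')!r}")
    if "error" in message:
        raise RuntimeError(message["error"].get("message", "unknown error"))
    return message["result"]
```

A client would write the output of `build_request(1, "infer", {"prompt": "..."})` to the subprocess's stdin and pass the next line read from its stdout to `parse_response`.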

Stub Mode

When the RLM backend is not installed or unavailable, the plugin operates in stub mode and returns a response in the following format:

{
    "text": "[RLM STUB] RLM backend not available",
    "metadata": {
        "stub": true
    }
}

This allows the plugin to be loaded and used without the RLM dependency for testing and development purposes.
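
The fallback can be pictured with a short Python sketch. `backend_installed` and `infer_or_stub` are hypothetical helpers written for this example; the sketch assumes the backend is importable as `rlm`, and the `"stub": False` metadata on real responses is likewise an assumption rather than documented behavior.

```python
import importlib.util
from typing import Any, Callable, Dict, Optional

STUB_TEXT = "[RLM STUB] RLM backend not available"


def backend_installed(module_name: str = "rlm") -> bool:
    """True if the module can be located; assumes the backend is importable as `rlm`."""
    return importlib.util.find_spec(module_name) is not None


def infer_or_stub(prompt: str, backend: Optional[Callable[[str], str]]) -> Dict[str, Any]:
    """Call `backend` if one is available; otherwise return a stub response instead of raising."""
    if backend is None:
        return {"text": STUB_TEXT, "metadata": {"stub": True}}
    return {"text": backend(prompt), "metadata": {"stub": False}}
```

Callers that need real output can check `response["metadata"]["stub"]` before using the text, which is also how a test suite can tell the two modes apart.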

Testing

Python

cd plugins/plugin-rlm/python
pip install -e ".[dev]"
pytest tests/

TypeScript

cd plugins/plugin-rlm/typescript
npm install
npm test

Rust

cd plugins/plugin-rlm/rust
cargo test

How RLM Works

RLMs treat long prompts as external objects stored in a Python REPL environment. Instead of feeding the entire input to the model, the LM:

Peeks at portions of the context
Greps for relevant information using regex
Partitions the context into manageable chunks
Maps recursive LM calls over chunks
Summarizes and aggregates results

This approach addresses "context rot" - the degradation of model performance as prompt length increases.
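
The steps above can be sketched as a toy Python program. This is a deliberately simplified illustration, not the RLM library's implementation: in a real RLM the model itself decides which operations to run inside the REPL, whereas here the partition, map, and aggregate steps are hard-coded. The `llm` callable and every function name below are hypothetical.

```python
import re
from typing import Callable, List

LLM = Callable[[str], str]  # hypothetical: takes a prompt, returns text


def peek(context: str, n: int = 200) -> str:
    """Look at the first n characters to get a feel for the structure."""
    return context[:n]


def grep(context: str, pattern: str) -> List[str]:
    """Return the lines of the context that match a regular expression."""
    return [line for line in context.splitlines() if re.search(pattern, line)]


def partition(context: str, chunk_size: int) -> List[str]:
    """Split the context into consecutive chunks of at most chunk_size characters."""
    return [context[i:i + chunk_size] for i in range(0, len(context), chunk_size)]


def toy_rlm(context: str, query: str, llm: LLM, chunk_size: int = 1000, depth: int = 1) -> str:
    """Answer `query` over `context` without sending the full context in one call."""
    if len(context) <= chunk_size or depth == 0:
        # Base case: the context fits in one call, or the depth budget is spent
        # (in which case it is truncated to chunk_size).
        return llm(f"{query}\n\n{context[:chunk_size]}")
    # Map: one recursive call per chunk. Aggregate: a final call over the partial answers.
    partials = [toy_rlm(chunk, query, llm, chunk_size, depth - 1)
                for chunk in partition(context, chunk_size)]
    return llm(f"{query}\n\nPartial answers:\n" + "\n".join(partials))
```

The `depth` parameter plays a role similar to `ELIZA_RLM_MAX_DEPTH`: with a depth of 1, each chunk is answered by a single direct call and only the top level aggregates.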

Example: Long Context Processing

# Traditional approach (fails with long context)
response = llm.completion(very_long_context + query)

# RLM approach (handles arbitrarily long context)
response = rlm.completion(very_long_context + query)
# RLM internally:
# 1. Stores context in REPL environment
# 2. Peeks at structure
# 3. Greps for relevant portions
# 4. Recursively processes chunks
# 5. Aggregates final answer

Performance

Based on the original paper's benchmarks:

Benchmark	GPT-5	RLM(GPT-5-mini)	Relative improvement
OOLONG (132k)	30%	64%	+114%
OOLONG (263k)	31%	46%	+49%
BrowseComp-Plus	~60%	100%	+67%

RLM using GPT-5-mini outperforms vanilla GPT-5 while being cheaper per query.

Limitations

Latency: RLM calls take longer due to recursive processing
Cost: Multiple LLM calls may increase API costs
Python Dependency: TypeScript and Rust rely on Python IPC
No Streaming: Streaming is not yet supported in RLM

Contributing

Contributions are welcome! Please see the main elizaOS repository for contribution guidelines.

License

MIT

Release history

2.0.0a5 (pre-release, this version): Mar 17, 2026
2.0.0a4 (pre-release): Feb 7, 2026
