Recursive Language Models Toolkit for processing unlimited context
RLM-Toolkit
Recursive Language Models Toolkit: a high-security LangChain alternative for processing unlimited context (10M+ tokens) using recursive LLM calls.
Quick Start
pip install rlm-toolkit
from rlm_toolkit import RLM
# Simple usage with Ollama
rlm = RLM.from_ollama("llama3")
result = rlm.run(
    context=open("large_document.txt").read(),
    query="What are the key findings?"
)
print(result.answer)
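To give a feel for what "recursive LLM calls" can mean, here is a minimal sketch. It is not the toolkit's actual algorithm: `recursive_answer`, `stub_llm`, and the `chunk_chars` parameter are hypothetical names made up for illustration, and the model is replaced by a stub so the snippet runs offline.

```python
from typing import Callable


def recursive_answer(context: str, query: str, llm: Callable[[str], str],
                     chunk_chars: int = 2000) -> str:
    """Answer `query` over `context` by recursing on chunks (illustrative only)."""
    # Base case: the context fits into a single model call.
    if len(context) <= chunk_chars:
        return llm(f"Context:\n{context}\n\nQuestion: {query}")
    # Recursive case: answer every chunk, then recurse on the joined partial answers.
    chunks = [context[i:i + chunk_chars] for i in range(0, len(context), chunk_chars)]
    partials = [recursive_answer(chunk, query, llm, chunk_chars) for chunk in chunks]
    merged = "\n".join(partials)
    if len(merged) >= len(context):
        # Guard (assumption): truncation keeps the sketch terminating when
        # the partial answers do not shrink the text.
        merged = merged[:chunk_chars]
    return recursive_answer(merged, query, llm, chunk_chars)


def stub_llm(prompt: str) -> str:
    # Stand-in for a real model call; it only reports how much text it was shown.
    return f"saw {len(prompt)} chars"


answer = recursive_answer("lorem ipsum " * 1000, "What are the key findings?", stub_llm)
print(answer)
```

Each call only ever sees `chunk_chars` characters plus the question, which is the property that lets the approach scale past a single context window.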
Features
| Feature | Description |
|---|---|
| Infinite Context | Process 10M+ tokens with O(1) memory |
| InfiniRetri | Attention-based retrieval, 100% accuracy on 1M+ tokens |
| H-MEM | 4-level hierarchical memory with LLM consolidation |
| Memory Bridge | Bi-temporal cross-session persistence (Graphiti-inspired) |
| Self-Evolving | LLMs that improve through usage (R-Zero pattern) |
| Multi-Agent | Decentralized P2P agents with Trust Zones |
| DSPy Optimization | Automatic prompt optimization |
| Secure REPL | CIRCLE-compliant sandboxed code execution |
| Multi-Provider | 75 LLM providers (OpenAI, Anthropic, Google, Ollama, vLLM...) |
| Document Loaders | 135+ sources (Slack, Jira, GitHub, S3, databases...) |
| Vector Stores | 20+ stores (Pinecone, Chroma, Weaviate, pgvector...) |
| Embeddings | 15+ providers (OpenAI, BGE, E5, Jina, Cohere...) |
| Cost Control | Budget limits, cost tracking |
| Observability | OpenTelemetry, Langfuse, LangSmith, W&B (12 backends) |
| Memory Systems | Buffer, Episodic, Hierarchical (H-MEM), Memory Bridge |
Full Integration Catalog: 287+ production-ready integrations
InfiniRetri (NEW)
Attention-based infinite context retrieval: 100% accuracy on Needle-In-a-Haystack up to 1M+ tokens.
from rlm_toolkit.retrieval import InfiniRetriever
# Retrieve from 1M+ token documents
retriever = InfiniRetriever("Qwen/Qwen2.5-0.5B-Instruct")
answer = retriever.retrieve(
    context=million_token_doc,
    question="What is the secret code?"
)
# Or use automatic routing in RLM
from rlm_toolkit import RLM, RLMConfig
config = RLMConfig(
    use_infiniretri=True,
    infiniretri_threshold=100_000,  # Auto-switch at 100K tokens
)
rlm = RLM.from_ollama("llama3", config=config)
result = rlm.run(huge_document, "Summarize") # Automatically uses InfiniRetri
Based on arXiv:2502.12962. Requires `pip install infini-retri`.
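The automatic routing above boils down to a size check. The sketch below shows one way such a check could look; `estimate_tokens`, `choose_strategy`, the 4-characters-per-token heuristic, and the `>=` comparison are all assumptions for illustration, not the toolkit's real logic.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token for English text.
    return max(1, len(text) // 4)


def choose_strategy(context: str, threshold: int = 100_000) -> str:
    """Return which path a router might pick for a context of this size."""
    return "infiniretri" if estimate_tokens(context) >= threshold else "standard"


print(choose_strategy("short note"))     # small input stays on the standard path
print(choose_strategy("a" * 400_000))    # about 100K estimated tokens
```

A real router would count tokens with the model's own tokenizer rather than a character heuristic.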
Hierarchical Memory (H-MEM) (NEW)
Multi-level persistent memory with semantic consolidation: memories that learn and evolve.
from rlm_toolkit.memory import HierarchicalMemory, SecureHierarchicalMemory
# Basic H-MEM
hmem = HierarchicalMemory()
hmem.add_episode("User asked about weather")
hmem.add_episode("AI responded with forecast")
hmem.consolidate() # Auto-creates traces, categories, domains
results = hmem.retrieve("weather")
# Secure H-MEM with encryption and trust zones
smem = SecureHierarchicalMemory(
    agent_id="agent-001",
    trust_zone="zone-secure"
)
smem.add_episode("Confidential data")
smem.grant_access("agent-002", "zone-secure")
4-Level Architecture:
Level 3: DOMAIN - High-level knowledge
Level 2: CATEGORY - Semantic categories
Level 1: TRACE - Consolidated memories
Level 0: EPISODE - Raw interactions
Based on arXiv H-MEM paper (July 2025)
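The four levels can be pictured as a small data structure. This is a toy sketch only: `TinyHMem` and `Episode` are hypothetical names, and the category and domain labels are supplied by hand here, whereas the real H-MEM is described as using LLM consolidation to derive them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Episode:
    text: str
    category: str  # hand-labelled in this sketch
    domain: str    # hand-labelled in this sketch


class TinyHMem:
    def __init__(self):
        self.episodes = []                   # Level 0: raw interactions
        self.traces = {}                     # Level 1: category -> consolidated text
        self.categories = defaultdict(set)   # Level 2: domain -> category names
        self.domains = []                    # Level 3: sorted domain names

    def add_episode(self, text, category, domain):
        self.episodes.append(Episode(text, category, domain))

    def consolidate(self):
        # Group raw episodes by category and roll the labels up level by level.
        grouped = defaultdict(list)
        for episode in self.episodes:
            grouped[episode.category].append(episode.text)
            self.categories[episode.domain].add(episode.category)
        self.traces = {cat: " | ".join(texts) for cat, texts in grouped.items()}
        self.domains = sorted(self.categories)

    def retrieve(self, keyword):
        # Search the consolidated traces (Level 1) rather than every raw episode.
        return [t for t in self.traces.values() if keyword.lower() in t.lower()]


mem = TinyHMem()
mem.add_episode("User asked about weather", "weather", "smalltalk")
mem.add_episode("AI responded with forecast", "weather", "smalltalk")
mem.consolidate()
print(mem.retrieve("forecast"))
```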
Memory Bridge v2.1 (NEW)
Enterprise-scale cross-session persistence: zero-friction Auto-Mode with 56x token compression.
# Zero-config enterprise context (recommended)
from rlm_toolkit.memory_bridge.mcp_tools_v2 import rlm_enterprise_context
result = rlm_enterprise_context(
    query="What's the architecture of this project?",
    max_tokens=3000
)
print(result["context"]) # Semantic routing loads only relevant facts
v2.1 Features:
| Feature | Description |
|---|---|
| Auto-Mode | Zero-config orchestration for new projects |
| Hierarchical Memory | L0-L3 levels: Project → Domain → Module → Code |
| Semantic Routing | 56x compression via similarity-based context loading |
| Git Auto-Extract | Facts extracted automatically on each commit |
| Causal Reasoning | Track decisions with reasons, constraints, alternatives |
| Smart Cold Start | Sub-second project discovery (0.04s for 79K LOC) |
| 18 MCP Tools | Full IDE integration via Model Context Protocol |
Hierarchical Memory (L0-L3):
L0: PROJECT - High-level architecture, tech stack
L1: DOMAIN - Feature areas (auth, api, database)
L2: MODULE - Per-file knowledge
L3: CODE - Function-level facts with line refs
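Similarity-based context loading, as named in the Semantic Routing row, can be sketched with nothing but the standard library. The example below swaps real embeddings for a bag-of-words cosine similarity and counts whitespace-separated words as tokens; `route_facts` and its helpers are hypothetical names, not Memory Bridge functions.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def route_facts(query, facts, max_tokens):
    """Pick the most query-similar facts that fit into the token budget."""
    q = bag_of_words(query)
    scored = [(cosine(q, bag_of_words(fact)), fact) for fact in facts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    picked, used = [], 0
    for score, fact in scored:
        cost = len(fact.split())  # crude token estimate (assumption)
        if score > 0 and used + cost <= max_tokens:
            picked.append(fact)
            used += cost
    return picked


facts = [
    "The API layer uses FastAPI with JWT auth",
    "Database migrations run through Alembic",
    "The project architecture is a modular monolith",
]
print(route_facts("What's the architecture of this project?", facts, max_tokens=8))
```

Only facts that share vocabulary with the query and fit the budget are loaded; shared stop words such as "the" still count as overlap in this simplified scorer, which an embedding model would avoid.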
VS Code Extension v2.1.0:
code --install-extension rlm-toolkit-2.1.0.vsix
- Real-time dashboard with L0-L3 visualization
- Discover / Git Hook / Index Embeddings buttons
- Health Check status for Memory Store and Semantic Router
Git Hook Auto-Extraction:
# Install hook for automatic fact extraction
rlm_install_git_hooks(hook_type="post-commit")
# Now every commit auto-extracts: classes, functions, major changes
Based on Graphiti. Full Documentation
Self-Evolving LLMs (NEW)
LLMs that improve reasoning through usage, with no human supervision required.
from rlm_toolkit.evolve import SelfEvolvingRLM, EvolutionStrategy
from rlm_toolkit.providers import OllamaProvider
# Create self-evolving RLM
evolve = SelfEvolvingRLM(
    provider=OllamaProvider("llama3"),
    strategy=EvolutionStrategy.CHALLENGER_SOLVER
)
# Solve with self-refinement
answer = evolve.solve("What is 25 * 17?")
print(f"Answer: {answer.answer}, Confidence: {answer.confidence}")
# Run training loop (generates challenges -> solves -> improves)
metrics = evolve.training_loop(iterations=10, domain="math")
print(f"Success rate: {metrics.success_rate}")
Strategies:
- SELF_REFINE: Iterative self-improvement
- CHALLENGER_SOLVER: R-Zero co-evolutionary loop
- EXPERIENCE_REPLAY: Learn from past solutions
Based on R-Zero (arXiv:2508.05004)
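As a rough picture of the SELF_REFINE idea, the loop below drafts an answer, asks a critic for feedback, and retries until the critic is satisfied. `self_refine`, `propose`, and `critique` are hypothetical names, and both roles are played by deterministic stubs rather than a model.

```python
from typing import Callable, Optional


def self_refine(task: str,
                propose: Callable[[str, Optional[str]], str],
                critique: Callable[[str, str], Optional[str]],
                max_rounds: int = 3) -> str:
    """Draft, critique, and redraft until the critic returns no feedback."""
    answer = propose(task, None)
    for _ in range(max_rounds):
        feedback = critique(task, answer)
        if feedback is None:  # critic is satisfied
            break
        answer = propose(task, feedback)
    return answer


def propose(task, feedback):
    # Stub solver: the first draft is deliberately wrong, the retry is right.
    return "415" if feedback is None else "425"


def critique(task, answer):
    # Stub critic: checks the arithmetic directly instead of asking a model.
    return None if int(answer) == 25 * 17 else "recheck the multiplication"


print(self_refine("What is 25 * 17?", propose, critique))
```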
Multi-Agent Framework (NEW)
Decentralized P2P agents inspired by Meta Matrix, with no central orchestrator bottleneck.
from rlm_toolkit.agents import MultiAgentRuntime, SecureAgent, EvolvingAgent
# Create runtime
runtime = MultiAgentRuntime()
# Register agents with Trust Zones
# `provider` is an LLM provider instance, e.g. OllamaProvider("llama3") from rlm_toolkit.providers
runtime.register(SecureAgent("analyst", "Data Analyst", trust_zone="internal"))
runtime.register(EvolvingAgent("solver", "Problem Solver", llm_provider=provider))
# Run message through agents
from rlm_toolkit.agents import AgentMessage
message = AgentMessage(content="Analyze this data", routing=["analyst", "solver"])
result = runtime.run(message)
Agent Types:
- SecureAgent: H-MEM Trust Zones integration
- EvolvingAgent: Self-improving via R-Zero
- SecureEvolvingAgent: Both combined
Based on Meta Matrix (arXiv 2025)
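One way to read "no central orchestrator" is that the message carries its own route and each agent hands it directly to the next peer. The sketch below illustrates that reading; `PeerAgent`, `Message`, and the rule that an agent only accepts messages from its own trust zone are assumptions for illustration, not the framework's classes or policy.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Message:
    content: str
    routing: List[str]        # remaining hops, consumed as the message travels
    zone: str = "internal"    # trust zone the message belongs to
    trail: List[str] = field(default_factory=list)  # agents that handled it


class PeerAgent:
    def __init__(self, name: str, trust_zone: str, work: Callable[[str], str]):
        self.name = name
        self.trust_zone = trust_zone
        self.work = work
        self.peers: Dict[str, "PeerAgent"] = {}

    def receive(self, message: Message) -> Message:
        # Assumed rule: an agent only handles messages from its own trust zone.
        if message.zone != self.trust_zone:
            raise PermissionError(f"{self.name} rejects zone {message.zone!r}")
        message.content = self.work(message.content)
        message.trail.append(self.name)
        if message.routing:
            # Forward straight to the next peer; no orchestrator in between.
            next_hop = message.routing.pop(0)
            return self.peers[next_hop].receive(message)
        return message


analyst = PeerAgent("analyst", "internal", lambda text: text + " [analyzed]")
solver = PeerAgent("solver", "internal", lambda text: text + " [solved]")
analyst.peers["solver"] = solver

result = analyst.receive(Message("Analyze this data", routing=["solver"]))
print(result.content, result.trail)
```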
DSPy-Style Optimization (NEW)
Automatic prompt optimization: define what, not how.
from rlm_toolkit.optimize import Signature, Predict, ChainOfThought, BootstrapFewShot
# Define signature
sig = Signature(
    inputs=["question", "context"],
    outputs=["answer"],
    instructions="Answer based on context"
)
# Use with Chain of Thought (`provider` is an LLM provider instance, e.g. OllamaProvider("llama3"))
cot = ChainOfThought(sig, provider)
result = cot(question="What is X?", context="X is 42")
# Auto-optimize with few-shot selection (`examples` is your own training set)
optimizer = BootstrapFewShot(metric=lambda p, g: p["answer"] == g["answer"])
optimized = optimizer.compile(Predict(sig, provider), trainset=examples)
Modules: Predict, ChainOfThought, SelfRefine
Optimizers: BootstrapFewShot, PromptOptimizer
Inspired by Stanford DSPy
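The few-shot selection step can be approximated in a few lines: run the predictor over the training set and keep the examples whose prediction passes the metric. `bootstrap_few_shot` and the lookup-table predictor below are hypothetical stand-ins for illustration; this is neither DSPy's nor the toolkit's implementation.

```python
def bootstrap_few_shot(predict, trainset, metric, max_demos=4):
    """Keep training examples the predictor already gets right, to use as demos."""
    demos = []
    for example in trainset:
        prediction = predict(example)
        if metric(prediction, example):
            demos.append(example)
        if len(demos) >= max_demos:
            break
    return demos


trainset = [
    {"question": "2+2", "answer": "4"},
    {"question": "3*3", "answer": "9"},
    {"question": "10/4", "answer": "2.5"},
]

# Stub predictor: a fixed lookup that gets one question wrong on purpose.
guesses = {"2+2": "4", "3*3": "6", "10/4": "2.5"}


def predict(example):
    return {"answer": guesses[example["question"]]}


demos = bootstrap_few_shot(predict, trainset,
                           metric=lambda p, g: p["answer"] == g["answer"])
print([demo["question"] for demo in demos])
```

The selected demos would then be prepended to the prompt as worked examples.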
Installation
# Basic
pip install rlm-toolkit
# With all providers
pip install "rlm-toolkit[all]"  # quoted so shells like zsh do not expand the brackets
# Development
pip install -e ".[dev]"
Usage
Basic
from rlm_toolkit import RLM, RLMConfig
# With configuration
config = RLMConfig(
    max_iterations=50,
    max_cost=5.0,  # USD
)
rlm = RLM.from_openai("gpt-4o", config=config)
result = rlm.run(context, query)
With Memory
from rlm_toolkit.memory import EpisodicMemory
memory = EpisodicMemory(max_entries=1000)
rlm = RLM.from_ollama("llama3", memory=memory)
# Memory persists across runs
result1 = rlm.run(doc1, "Summarize this")
result2 = rlm.run(doc2, "Compare with previous")
With Observability
from rlm_toolkit.observability import Tracer, CostTracker
tracer = Tracer(service_name="my-app")
cost_tracker = CostTracker(budget=10.0)
rlm = RLM.from_openai("gpt-4o", tracer=tracer, cost_tracker=cost_tracker)
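For intuition, a budget limit is just a running total that refuses the next charge once it would overshoot. `SimpleCostTracker`, `BudgetExceeded`, and the per-1K-token prices below are hypothetical; the toolkit's own `CostTracker` may work differently.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spending past the budget."""


class SimpleCostTracker:
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def charge(self, prompt_tokens: int, completion_tokens: int,
               usd_per_1k_prompt: float, usd_per_1k_completion: float) -> float:
        # Cost of one call, priced per 1,000 tokens (prices are made up here).
        cost = (prompt_tokens / 1000 * usd_per_1k_prompt
                + completion_tokens / 1000 * usd_per_1k_completion)
        if self.spent_usd + cost > self.budget_usd:
            raise BudgetExceeded(
                f"charge of ${cost:.4f} exceeds remaining "
                f"${self.budget_usd - self.spent_usd:.4f}"
            )
        self.spent_usd += cost
        return cost


tracker = SimpleCostTracker(budget_usd=0.012)
tracker.charge(1000, 500, usd_per_1k_prompt=0.002, usd_per_1k_completion=0.006)
print(f"spent so far: ${tracker.spent_usd:.4f}")
```

Checking before adding means a rejected call leaves the running total unchanged.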
Security
RLM-Toolkit implements CIRCLE-compliant security with v1.2.1 hardening:
- AES-256-GCM: Mandatory authenticated encryption for all persistent data
- Fail-Closed: No XOR fallback; raises error if cryptography unavailable
- Rate Limiting: 60s cooldown on reindex to prevent I/O exhaustion
- AST Analysis: Block dangerous imports before execution
- Sandboxed REPL: Isolated code execution with timeouts
- Virtual Filesystem: Quota-enforced file operations
- Attack Detection: Obfuscation and indirect attack patterns
from rlm_toolkit import RLMConfig, SecurityConfig
config = RLMConfig(
    security=SecurityConfig(
        sandbox=True,
        max_execution_time=30.0,
        max_memory_mb=512,
    )
)
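The AST Analysis bullet can be illustrated with Python's standard `ast` module. The sketch below parses source without running it and reports imports from a denylist; `find_blocked_imports` and the `BLOCKED_MODULES` set are hypothetical and much simpler than a production check.

```python
import ast

# Illustrative denylist (assumption); a real policy would be broader.
BLOCKED_MODULES = {"os", "subprocess", "socket", "ctypes"}


def find_blocked_imports(source: str, blocked=BLOCKED_MODULES) -> list:
    """Return imported module names whose top-level package is blocked."""
    tree = ast.parse(source)  # parses only; nothing is executed
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        hits.extend(name for name in names if name.split(".")[0] in blocked)
    return hits


untrusted = "import os.path\nfrom subprocess import run\nimport math\n"
print(sorted(find_blocked_imports(untrusted)))
```

A static import scan like this misses dynamic tricks such as `__import__("os")`, which is why it is paired with sandboxing and attack detection in the list above.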
Benchmarks
Based on RLM paper methodology:
| Benchmark | Score |
|---|---|
| OOLONG-Pairs | TBD |
| CIRCLE Security | ~95% |
CLI
# Run a query
rlm run --model ollama:llama3 --context file.txt --query "Summarize"
# Interactive REPL
rlm repl --model openai:gpt-4o
# Cost tracking
rlm trace --session latest
Documentation
v2.1.0: 162 files (81 EN + 81 RU), NIOKR 10/10
| Category | EN | RU |
|---|---|---|
| Concepts | 25 | 25 |
| Tutorials | 13 | 13 |
| Examples | 10 | 10 |
| How-To | 20 | 20 |
| Reference | 6 | 6 |
| Memory Bridge | 7 | 7 |
- Quickstart
- Tutorials
- Security Guide
- Memory Bridge v2.1: Enterprise memory with 18 MCP tools
- MCP Server: IDE integration
- VS Code Extension: Dashboard v2.1.0
- Certification Checklist
- Examples
Contributing
# Clone repo
git clone https://github.com/DmitrL-dev/AISecurity.git
cd AISecurity/sentinel-community/rlm-toolkit
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/ -v
# Lint
ruff check rlm_toolkit/
License
Apache 2.0, see LICENSE
Acknowledgments
- Alex Zhang: Original RLM concept author (Blog, arXiv:2512.24601, October 2025)
- Prime Intellect: RLM research and verifiers implementation
- CIRCLE Benchmark: Security evaluation methodology
- InfiniRetri: Attention-based infinite context retrieval
- H-MEM: Hierarchical memory architecture
- SENTINEL Community: Security-first implementation and documentation