
Recursive Language Models Toolkit for processing unlimited context


RLM-Toolkit


Recursive Language Models Toolkit: a high-security LangChain alternative for processing unlimited context (10M+ tokens) using recursive LLM calls.

🚀 Quick Start

pip install rlm-toolkit

from rlm_toolkit import RLM

# Simple usage with Ollama
rlm = RLM.from_ollama("llama3")

# Read the document and close the file handle when done
with open("large_document.txt", encoding="utf-8") as f:
    context = f.read()

result = rlm.run(
    context=context,
    query="What are the key findings?"
)
print(result.answer)
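
To make the idea of "recursive LLM calls" concrete, here is a minimal sketch of a divide-and-conquer strategy: input that exceeds a size limit is split in half, each half is answered recursively, and the partial answers are merged with one more call. This is an illustration only, not the toolkit's actual algorithm. The function `recursive_answer`, the `llm` callable, and the character-based `max_chars` limit are all assumptions introduced here.

```python
from typing import Callable


def recursive_answer(
    context: str,
    query: str,
    llm: Callable[[str], str],  # hypothetical: any function mapping a prompt to a reply
    max_chars: int = 2000,      # assumed size limit, measured in characters for simplicity
) -> str:
    """Answer `query` over `context`, splitting the context when it is too large."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")

    # Base case: the context fits in a single call.
    if len(context) <= max_chars:
        return llm(f"Context:\n{context}\n\nQuestion: {query}")

    # Recursive case: answer each half separately.
    mid = len(context) // 2
    left = recursive_answer(context[:mid], query, llm, max_chars)
    right = recursive_answer(context[mid:], query, llm, max_chars)

    # Merge the two partial answers with one final call.
    return llm(f"Partial answers:\n{left}\n{right}\n\nQuestion: {query}")
```

Only the current slice and two partial answers are held at each level of the recursion, which is why approaches of this kind can scale to inputs far larger than a single model call can accept.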

✨ Features

| Feature | Description |
|---|---|
| Infinite Context | Process 10M+ tokens with O(1) memory |
| InfiniRetri 🆕 | Attention-based retrieval, 100% accuracy on Needle-In-a-Haystack up to 1M+ tokens |
| H-MEM 🆕 | 4-level hierarchical memory with LLM consolidation |
| Memory Bridge 🆕 | Bi-temporal cross-session persistence (Graphiti-inspired) |
| Self-Evolving 🆕 | LLMs that improve through usage (R-Zero pattern) |
| Multi-Agent 🆕 | Decentralized P2P agents with Trust Zones |
| DSPy Optimization 🆕 | Automatic prompt optimization |
| Secure REPL | CIRCLE-compliant sandboxed code execution |
| Multi-Provider | 75 LLM providers (OpenAI, Anthropic, Google, Ollama, vLLM...) |
| Document Loaders | 135+ sources (Slack, Jira, GitHub, S3, databases...) |
| Vector Stores | 20+ stores (Pinecone, Chroma, Weaviate, pgvector...) |
| Embeddings | 15+ providers (OpenAI, BGE, E5, Jina, Cohere...) |
| Cost Control | Budget limits, cost tracking |
| Observability | OpenTelemetry, Langfuse, LangSmith, W&B (12 backends) |
| Memory Systems | Buffer, Episodic, Hierarchical (H-MEM), Memory Bridge |

📋 Full Integration Catalog: 287+ production-ready integrations

🔥 InfiniRetri (NEW)

Attention-based infinite context retrieval: 100% accuracy on Needle-In-a-Haystack up to 1M+ tokens.

from rlm_toolkit.retrieval import InfiniRetriever

# Retrieve from 1M+ token documents
retriever = InfiniRetriever("Qwen/Qwen2.5-0.5B-Instruct")
answer = retriever.retrieve(
    context=million_token_doc,
    question="What is the secret code?"
)

# Or use automatic routing in RLM
from rlm_toolkit import RLM, RLMConfig

config = RLMConfig(
    use_infiniretri=True,
    infiniretri_threshold=100_000,  # Auto-switch at 100K tokens
)
rlm = RLM.from_ollama("llama3", config=config)
result = rlm.run(huge_document, "Summarize")  # Automatically uses InfiniRetri

Based on arXiv:2502.12962. Requires pip install infini-retri.

🧠 Hierarchical Memory (H-MEM) (NEW)

Multi-level persistent memory with semantic consolidation: memories that learn and evolve.

from rlm_toolkit.memory import HierarchicalMemory, SecureHierarchicalMemory

# Basic H-MEM
hmem = HierarchicalMemory()
hmem.add_episode("User asked about weather")
hmem.add_episode("AI responded with forecast")
hmem.consolidate()  # Auto-creates traces, categories, domains

results = hmem.retrieve("weather")

# Secure H-MEM with encryption and trust zones
smem = SecureHierarchicalMemory(
    agent_id="agent-001",
    trust_zone="zone-secure"
)
smem.add_episode("Confidential data")
smem.grant_access("agent-002", "zone-secure")

4-Level Architecture:

Level 3: DOMAIN    → High-level knowledge
Level 2: CATEGORY  → Semantic categories
Level 1: TRACE     → Consolidated memories
Level 0: EPISODE   → Raw interactions

Based on arXiv H-MEM paper (July 2025)
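
The four levels can be pictured as a tree in which each consolidation step merges several nodes into one node a level higher. The sketch below shows that shape in plain Python. It is an assumption-laden illustration, not the toolkit's implementation: `Level`, `MemoryNode`, and `consolidate` are hypothetical names, and the `summarize` callable stands in for whatever LLM-based consolidation the library performs.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Callable, List


class Level(IntEnum):
    """The four levels listed above, lowest first."""
    EPISODE = 0
    TRACE = 1
    CATEGORY = 2
    DOMAIN = 3


@dataclass
class MemoryNode:
    level: Level
    text: str
    children: List["MemoryNode"] = field(default_factory=list)


def consolidate(
    nodes: List[MemoryNode],
    summarize: Callable[[List[str]], str],  # hypothetical stand-in for an LLM summarizer
) -> MemoryNode:
    """Merge same-level nodes into a single node one level higher."""
    if not nodes:
        raise ValueError("nothing to consolidate")
    level = nodes[0].level
    if any(node.level != level for node in nodes):
        raise ValueError("all nodes must share one level")
    if level == Level.DOMAIN:
        raise ValueError("DOMAIN is already the top level")
    summary = summarize([node.text for node in nodes])
    return MemoryNode(Level(level + 1), summary, list(nodes))
```

Keeping the children on each consolidated node means a retrieval hit at a high level can still be traced back to the raw episodes that produced it.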

🌉 Memory Bridge v2.1 (NEW)

Enterprise-scale cross-session persistence: zero-friction Auto-Mode with 56x token compression.

# Zero-config enterprise context (recommended)
from rlm_toolkit.memory_bridge.mcp_tools_v2 import rlm_enterprise_context

result = rlm_enterprise_context(
    query="What's the architecture of this project?",
    max_tokens=3000
)
print(result["context"])  # Semantic routing loads only relevant facts

v2.1 Features:

| Feature | Description |
|---|---|
| Auto-Mode 🆕 | Zero-config orchestration for new projects |
| Hierarchical Memory 🆕 | L0-L3 levels: Project → Domain → Module → Code |
| Semantic Routing 🆕 | 56x compression via similarity-based context loading |
| Git Auto-Extract 🆕 | Facts extracted automatically on each commit |
| Causal Reasoning 🆕 | Track decisions with reasons, constraints, alternatives |
| Smart Cold Start 🆕 | Sub-second project discovery (0.04s for 79K LOC) |
| 18 MCP Tools | Full IDE integration via Model Context Protocol |
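
Similarity-based context loading can be sketched as: score every stored fact against the query, then keep the best matches until a token budget is used up. The example below uses bag-of-words cosine similarity and a whitespace word count as a crude token estimate. Both are simplifying assumptions for illustration, and `route_context` is a hypothetical function, not the Memory Bridge API, which presumably relies on real embeddings.

```python
import math
import re
from collections import Counter
from typing import List


def _vector(text: str) -> Counter:
    """Bag-of-words term counts (a stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def route_context(query: str, facts: List[str], max_tokens: int) -> List[str]:
    """Return the most query-relevant facts that fit within `max_tokens`."""
    q = _vector(query)
    scored = [(_cosine(q, _vector(fact)), fact) for fact in facts]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    selected: List[str] = []
    used = 0
    for score, fact in scored:
        if score <= 0.0:
            break  # remaining facts share no terms with the query
        cost = len(fact.split())  # crude token estimate: one token per word
        if used + cost <= max_tokens:
            selected.append(fact)
            used += cost
    return selected
```

Compression in this picture comes from what is left out: facts with no similarity to the query never enter the prompt.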

Hierarchical Memory (L0-L3):

L0: PROJECT   → High-level architecture, tech stack
L1: DOMAIN    → Feature areas (auth, api, database)
L2: MODULE    → Per-file knowledge
L3: CODE      → Function-level facts with line refs

VS Code Extension v2.1.0:

code --install-extension rlm-toolkit-2.1.0.vsix

  • Real-time dashboard with L0-L3 visualization
  • Discover / Git Hook / Index Embeddings buttons
  • Health Check status for Memory Store and Semantic Router

Git Hook Auto-Extraction:

# Install hook for automatic fact extraction
rlm_install_git_hooks(hook_type="post-commit")
# Now every commit auto-extracts: classes, functions, major changes
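
As an illustration of what extracting classes and functions from a commit might involve, the sketch below walks a Python file's syntax tree with the standard-library `ast` module and records each definition with its line number. How the actual hook works is not described here, so treat `extract_facts` and its output format as hypothetical.

```python
import ast
from typing import Dict, List


def extract_facts(source: str) -> Dict[str, List[str]]:
    """List class and function definitions in `source`, ordered by line number."""
    classes = []
    functions = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            classes.append((node.lineno, node.name))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append((node.lineno, node.name))

    # ast.walk does not guarantee source order, so sort by line number.
    return {
        "classes": [f"{name} (line {line})" for line, name in sorted(classes)],
        "functions": [f"{name} (line {line})" for line, name in sorted(functions)],
    }
```

A post-commit hook could run a function like this over each changed file and store the results as L3 (code-level) facts with line references.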

Based on Graphiti (see the full documentation).

🧬 Self-Evolving LLMs (NEW)

LLMs that improve reasoning through usage, with no human supervision required.

from rlm_toolkit.evolve import SelfEvolvingRLM, EvolutionStrategy
from rlm_toolkit.providers import OllamaProvider

# Create self-evolving RLM
evolve = SelfEvolvingRLM(
    provider=OllamaProvider("llama3"),
    strategy=EvolutionStrategy.CHALLENGER_SOLVER
)

# Solve with self-refinement
answer = evolve.solve("What is 25 * 17?")
print(f"Answer: {answer.answer}, Confidence: {answer.confidence}")

# Run training loop (generates challenges → solves → improves)
metrics = evolve.training_loop(iterations=10, domain="math")
print(f"Success rate: {metrics.success_rate}")

Strategies:

  • SELF_REFINE: Iterative self-improvement
  • CHALLENGER_SOLVER: R-Zero co-evolutionary loop
  • EXPERIENCE_REPLAY: Learn from past solutions

Based on R-Zero (arXiv:2508.05004)
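
The simplest of these strategies, iterative self-refinement, can be sketched as a generate, critique, revise loop. The code below is a generic illustration under stated assumptions, not the library's `SELF_REFINE` implementation: `self_refine`, `generate`, and `critique` are hypothetical names, and in practice both callables would be backed by an LLM.

```python
from typing import Callable, Optional


def self_refine(
    question: str,
    generate: Callable[[str, Optional[str]], str],  # (question, feedback) -> answer
    critique: Callable[[str, str], Optional[str]],  # (question, answer) -> feedback, or None if acceptable
    max_rounds: int = 3,
) -> str:
    """Draft an answer, then revise it until the critic accepts or rounds run out."""
    answer = generate(question, None)
    for _ in range(max_rounds):
        feedback = critique(question, answer)
        if feedback is None:
            break  # the critic is satisfied
        answer = generate(question, feedback)
    return answer
```

Capping the number of rounds keeps cost bounded even when the critic never accepts an answer.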

🤖 Multi-Agent Framework (NEW)

Decentralized P2P agents inspired by Meta Matrix, with no central orchestrator bottleneck.

from rlm_toolkit.agents import AgentMessage, EvolvingAgent, MultiAgentRuntime, SecureAgent
from rlm_toolkit.providers import OllamaProvider

# LLM provider used by the evolving agent
provider = OllamaProvider("llama3")

# Create runtime
runtime = MultiAgentRuntime()

# Register agents with Trust Zones
runtime.register(SecureAgent("analyst", "Data Analyst", trust_zone="internal"))
runtime.register(EvolvingAgent("solver", "Problem Solver", llm_provider=provider))

# Run message through agents
message = AgentMessage(content="Analyze this data", routing=["analyst", "solver"])
result = runtime.run(message)

Agent Types:

  • SecureAgent: H-MEM Trust Zones integration
  • EvolvingAgent: Self-improving via R-Zero
  • SecureEvolvingAgent: Both combined

Based on Meta Matrix (arXiv 2025)

🎯 DSPy-Style Optimization (NEW)

Automatic prompt optimization: define what, not how.

from rlm_toolkit.optimize import Signature, Predict, ChainOfThought, BootstrapFewShot
from rlm_toolkit.providers import OllamaProvider

# LLM provider shared by the modules below
provider = OllamaProvider("llama3")

# Define signature
sig = Signature(
    inputs=["question", "context"],
    outputs=["answer"],
    instructions="Answer based on context"
)

# Use with Chain of Thought
cot = ChainOfThought(sig, provider)
result = cot(question="What is X?", context="X is 42")

# Auto-optimize with few-shot selection
# `examples` is your own labeled training set; the metric compares a prediction p with a gold example g
optimizer = BootstrapFewShot(metric=lambda p, g: p["answer"] == g["answer"])
optimized = optimizer.compile(Predict(sig, provider), trainset=examples)

Modules: Predict, ChainOfThought, SelfRefine
Optimizers: BootstrapFewShot, PromptOptimizer

Inspired by Stanford DSPy
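
The idea behind bootstrapped few-shot selection, as popularized by DSPy, is to run the program over a training set and keep as demonstrations only the examples on which its prediction passes the metric. The sketch below shows that selection step in isolation. It is not the toolkit's `BootstrapFewShot` code: `bootstrap_demos`, its parameters, and the dict-based example format are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence

Example = Dict[str, str]


def bootstrap_demos(
    predict: Callable[[Example], Example],    # hypothetical: maps input fields to output fields
    trainset: Sequence[Example],
    metric: Callable[[Example, Example], bool],  # (prediction, gold) -> passed?
    input_keys: Sequence[str],
    max_demos: int = 4,
) -> List[Example]:
    """Collect up to `max_demos` training examples the predictor gets right."""
    demos: List[Example] = []
    for gold in trainset:
        if len(demos) >= max_demos:
            break
        # Pass only the input fields so the predictor never sees the gold answer.
        inputs = {key: gold[key] for key in input_keys}
        prediction = predict(inputs)
        if metric(prediction, gold):
            demos.append(gold)
    return demos
```

The selected demonstrations would then be prepended to the prompt, so the optimized program sees worked examples it is known to handle correctly.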

📦 Installation

# Basic
pip install rlm-toolkit

# With all providers
pip install "rlm-toolkit[all]"

# Development
pip install -e ".[dev]"

🔧 Usage

Basic

from rlm_toolkit import RLM, RLMConfig

# With configuration
config = RLMConfig(
    max_iterations=50,
    max_cost=5.0,  # USD
)

rlm = RLM.from_openai("gpt-4o", config=config)
result = rlm.run(context, query)
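
A budget limit such as `max_cost` amounts to accumulating the cost of each call and refusing any call that would exceed the cap. The sketch below shows one way that check could look. `BudgetGuard` and `BudgetExceeded` are hypothetical names and not part of rlm_toolkit, and the per-1K-token prices are parameters you would supply yourself.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a call would push spending past the configured limit."""


class BudgetGuard:
    """Hypothetical helper (not an rlm_toolkit class): tracks spend against a USD cap."""

    def __init__(self, max_cost_usd: float) -> None:
        if max_cost_usd < 0:
            raise ValueError("max_cost_usd must be non-negative")
        self.max_cost_usd = max_cost_usd
        self.spent_usd = 0.0

    def charge(
        self,
        input_tokens: int,
        output_tokens: int,
        usd_per_1k_input: float,
        usd_per_1k_output: float,
    ) -> float:
        """Record one call's cost and return total spend, or raise if over budget."""
        cost = (input_tokens / 1000) * usd_per_1k_input
        cost += (output_tokens / 1000) * usd_per_1k_output
        if self.spent_usd + cost > self.max_cost_usd:
            raise BudgetExceeded(
                f"call costing ${cost:.4f} would exceed the ${self.max_cost_usd:.2f} budget"
            )
        self.spent_usd += cost
        return self.spent_usd
```

Checking before adding the cost means a rejected call leaves the running total unchanged, so the caller can fall back to a cheaper model and continue.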

With Memory

from rlm_toolkit import RLM
from rlm_toolkit.memory import EpisodicMemory

memory = EpisodicMemory(max_entries=1000)
rlm = RLM.from_ollama("llama3", memory=memory)

# Memory persists across runs
result1 = rlm.run(doc1, "Summarize this")
result2 = rlm.run(doc2, "Compare with previous")

With Observability

from rlm_toolkit import RLM
from rlm_toolkit.observability import Tracer, CostTracker

tracer = Tracer(service_name="my-app")
cost_tracker = CostTracker(budget=10.0)

rlm = RLM.from_openai("gpt-4o", tracer=tracer, cost_tracker=cost_tracker)

🔒 Security

RLM-Toolkit implements CIRCLE-compliant security with v1.2.1 hardening:

  • AES-256-GCM: Mandatory authenticated encryption for all persistent data
  • Fail-Closed: No XOR fallback; raises error if cryptography unavailable
  • Rate Limiting: 60s cooldown on reindex to prevent I/O exhaustion
  • AST Analysis: Block dangerous imports before execution
  • Sandboxed REPL: Isolated code execution with timeouts
  • Virtual Filesystem: Quota-enforced file operations
  • Attack Detection: Obfuscation and indirect attack patterns

from rlm_toolkit import RLMConfig, SecurityConfig

config = RLMConfig(
    security=SecurityConfig(
        sandbox=True,
        max_execution_time=30.0,
        max_memory_mb=512,
    )
)
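
To illustrate the AST analysis item above, the sketch below parses untrusted code with the standard-library `ast` module and reports imports of blocked modules before anything is executed. It is a minimal illustration rather than the toolkit's checker: `find_blocked_imports` and the `BLOCKED_MODULES` set are assumptions, and a static import scan alone does not catch dynamic tricks such as `__import__` or `importlib`.

```python
import ast
from typing import FrozenSet, List

# Illustrative blocklist only; a real policy would be broader.
BLOCKED_MODULES: FrozenSet[str] = frozenset({"os", "subprocess", "socket", "ctypes"})


def find_blocked_imports(
    source: str, blocked: FrozenSet[str] = BLOCKED_MODULES
) -> List[str]:
    """Return one message per import statement that names a blocked module."""
    violations: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # module is None for "from . import x"
        else:
            continue
        for name in names:
            # Compare the top-level package, so "os.path" matches "os".
            if name.split(".")[0] in blocked:
                violations.append(f"line {node.lineno}: import of '{name}'")
    return violations
```

Code would be handed to the sandbox only when the returned list is empty, which matches the fail-closed stance described above.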

📊 Benchmarks

Based on RLM paper methodology:

| Benchmark | Score |
|---|---|
| OOLONG-Pairs | TBD |
| CIRCLE Security | ~95% |

🛠️ CLI

# Run a query
rlm run --model ollama:llama3 --context file.txt --query "Summarize"

# Interactive REPL
rlm repl --model openai:gpt-4o

# Cost tracking
rlm trace --session latest

📚 Documentation

v2.1.0: 162 files (81 EN + 81 RU), NIOKR 10/10

| Category | EN | RU |
|---|---|---|
| Concepts | 25 | 25 |
| Tutorials | 13 | 13 |
| Examples | 10 | 10 |
| How-To | 20 | 20 |
| Reference | 6 | 6 |
| Memory Bridge | 7 | 7 |

🤝 Contributing

# Clone repo
git clone https://github.com/DmitrL-dev/AISecurity.git
cd AISecurity/sentinel-community/rlm-toolkit

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Lint
ruff check rlm_toolkit/

📄 License

Apache 2.0 (see LICENSE)

🙏 Acknowledgments

