groknroll - CLI coding agent with unlimited context via Recursive Language Models (RLM). Local, unlimited context, autonomous.
Recursive Language Models (RLMs)
Full Paper • Blogpost • Documentation • RLM Minimal
Overview
Recursive Language Models (RLMs) are a task-agnostic inference paradigm that lets language models (LMs) handle near-infinite-length contexts by programmatically examining, decomposing, and recursively calling themselves over their input. RLMs replace the canonical llm.completion(prompt, model) call with an rlm.completion(prompt, model) call. RLMs offload the context into a variable in a REPL environment that the LM can interact with and launch sub-LM calls inside.
This repository provides an extensible inference engine for using RLMs around standard API-based and local LLMs. The initial experiments and idea were proposed in a blogpost in 2025, with expanded results in an arXiv preprint.
Architecture: Context Layer vs Coding Layer
IMPORTANT: RLM is used ONLY for context/memory management, NOT for coding.
┌─────────────────────────────────────────────────┐
│ RLM (Context Layer) │
│ • Store context at 20% threshold │
│ • Retrieve relevant context │
│ • Manage conversation history │
│ • NO coding, NO tool execution, NO decisions │
└─────────────────────┬───────────────────────────┘
│ context
▼
┌─────────────────────────────────────────────────┐
│ User-Selected LLM/API (Coding Layer) │
│ • Any provider: OpenAI, Anthropic, Google, etc │
│ • Execute tools (bash, read, write, edit) │
│ • Generate code, make decisions │
└─────────────────────────────────────────────────┘
Key Points:
- RLM = Context management only (store, retrieve, persist)
- Coding = User-selected LLM/API (OpenAI, Anthropic, Google, Groq, local models, etc.)
- Context is persisted to RLM database at 20% of context window (no compaction/summarization)
- Full context preserved in queryable format for true unlimited context
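To make the 20% persistence rule concrete, here is a minimal sketch of the threshold check the context layer performs. The function name and numbers are illustrative assumptions, not groknroll's actual API:

```python
# Hypothetical sketch of the context layer's persistence rule: once token
# usage reaches 20% of the model's context window, the full context is
# written to the RLM store instead of being summarized or compacted.

def should_persist(used_tokens: int, context_window: int, threshold: float = 0.20) -> bool:
    """Return True once usage reaches the persistence threshold."""
    return used_tokens >= context_window * threshold

# With a 128k-token window, persistence starts at 25,600 tokens.
print(should_persist(30_000, 128_000))  # True
print(should_persist(10_000, 128_000))  # False
```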
[!NOTE] This repository contains inference code for RLMs with support for various sandbox environments. Open-source contributions are welcome. This repository is maintained by the authors of the paper from the MIT OASYS lab.
Installation
groknroll is now available on PyPI! 🎉
pip install groknroll
To install the latest from main:
pip install git+https://github.com/tekcin/groknroll.git
⚠️ Installation Troubleshooting
If you get an error like Requires-Python >=3.11, you need Python 3.11 or higher:
# Option 1: Use the project's virtual environment (RECOMMENDED)
cd /Users/claude/RLM/rlm
source .venv/bin/activate
uv pip install groknroll
# Option 2: Install Python 3.11+ system-wide
brew install python@3.11
python3.11 -m venv groknroll-env
source groknroll-env/bin/activate
pip install groknroll
See QUICK_FIX.md and INSTALLATION_FIX_GUIDE.md for detailed troubleshooting.
Quick Setup
Set up the dependencies with uv (or your virtual environment of choice):
curl -LsSf https://astral.sh/uv/install.sh | sh
uv init && uv venv --python 3.12 # change version as needed
uv pip install -e .
This project includes a Makefile to simplify common tasks.
- make install: Install base dependencies.
- make check: Run linter, formatter, and tests.
For a quick test, the following runs an RLM query with the OpenAI client using your OPENAI_API_KEY environment variable (feel free to change this). It generates console output as well as a log file you can open in the visualizer to explore trajectories.
make quickstart
The default RLM client uses a REPL environment that runs on the host process through Python exec calls. It uses the same virtual environment as the host process (i.e. it will have access to the same dependencies), but with some limitations in its available global modules. As an example, we can call RLM completions using GPT-5-nano:
from groknroll import RLM
rlm = RLM(
backend="openai",
backend_kwargs={"model_name": "gpt-5-nano"},
verbose=True, # For printing to console with rich, disabled by default.
)
print(rlm.completion("Print me the first 100 powers of two, each on a newline.").response)
Oracle Agent - Codebase Knowledge System 🔮
The Oracle Agent is an RLM-powered tool that has unlimited context and knows everything about your codebase. It can answer any question about your code by leveraging RLM's infinite context capabilities.
from groknroll import OracleAgent
# Initialize Oracle for your project
oracle = OracleAgent(
project_path=".",
backend="openai",
model="gpt-4o-mini"
)
# Ask any question about your codebase
response = oracle.ask("Where is the RLM class defined?")
print(response.answer)
# Use convenience methods
oracle.find_class("RLM")
oracle.find_function("completion")
oracle.get_architecture_overview()
oracle.how_to_add_feature("support for Claude AI backend")
Features:
- Unlimited Context: Handles arbitrarily large codebases via RLM
- Automatic Indexing: Parses files, functions, classes, imports using AST
- Semantic Understanding: Understands what your code does, not just keywords
- Comprehensive Answers: Detailed explanations with code examples and sources
See ORACLE_AGENT.md for full documentation.
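The automatic indexing described above can be approximated with the standard library's ast module. The sketch below is illustrative only, not the Oracle's actual indexer:

```python
import ast

def index_source(source: str) -> dict:
    """Collect class and function names from Python source (illustrative)."""
    tree = ast.parse(source)
    return {
        "classes": [n.name for n in ast.walk(tree) if isinstance(n, ast.ClassDef)],
        "functions": [n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)],
    }

sample = "class RLM:\n    def completion(self):\n        pass\n"
print(index_source(sample))  # {'classes': ['RLM'], 'functions': ['completion']}
```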
OSINT Agent - Dark Web Reconnaissance 🔍
The OSINT Agent provides dark web reconnaissance and threat intelligence gathering capabilities through Tor proxy integration.
# Quick scan across 16 dark web search engines
groknroll osint scan "ransomware lockbit"
# Deep investigation with LLM analysis
groknroll osint deep "threat actor APT29" --threads 10
# Check Tor connectivity
groknroll osint status
# Extract artifacts from a file
groknroll osint extract scraped_data.txt --type email
Python API:
from groknroll.agents import OsintAgent
from pathlib import Path
# Initialize OSINT agent
osint = OsintAgent(
project_path=Path("."),
backend="anthropic",
model="claude-sonnet-4-20250514"
)
# Run investigation
result = osint.investigate("ransomware lockbit", deep=True)
print(f"Found {result.artifacts_count} artifacts")
print(f"Report saved to: {result.report_path}")
Features:
- 16 Dark Web Engines: Search Ahmia, Tor66, and 14 more via Tor
- Multi-threaded Scraping: Concurrent content extraction
- Artifact Extraction: Emails, Bitcoin/Ethereum addresses, .onion domains, IPs
- LLM Analysis: Query refinement, result filtering, intelligence reports
- Markdown Reports: Comprehensive investigation documentation
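Artifact extraction of this kind typically boils down to regular-expression matching. The patterns below are a simplified sketch for illustration, not groknroll's actual extractor:

```python
import re

# Simplified patterns for illustration; production extractors are stricter.
ARTIFACT_PATTERNS = {
    "email": r"[\w.+-]+@[\w-]+\.[\w.-]+",
    "onion": r"\b[a-z2-7]{16,56}\.onion\b",
}

def extract_artifacts(text: str) -> dict:
    """Return all matches per artifact type found in the text."""
    return {name: re.findall(pattern, text) for name, pattern in ARTIFACT_PATTERNS.items()}

sample = "contact admin@example.com or visit abcdefghij234567.onion"
print(extract_artifacts(sample))
```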
Requirements: Tor daemon running on localhost:9050
Tor Setup
macOS (Homebrew):
brew install tor
brew services start tor
Linux (Debian/Ubuntu):
sudo apt install tor
sudo systemctl start tor
Linux (Fedora/RHEL):
sudo dnf install tor
sudo systemctl start tor
Verify Tor is running:
groknroll osint status
# Should show: ✅ Tor proxy connected (127.0.0.1:9050)
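Under the hood, a connectivity check like this amounts to verifying that something is listening on the Tor SOCKS port. A minimal sketch of that idea (not groknroll's actual status command):

```python
import socket

def tor_proxy_reachable(host: str = "127.0.0.1", port: int = 9050, timeout: float = 2.0) -> bool:
    """Return True if a listener accepts connections on the Tor SOCKS port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if tor_proxy_reachable():
    print("Tor proxy connected (127.0.0.1:9050)")
else:
    print("Tor proxy not reachable; is the daemon running?")
```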
Threat Blocklist (Safe Search)
OSINT searches automatically filter dangerous sites including ransomware infrastructure, zero-click exploits, malware distribution, and illegal content:
from groknroll.osint import (
OSINTBlocklist,
ThreatCategory,
is_url_safe,
check_url_threat
)
# Check if a URL is safe
if is_url_safe("http://example.onion"):
print("URL appears safe")
else:
threat = check_url_threat("http://example.onion")
print(f"BLOCKED: {threat['category']} - {threat['description']}")
# Use blocklist directly
blocklist = OSINTBlocklist()
# Filter search results
safe_results, blocked = blocklist.filter_results(search_results)
print(f"Blocked {len(blocked)} dangerous sites")
# View tracked threat actors
print(blocklist.get_threat_actors())
# ['BlackCat/ALPHV', 'Clop', 'Conti', 'LockBit', 'NSO Group', ...]
Blocked Categories:
| Category | Description |
|---|---|
| RANSOMWARE_INFRASTRUCTURE | LockBit, BlackCat, Conti, REvil, Hive, Clop, etc. |
| ZERO_CLICK_EXPLOIT | Pegasus/NSO Group, Predator, Candiru infrastructure |
| MALWARE_DISTRIBUTION | Emotet, QakBot, TrickBot, IcedID, LummaC2 |
| COMMAND_AND_CONTROL | Known C2 servers and beacons |
| EXPLOIT_KIT | Angler, RIG, Magnitude exploit kits |
| ILLEGAL_CONTENT | CSAM, trafficking (NEVER access) |
Sources: CISA advisories, FBI Flash reports, Citizen Lab, Amnesty International, community threat feeds.
Enhanced OSINT Agent (Browser Automation)
For interactive dark web sites, the Enhanced OSINT Agent provides LLM-controlled browser automation via Tor:
# Install browser dependencies
pip install groknroll[browser]
playwright install chromium
from groknroll.agents import EnhancedOsintAgent
from pathlib import Path
import asyncio
# Initialize enhanced agent
agent = EnhancedOsintAgent(
project_path=Path("."),
backend="anthropic",
model="claude-sonnet-4-20250514"
)
# Deep investigation with browser automation
async def investigate():
result = await agent.deep_investigate(
"threat actor APT29",
use_browser=True, # Enable browser automation
parallel=True, # Use multi-agent hierarchy
)
print(f"Browser evidence: {len(result.browser_evidence)} pages captured")
print(f"Agents used: {result.total_agents_used}")
return result
asyncio.run(investigate())
Enhanced Features:
- Browser Automation: LLM-controlled headless Chromium via Tor
- Multi-Agent Hierarchy: Spawn subordinate agents for parallel investigations
- Evidence Capture: Automatic screenshots and content archival
- Secrets Management: Secure handling of credentials
Multi-Agent Hierarchy 🤖
groknroll supports hierarchical multi-agent systems inspired by agent-zero:
from groknroll.agents import HierarchicalAgent, AgentRole, DEFAULT_PROFILES
from groknroll.core import AgentConfig
from pathlib import Path
# Create root agent
config = AgentConfig(
name="lead",
description="Lead investigator",
capabilities=[],
model="gpt-4o"
)
agent = HierarchicalAgent(
config=config,
project_path=Path("."),
profile=DEFAULT_PROFILES[AgentRole.PLANNER],
)
# Spawn specialized subordinates
async def investigate():
# Delegate to researcher
task = await agent.delegate_task(
"Research the codebase architecture",
role=AgentRole.RESEARCHER,
wait=True
)
print(f"Result: {task.result.message}")
# Delegate to developer
dev_task = await agent.delegate_task(
"Implement the feature",
role=AgentRole.DEVELOPER,
wait=True
)
Available Profiles:
| Role | Description |
|---|---|
| DEFAULT | General purpose assistant |
| DEVELOPER | Code-focused with full write access |
| RESEARCHER | Information gathering and analysis |
| HACKER | Security testing (authorized only) |
| ANALYST | Data and intelligence analysis |
| PLANNER | Strategic planning and architecture |
Extension Hook System 🔌
Customize agent behavior with lifecycle hooks:
from groknroll.core import Extension, HookPoint, HookContext, register_extension
class MyExtension(Extension):
name = "my_extension"
priority = 50
hooks = [HookPoint.BEFORE_LLM_CALL]
async def execute(self, context: HookContext) -> HookContext:
# Inject custom context before LLM calls
prompt = context.get("prompt", "")
context.set("prompt", f"{prompt}\n\nRemember to be concise.")
return context
# Register globally
register_extension(MyExtension())
Available Hook Points:
- AGENT_INIT, AGENT_CLEANUP - Agent lifecycle
- MONOLOGUE_START, MONOLOGUE_END - Task execution
- BEFORE_LLM_CALL, AFTER_LLM_CALL - LLM interactions
- BEFORE_TOOL_EXECUTION, AFTER_TOOL_EXECUTION - Tool calls
- CONTEXT_THRESHOLD_REACHED - Memory management
- ERROR_OCCURRED, INTERVENTION_REQUESTED - Error handling
Vector Memory System 🧠
Semantic search memory for agents:
from groknroll.core import VectorMemory, MemoryArea
from pathlib import Path
# Initialize memory
memory = VectorMemory(storage_path=Path("./memory"))
# Save memories
memory.save(
content="The API endpoint is at /api/v1/users",
area=MemoryArea.SOLUTIONS,
metadata={"source": "documentation"}
)
# Search semantically
results = memory.search(
query="user API endpoint",
limit=5,
threshold=0.7
)
for item in results:
print(f"[{item.relevance_score:.2f}] {item.content}")
# Consolidate similar memories
consolidated = memory.consolidate(similarity_threshold=0.9)
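Consolidation merges memories whose embeddings are nearly identical; the comparison behind a similarity_threshold like 0.9 is typically cosine similarity. The sketch below illustrates the measure, not VectorMemory's internals:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Near-duplicate embeddings score close to 1.0 and would be consolidated.
print(cosine_similarity([1.0, 0.0, 1.0], [1.0, 0.1, 1.0]))
```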
Secrets Management 🔐
Secure handling of sensitive data with streaming-safe masking:
from groknroll.core import SecretsManager, StreamingSecretsFilter
# Initialize from environment
secrets = SecretsManager(auto_load_env=True)
# Add manual secrets
secrets.add_secret("API_KEY", "sk-abc123secret")
# Mask in output
text = "My API key is sk-abc123secret"
safe_text = secrets.mask(text)
# Output: "My API key is §§secret(API_KEY)"
# Streaming filter (for LLM responses)
filter = StreamingSecretsFilter()
filter.add_secret("PASSWORD", "hunter2")
for chunk in llm_stream:
safe_chunk = filter.process_chunk(chunk)
print(safe_chunk, end="")
# Flush remaining buffer
print(filter.flush())
2600 Mode - Security Research 🔒
Named after the legendary 2600 Magazine, this mode provides security research tools for:
- CTF competitions and challenges
- Personal password/wallet recovery
- Binary analysis and reverse engineering
- Malware analysis (static)
IMPORTANT: For authorized security research only.
Quick Start
# Initialize with authorization context
groknroll 2600 init --auth ctf --scope "PicoCTF 2024"
# Check status and available tools
groknroll 2600 status
Hash Tools
# Identify hash type
groknroll 2600 hash identify "5d41402abc4b2a76b9719d911017c592"
# Crack hash (personal recovery only)
groknroll 2600 hash crack "5d41402abc4b2a76b9719d911017c592" -w /path/to/wordlist.txt
# Generate hash for testing
groknroll 2600 hash generate "hello" --type sha256
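Hash identification largely comes down to the digest's length and alphabet. A simplified sketch of the idea (not groknroll's identify command; real identifiers check many more features):

```python
import hashlib

def guess_hash_types(digest: str) -> list[str]:
    """Guess algorithms from hex-digest length (simplified; lengths collide)."""
    if not all(c in "0123456789abcdef" for c in digest.lower()):
        return ["not a hex digest"]
    by_length = {32: ["MD5", "NTLM"], 40: ["SHA-1"], 64: ["SHA-256"], 128: ["SHA-512"]}
    return by_length.get(len(digest), ["unknown"])

digest = "5d41402abc4b2a76b9719d911017c592"
print(guess_hash_types(digest))                     # ['MD5', 'NTLM']
print(hashlib.md5(b"hello").hexdigest() == digest)  # True: it's md5("hello")
```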
Binary Analysis
# Analyze binary
groknroll 2600 analyze /path/to/binary --strings --functions
# Disassemble function
groknroll 2600 disasm /path/to/binary --function main
Malware Analysis (Static)
# Analyze sample (no execution)
groknroll 2600 malware /path/to/sample.exe
# With VirusTotal check
groknroll 2600 malware /path/to/sample.exe --virustotal
CTF Utilities
# Decode (tries all encodings)
groknroll 2600 ctf decode "SGVsbG8gV29ybGQ="
# Caesar cipher bruteforce
groknroll 2600 ctf caesar "Uryyb Jbeyq"
# XOR single-byte bruteforce
groknroll 2600 ctf xor "48656c6c6f"
# Generate cyclic pattern for buffer overflow
groknroll 2600 ctf pattern 200
# Find offset in pattern
groknroll 2600 ctf offset "aaaabaaacaaa" "0x61616166"
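The pattern/offset pair implements the classic Metasploit-style cyclic pattern technique: every 3-byte chunk is unique, so the value found in a crashed register pinpoints the overflow offset. A minimal sketch of the idea (hypothetical helper names, not groknroll's implementation):

```python
import string
from itertools import product

def cyclic_pattern(length: int) -> str:
    """Build a non-repeating 'Aa0Aa1Aa2...' pattern of the given length."""
    chunks = ("".join(t) for t in product(string.ascii_uppercase,
                                          string.ascii_lowercase,
                                          string.digits))
    out, total = [], 0
    for chunk in chunks:
        out.append(chunk)
        total += 3
        if total >= length:
            break
    return "".join(out)[:length]

def pattern_offset(pattern: str, needle: str) -> int:
    """Offset at which the overwritten value appears in the pattern."""
    return pattern.find(needle)

p = cyclic_pattern(200)
print(len(p))                    # 200
print(pattern_offset(p, "Aa5"))  # 15: each chunk is 3 bytes
```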
Python API
from groknroll.mode2600 import (
Mode2600, SecurityContext, AuthorizationType,
identify_hash, crack_hash,
BinaryAnalyzer, analyze_binary,
MalwareSandbox, analyze_sample,
CTFHelper, CryptoSolver, PwnHelper,
)
# Set authorization context
mode = Mode2600()
mode.set_context(SecurityContext(
authorization=AuthorizationType.CTF_COMPETITION,
scope="CTF challenge",
user_consent=True,
))
# Identify and crack hash
result = identify_hash("5d41402abc4b2a76b9719d911017c592")
print(f"Type: {result.possible_types[0].value}")
# Analyze binary
analyzer = BinaryAnalyzer()
info = analyzer.analyze("/path/to/binary")
print(f"Format: {info.format.value}, Arch: {info.architecture.value}")
# CTF crypto
solver = CryptoSolver()
for shift, plaintext in solver.caesar_bruteforce("Uryyb"):
print(f"ROT{shift}: {plaintext}")
Optional Security Tools Installation
2600 mode works with basic tools (file, strings, objdump), but for full functionality, install these security tools:
macOS (Homebrew):
# Core tools
brew install radare2 binwalk yara ssdeep nmap
# Hash cracking
brew install hashcat john-jumbo
# Additional analysis
brew install ghidra # GUI decompiler (requires Java)
Linux (Debian/Ubuntu):
# Core tools
sudo apt update
sudo apt install radare2 binwalk yara ssdeep nmap file
# Hash cracking
sudo apt install hashcat john
# Additional analysis
sudo apt install ghidra # Or download from ghidra-sre.org
# ROPgadget (via pip)
pip install ROPgadget
Linux (Fedora/RHEL):
sudo dnf install radare2 binwalk yara ssdeep nmap hashcat john
# ROPgadget
pip install ROPgadget
Python tools (all platforms):
pip install pwntools ropper capstone keystone-engine unicorn
Verify installation:
groknroll 2600 status
# Shows available/missing tools
| Category | Tools | Purpose |
|---|---|---|
| Hash cracking | hashcat, john | Password recovery |
| Binary analysis | radare2, objdump, ghidra | Disassembly, decompilation |
| Malware | yara, ssdeep, binwalk | Signature matching, fuzzy hashing |
| Network | nmap, wireshark | Reconnaissance |
| CTF/Pwn | ROPgadget, pwntools | Exploit development |
REPL Environments
We support two types of REPL environments: isolated and non-isolated. Non-isolated environments (the default) execute code on the same machine as the RLM (e.g. through exec), which is reasonable for low-risk local tasks like simple benchmarking, but can be problematic if prompts or tool calls are exposed to malicious users. Fully isolated environments use cloud-based sandboxes (e.g. Prime Sandboxes, Modal Sandboxes) to run code generated by the RLM, ensuring complete isolation from the host process. Environments can be added, but we natively support the following: local (default), docker, modal, and prime.
rlm = RLM(
environment="...", # "local", "docker", "modal", "prime"
environment_kwargs={...},
)
Local Environments
The default local environment LocalREPL runs in the same process as the RLM itself, with restricted global and local namespaces providing only minimal isolation. Using this REPL is generally safe, but it should not be used in production settings. It also shares the same virtual environment (e.g. Conda or uv) as the host process.
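Conceptually, the LocalREPL approach looks like plain exec() with dedicated namespaces, with the long context stored as a variable the model can inspect programmatically. This is an illustrative sketch, not the actual implementation:

```python
# The prompt's long context lives as a variable in the REPL namespace, so
# model-generated code can inspect it instead of reading it in full.
repl_globals: dict = {}
repl_locals = {"context": "a very long document the model will examine"}

# Code emitted by the model is executed against those namespaces.
exec("result = len(context.split())", repl_globals, repl_locals)
print(repl_locals["result"])  # word count of the stored context
```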
Docker
(requires Docker installed)
We also support a Docker-based environment called DockerREPL that runs the REPL inside a Docker container. By default, we use the python:3.11-slim image, but you can specify a custom image as well.
Isolated Environments
We support several different REPL environments that run on separate, cloud-based machines. Whenever a recursive sub-call is made in these instances, it is requested from the host process.
Modal Sandboxes 
To use Modal Sandboxes as the REPL environment, you need to install and authenticate your Modal account.
uv add modal # add modal library
modal setup # authenticate account
Prime Intellect Sandboxes 
[!NOTE] Prime Intellect Sandboxes are currently a beta feature. See the documentation for more information. We noticed slow runtimes when using these sandboxes, which is currently an open issue.
To use Prime Sandboxes, install the SDK and set your API key:
uv pip install -e ".[prime]"
export PRIME_API_KEY=...
Model Providers
We currently support most major clients (OpenAI, Anthropic), as well as the router platforms (OpenRouter, Portkey, LiteLLM). For local models, we recommend using vLLM (which interfaces with the OpenAI client). To view or add support for more clients, start by looking at rlm/clients/.
Interactive Provider Configuration
Use the /connect command in the REPL to interactively configure LLM providers:
groknroll
> /connect # Interactive provider picker
> /connect openai # Configure specific provider
> /providers # List configured providers
Supported Providers:
| Provider | Env Variable | Models |
|---|---|---|
| openai | OPENAI_API_KEY | GPT-4o, o1 |
| anthropic | ANTHROPIC_API_KEY | Claude 3.5 Sonnet |
| gemini | GEMINI_API_KEY | Gemini 2.0 Flash |
| xai | XAI_API_KEY | Grok-2 |
| groq | GROQ_API_KEY | Llama 3.3 70B |
| openrouter | OPENROUTER_API_KEY | 100+ models |
| azure | AZURE_OPENAI_API_KEY | Azure OpenAI |
API keys are saved to .env and validated with a test API call.
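Persisting a key to .env can be as simple as rewriting the matching line. The sketch below is a naive illustration of that step only; it is not /connect's actual logic, which also validates the key with a test API call:

```python
from pathlib import Path
import tempfile

def save_env_var(env_path: Path, name: str, value: str) -> None:
    """Insert or replace NAME=value in a .env file (naive sketch)."""
    lines = []
    if env_path.exists():
        lines = [line for line in env_path.read_text().splitlines()
                 if not line.startswith(f"{name}=")]
    lines.append(f"{name}={value}")
    env_path.write_text("\n".join(lines) + "\n")

# Demo in a temporary directory so no real .env file is touched.
with tempfile.TemporaryDirectory() as tmp:
    env = Path(tmp) / ".env"
    save_env_var(env, "OPENAI_API_KEY", "sk-old")
    save_env_var(env, "OPENAI_API_KEY", "sk-new")  # replaces, not duplicates
    print(env.read_text())
```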
Relevant Reading
- [Dec '25] Recursive Language Models arXiv
- [Oct '25] Recursive Language Models Blogpost
If you use this code or repository in your research, please cite:
@misc{zhang2025recursivelanguagemodels,
title={Recursive Language Models},
author={Alex L. Zhang and Tim Kraska and Omar Khattab},
year={2025},
eprint={2512.24601},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2512.24601},
}
Optional Debugging: Visualizing RLM Trajectories
We additionally provide a simple visualizer tool to examine and view the code, sub-LM, and root-LM calls of an RLM trajectory. To save log files (.jsonl) on every completion call that can be viewed in the visualizer, initialize the RLMLogger object and pass it into the RLM on initialization:
from groknroll.logger import RLMLogger
from groknroll import RLM
logger = RLMLogger(log_dir="./logs")
rlm = RLM(
...,  # other constructor kwargs
logger=logger
)
To run the visualizer locally, we use Node.js and shadcn/ui:
cd visualizer/
npm install
npm run dev # default localhost:3001
You'll then have the option to select saved .jsonl log files to explore.