
MCP server for recursive LLM reasoning—load context, iterate with search/code/think tools, converge on answers


Aleph

License: MIT · Python 3.10+

Aleph is an MCP (Model Context Protocol) server that enables AI assistants to analyze documents too large for their context window. By implementing a Recursive Language Model (RLM) approach, it allows models to search, explore, and compute over massive datasets without filling up the context window.

Key Capabilities

  • Unlimited Context: Load files as large as your system RAM allows—gigabytes of data accessible via simple queries. The LLM never sees the raw file; it queries a Python process that holds the data in memory.
  • Navigation Tools: High-performance regex search and line-based navigation.
  • Compute Sandbox: Execute Python code over loaded content for parsing and analysis.
  • Evidence Tracking: Automatic citation of source text for grounded answers.
  • Recursive Reasoning: Spawn sub-agents to process document chunks in parallel.

How "Unlimited Context" Works

Traditional LLMs are limited by their context window (~200K tokens). Aleph sidesteps this entirely:

┌─────────────────┐     queries      ┌─────────────────────────┐
│   LLM Context   │ ───────────────► │   Python Process (RAM)  │
│   (~200K tokens)│ ◄─────────────── │   (8GB, 32GB, 64GB...)  │
│                 │   small results  │   └── your_file.txt     │
└─────────────────┘                  └─────────────────────────┘
  • Python loads the entire file into RAM as a string
  • The LLM queries it via search(), peek(), lines(), etc.
  • Only query results (kilobytes) enter the LLM's context—never the full file
  • Your RAM is the limit, not the model's context window (with a default 1GB safety cap on action tools)

You can load multiple files or entire repos as separate contexts and query them independently.

A 50MB log file? The LLM sees ~1KB of search results. A 2GB database dump? Same—just the slices you ask for.
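
The pattern is easy to see in miniature. The sketch below is illustrative only, not Aleph's actual internals (the real tools add caps, citations, and chunking):

import re

contexts = {}  # external memory: whole files live here, outside the LLM

def load_context(name, path):
    with open(path, errors="replace") as f:
        contexts[name] = f.read()  # the entire file sits in process RAM

def search(name, pattern, window=80, limit=5):
    text = contexts[name]
    hits = []
    for m in re.finditer(pattern, text):
        hits.append(text[max(m.start() - window, 0):m.end() + window])
        if len(hits) >= limit:
            break
    return hits  # a few kilobytes, no matter how large the file is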

By default, Aleph caps read operations at 1GB to avoid accidental overload; raise the cap with --max-file-size to suit your machine. This limit applies to load_file / read_file; load_context still accepts any content you can supply in memory. Writes are capped at 100MB by default, adjustable with --max-write-bytes.
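
Conceptually the cap is just a size check before the read; here is a hedged sketch (the limit value is the documented default, but Aleph's actual enforcement may differ):

import os

MAX_FILE_SIZE = 1 << 30  # 1 GiB, the documented default for read operations

def check_readable(path, limit=MAX_FILE_SIZE):
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"{path}: {size} bytes exceeds the {limit}-byte cap")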

Installation

pip install "aleph-rlm[mcp]"

After installation, you can automatically configure popular MCP clients:

aleph-rlm install

MCP Server

Run Aleph as an MCP server with:

aleph

Use --enable-actions to allow file and command tools.

Full Power Mode

For maximum capability with minimal setup, use this configuration:

aleph --enable-actions --workspace-mode any --tool-docs concise

This enables:

  • All action tools (read_file, write_file, run_command, run_tests)
  • Any git repo access (not limited to a single workspace root)
  • Concise tool descriptions (cleaner MCP tool list)

Or in MCP client config:

{
  "mcpServers": {
    "aleph": {
      "command": "aleph",
      "args": ["--enable-actions", "--workspace-mode", "any", "--tool-docs", "concise"]
    }
  }
}

For higher limits:

aleph --enable-actions --workspace-mode any --tool-docs concise --timeout 120 --max-output 50000

Integration

Claude Desktop / Cursor / Windsurf

Add Aleph to your mcpServers configuration:

{
  "mcpServers": {
    "aleph": {
      "command": "aleph",
      "args": ["--enable-actions", "--tool-docs", "concise"]
    }
  }
}

Install the /aleph skill for the RLM workflow prompt:

mkdir -p ~/.claude/commands
cp /path/to/aleph/docs/prompts/aleph.md ~/.claude/commands/aleph.md

Then use it like:

/aleph: Find the root cause of this test failure and propose a fix.

Claude Code

To use Aleph with Claude Code, register the MCP server and install the workflow prompt:

# Register the MCP server
claude mcp add aleph aleph -- --enable-actions --tool-docs concise

# Add the workflow prompt
mkdir -p ~/.claude/commands
cp docs/prompts/aleph.md ~/.claude/commands/aleph.md

Codex CLI

Add to ~/.codex/config.toml:

[mcp_servers.aleph]
command = "aleph"
args = ["--enable-actions", "--tool-docs", "concise"]

How It Works

  1. Load: Store a document in external memory via load_context or load_file (with --enable-actions).
  2. Explore: Search for patterns using search_context or view slices with peek_context.
  3. Compute: Run Python scripts over the content in a secure sandbox via exec_python.
  4. Finalize: Generate an answer with linked evidence and citations using finalize.
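
Put together, a session might look like the sketch below. The tool names are the real ones; call_tool is a hypothetical stand-in for an MCP client call, and the argument names are illustrative, not the exact tool schemas:

def call_tool(name, **kwargs):
    # Hypothetical helper: a real MCP client would dispatch to the server.
    print(f"-> {name}({kwargs})")

call_tool("load_context", name="report", content="...big document...")       # 1. Load
call_tool("search_context", name="report", pattern=r"ERROR|Traceback")       # 2. Explore
call_tool("peek_context", name="report", start_line=1200, end_line=1260)
call_tool("exec_python", code="error_count = content.count('ERROR')")        # 3. Compute
call_tool("finalize", answer="Root cause: ...", evidence=["log line 1234"])  # 4. Finalize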

Recursion: Handling Very Large Inputs

When content is too large even for slice-based exploration, Aleph supports recursive decomposition:

  1. Chunk the content into manageable pieces
  2. Spawn sub-agents to analyze each chunk
  3. Synthesize findings into a final answer

# exec_python
chunks = chunk(100_000)  # split into ~100K char pieces
results = [sub_query("Extract key findings.", context_slice=c) for c in chunks]
final = sub_query("Synthesize into a summary:", context_slice="\n\n".join(results))

sub_query can use an API backend (OpenAI-compatible) or spawn a local CLI (Claude, Codex, Gemini), whichever is available.

Sub-query backends

When ALEPH_SUB_QUERY_BACKEND is auto (the default), Aleph chooses the first available backend, in this order (a sketch follows the list):

  1. API - if API credentials are available
  2. claude CLI - if installed
  3. codex CLI - if installed
  4. gemini CLI - if installed
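
The env var and priority order are as documented; the probing logic here is an illustration, not Aleph's actual implementation:

import os
import shutil

def pick_backend():
    # Mirrors the documented priority; real selection logic may differ.
    if os.environ.get("ALEPH_SUB_QUERY_API_KEY"):
        return "api"
    for cli in ("claude", "codex", "gemini"):
        if shutil.which(cli):  # CLI found on PATH
            return cli
    raise RuntimeError("no sub_query backend available")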

Quick setup:

# OpenAI-compatible API (OpenAI, Groq, Together, local LLMs, etc.)
export ALEPH_SUB_QUERY_API_KEY=sk-...
export ALEPH_SUB_QUERY_MODEL=gpt-5.2-codex

# Optional: custom endpoint
export ALEPH_SUB_QUERY_URL=https://api.your-provider.com/v1

Note: Some MCP clients don't reliably pass env vars from their config to the server process. If sub_query reports "API key not found" despite your client's MCP settings, add the exports to your shell profile (~/.zshrc or ~/.bashrc) and restart your terminal/client.

For a full list of options, see docs/CONFIGURATION.md.

Available Tools

Aleph exposes the full toolset below.

Core exploration

  • load_context: Store text or JSON in external memory.
  • list_contexts: List loaded contexts and metadata.
  • peek_context: View specific line or character ranges.
  • search_context: Perform regex searches with surrounding context.
  • chunk_context: Split content into navigable chunks.
  • diff_contexts: Diff two contexts (text or JSON).
  • exec_python: Run Python code over the loaded content (see the sketch after this list).
  • get_variable: Retrieve a variable from the exec_python sandbox.
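
As an illustration of exec_python, ordinary Python can parse a loaded context. This sketch is self-contained; in a real session the loaded text would come from a context rather than the inline stand-in string:

from collections import Counter

content = "INFO start\nERROR boom\nINFO done\n"  # stand-in for a loaded context

levels = Counter(line.split(" ", 1)[0] for line in content.splitlines())
print(levels)  # Counter({'INFO': 2, 'ERROR': 1})

A variable left behind this way (levels here) could then be fetched with get_variable.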

Reasoning workflow

  • think: Structure reasoning for complex problems.
  • get_status: Show current session state.
  • get_evidence: Retrieve collected citations.
  • evaluate_progress: Self-evaluate progress with convergence tracking.
  • summarize_so_far: Summarize progress on long tasks.
  • finalize: Complete with answer and evidence.

Recursion

  • sub_query: Spawn a sub-agent on a content slice.

Session management

  • save_session: Persist current session to file.
  • load_session: Load a saved session from file.

Remote MCP orchestration

  • add_remote_server: Register a remote MCP server.
  • list_remote_servers: List registered remote MCP servers.
  • list_remote_tools: List tools available on a remote server.
  • call_remote_tool: Call a tool on a remote MCP server.
  • close_remote_server: Close a remote MCP server connection.

Action tools

Enabled with the --enable-actions flag. Use --workspace-root and --workspace-mode (fixed, git, any) to control scope.

  • load_file: Load a workspace file into a context.
  • read_file / write_file: File system access (workspace-scoped).
  • run_command: Shell execution.
  • run_tests: Execute test commands (supports optional cwd).

Configuration

For full configuration options (limits, budgets, and backend details), see docs/CONFIGURATION.md.

Changelog

0.5.6

  • Removed deprecated recipe workflow and aider backend references.
  • Added Gemini CLI sub-query backend and updated backend priority docs.
  • Improved sub-query system prompt for structured output.
  • Added Full Power Mode docs and made installer defaults max power.
  • Added --max-write-bytes and aligned file size limits across docs.
  • Clarified action-tool file size caps and workspace mode usage.

Development

git clone https://github.com/Hmbown/aleph.git
cd aleph
pip install -e ".[dev,mcp]"
pytest

See DEVELOPMENT.md for architecture details.

References

Aleph implements the Recursive Language Model (RLM) architecture described in:

Zhang, A. L., Kraska, T., & Khattab, O. (2025). Recursive Language Models. arXiv:2512.24601.

RLMs treat the input context as an external environment variable rather than part of the prompt. This allows models to programmatically decompose inputs, recursively query themselves over chunks, and synthesize results—processing inputs far beyond their native context window.

License

MIT
