Recursive Language Models with DSPy + Modal and an integrated Web UI for secure long-context code execution
fleet-rlm
Secure, cloud-sandboxed Recursive Language Models (RLM) with DSPy and Modal.
fleet-rlm gives AI agents a secure cloud sandbox for long-context code and document work, with a Web UI-first experience, recursive delegation, and DSPy-aligned tooling.
Paper | Docs | Contributing
Quick Start
Install and launch the Web UI in under a minute:
# Option 1: install as a runnable tool
uv tool install fleet-rlm
fleet web
Or in your active environment:
# Option 2: regular environment install
uv pip install fleet-rlm
fleet web
Open http://localhost:8000 in your browser.
fleet web is the primary interactive interface. The published package already includes the built frontend assets, so end users do not need bun or a separate frontend toolchain.
What You Get
- Browser-first RLM chat (fleet web)
- Secure Modal-backed long-context execution for code/doc workflows
- WS-first runtime streaming for chat and execution events
- Runtime configuration and diagnostics from the Web UI settings
- Optional MCP server surface (fleet-rlm serve-mcp)
Common Commands
# Standalone terminal chat
fleet-rlm chat --trace-mode compact
# Explicit API server
fleet-rlm serve-api --port 8000
# MCP server
fleet-rlm serve-mcp --transport stdio
# Scaffold assets for Claude Code
fleet-rlm init --list
Runtime Notes
- Product chat transport is WS-first (/api/v1/ws/chat).
- Runtime model updates from Settings are hot-applied in-process (/api/v1/runtime/settings) and reflected on /api/v1/runtime/status.
- Secret inputs in Runtime Settings are write-only.
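To confirm that a settings change took effect, one option is to read the status endpoint after saving. The sketch below uses only the Python standard library. The endpoint path comes from the notes above; the helper names (runtime_url, fetch_runtime_status), the use of a plain GET, the JSON response body, and the absence of auth headers are all assumptions for local development, not documented behavior.

```python
import json
import urllib.request

API_PREFIX = "/api/v1/runtime"  # path prefix taken from the notes above


def runtime_url(base_url: str, endpoint: str) -> str:
    """Join the server base URL with a runtime endpoint name (hypothetical helper)."""
    return f"{base_url.rstrip('/')}{API_PREFIX}/{endpoint.lstrip('/')}"


def fetch_runtime_status(base_url: str = "http://localhost:8000", timeout: float = 5.0):
    """Fetch the runtime status and decode it as JSON.

    Assumption: the endpoint answers an unauthenticated GET with a JSON body.
    The response schema is not described here, so the result is returned as-is.
    """
    url = runtime_url(base_url, "status")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running locally, calling fetch_runtime_status() and printing the result should show whatever the status endpoint reports; check the HTTP API reference for the actual fields.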
Running From Source (Contributors)
# from repo root
uv sync --extra dev --extra server
uv run fleet web
uv run fastapi dev
For release/packaging workflows, uv build now runs the frontend build sync automatically (this requires bun in repo checkouts that include src/frontend).
For the full contributor setup and quality gates, see AGENTS.md and CONTRIBUTING.md.
Architecture Overview
Read this after the quick start if you want the full system picture (entry points, ReAct orchestration, tools, Modal execution, persistent storage).
graph TB
subgraph entry ["🚪 Entry Points"]
CLI["fleet / fleet-rlm CLI"]
WebUI["Web UI<br/>(React SPA)"]
API["FastAPI<br/>(WS/REST)"]
TUI["Ink TUI<br/>(standalone runtime)"]
MCP["MCP Server"]
end
subgraph orchestration ["🧠 Orchestration Layer"]
Agent["RLMReActChatAgent<br/>(dspy.Module)"]
LMs["Planner / Delegate LMs"]
History["Chat History"]
Memory["Core Memory<br/>(Persona/Human/Scratchpad)"]
DocCache["Document Cache"]
end
subgraph tools ["🔧 ReAct Tools"]
DocTools["📄 load_document<br/>read_file_slice<br/>chunk_by_*"]
RecursiveTools["🔄 rlm_query<br/>llm_query<br/>(recursive delegation)"]
ExecTools["⚡ execute_code<br/>edit_file<br/>search_code"]
end
subgraph execution ["⚙️ Execution Layer"]
Interpreter["ModalInterpreter<br/>(JSON protocol)"]
Profiles["Execution Profiles:<br/>ROOT | DELEGATE | MAINTENANCE"]
end
subgraph cloud ["☁️ Cloud & Persistence"]
Sandbox["Modal Sandbox<br/>(Python REPL + Driver)"]
Volume[("💾 Modal Volume<br/>/data/<br/>• workspaces<br/>• docs/metadata")]
Neon[("🐘 Neon Postgres<br/>• runs / steps<br/>• artifacts<br/>• tenants")]
PostHog["📈 PostHog<br/>(LLM Observability)"]
end
WebUI -->|"WS / REST"| API
CLI --> Agent
API --> Agent
TUI --> Agent
MCP --> Agent
Agent --> LMs
Agent --> History
Agent --> Memory
Agent --> DocCache
Agent --> DocTools
Agent --> RecursiveTools
Agent --> ExecTools
API -.->|"Persistence"| Neon
Agent -.->|"Traces"| PostHog
DocTools --> Interpreter
RecursiveTools --> Interpreter
ExecTools --> Interpreter
Interpreter --> Profiles
Interpreter -->|"stdin/stdout<br/>JSON commands"| Sandbox
Sandbox -->|"read/write"| Volume
style entry fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
style orchestration fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
style tools fill:#fff3e0,stroke:#f57c00,stroke-width:2px
style execution fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
style cloud fill:#fce4ec,stroke:#c2185b,stroke-width:2px
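The edge from ModalInterpreter to the Modal Sandbox in the diagram is labeled as JSON commands over stdin/stdout. The following is a minimal sketch of that general pattern, not the project's actual driver: the message fields (id, code, ok, stdout, error) and the function names handle_command and serve are hypothetical, and the real protocol may differ.

```python
import contextlib
import io
import json


def handle_command(line: str, namespace: dict) -> str:
    """Run one JSON command and return one JSON reply line.

    Hypothetical message shape: {"id": ..., "code": "..."} in,
    {"id": ..., "ok": bool, "stdout": str, "error": str or None} out.
    """
    request = json.loads(line)
    captured = io.StringIO()
    error = None
    try:
        # Capture anything the snippet prints so it can travel back as JSON.
        with contextlib.redirect_stdout(captured):
            exec(request["code"], namespace)  # namespace persists, like a REPL
    except Exception as exc:
        error = f"{type(exc).__name__}: {exc}"
    return json.dumps(
        {
            "id": request.get("id"),
            "ok": error is None,
            "stdout": captured.getvalue(),
            "error": error,
        }
    )


def serve(stdin, stdout) -> None:
    """Driver loop: one JSON command per input line, one JSON reply per output line."""
    namespace: dict = {}
    for line in stdin:
        if line.strip():
            stdout.write(handle_command(line, namespace) + "\n")
            stdout.flush()
```

A line-delimited protocol like this keeps the sandbox side simple: the caller writes a command, reads exactly one reply line, and variables defined by earlier commands remain available to later ones. Passing sys.stdin and sys.stdout to serve would turn it into a standalone driver process.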
Docs and Guides
- Documentation index
- Explanation index
- Quick install + setup
- Configure Modal
- Runtime settings (LM/Modal diagnostics)
- Deploying the server
- Using the MCP server
- CLI reference
- HTTP API reference
- Auth modes
- Database architecture
- Source layout
Advanced Features (Docs-First)
fleet-rlm also supports runtime diagnostics endpoints, WebSocket execution streams (/api/v1/ws/execution), multi-tenant Neon-backed persistence, and opt-in PostHog LLM analytics. Those workflows are documented in the guides/reference docs rather than front-loaded here.
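When scripting against the WebSocket endpoints, the socket URL usually has to be derived from the HTTP base URL of the server. The helper below is a small sketch of that conversion using the standard library. The function name websocket_url is hypothetical; only the endpoint paths (/api/v1/ws/execution and /api/v1/ws/chat) come from this README, and any query parameters or auth the server expects are not covered.

```python
from urllib.parse import urlsplit, urlunsplit

# Map each HTTP scheme to its WebSocket counterpart.
_WS_SCHEMES = {"http": "ws", "https": "wss"}


def websocket_url(base_url: str, path: str) -> str:
    """Build a ws:// or wss:// URL for `path` on the same host as `base_url`."""
    parts = urlsplit(base_url)
    scheme = _WS_SCHEMES.get(parts.scheme)
    if scheme is None:
        raise ValueError(f"expected an http(s) base URL, got {base_url!r}")
    return urlunsplit((scheme, parts.netloc, path, "", ""))
```

For a local server, websocket_url("http://localhost:8000", "/api/v1/ws/execution") yields the address a WebSocket client would connect to; the message format is described in the reference docs.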
Contributing
Contributions are welcome. Start with CONTRIBUTING.md, then use AGENTS.md for repo-specific commands and quality gates.
License
MIT License — see LICENSE.
Based on Recursive Language Modeling research by Alex L. Zhang (MIT CSAIL), Omar Khattab (Stanford), and Tim Kraska (MIT).