ReIN — harness your code. An open-source agentic coding runtime.
Quick Start · Architecture · Usage · Chinese docs (中文文档)
ReIN is an open-source agentic coding runtime that implements a complete harness architecture — the control plane that orchestrates LLM calls, tool execution, hook lifecycle, permission control, and plugin systems.
It supports both Anthropic Claude (cloud) and local LLMs (Ollama, LM Studio, llama.cpp, vLLM) for fully offline operation.
Why ReIN?
Rein (n.) — a strap fastened to a bit, used to guide a horse. In software, the harness that guides an AI agent: intercepting, evaluating, executing, and extending every action it takes.
Most agentic coding tools are closed-source black boxes. ReIN opens up the full runtime:
- See exactly how LLM tool calls are orchestrated
- Hook into every lifecycle event (PreToolUse, PostToolUse, Stop, etc.)
- Control permissions at 5 layers (admin → user → project → command → hook)
- Extend with plugins (commands, agents, skills, hooks, MCP)
- Run offline with any local model — no API key needed
Quick Start
Prerequisites
- Python 3.11+
- An LLM backend (choose one):
  - an Anthropic API key (cloud mode), or
  - a local LLM server: Ollama, LM Studio, llama.cpp, or vLLM (offline mode)
Install
git clone https://github.com/BDeMo/ReIN.git
cd ReIN
pip install -r requirements.txt
Run
# Cloud mode (Anthropic Claude)
export ANTHROPIC_API_KEY=sk-ant-xxx
python -m rein direct
# Fully offline (Ollama)
ollama pull qwen2.5-coder:7b
python -m rein direct --local
# Custom local server (LM Studio / llama.cpp / vLLM)
python -m rein direct --local --local-url http://localhost:1234/v1 --local-model my-model
Architecture
rein/
├── core/
│ ├── harness.py Core orchestrator — the heart of ReIN
│ ├── config.py Multi-layer settings hierarchy
│ └── conversation.py Session and message management
├── llm/
│ ├── provider.py Abstract LLM interface
│ ├── anthropic_llm.py Anthropic Claude (streaming + tool use)
│ └── local_llm.py Local LLM (Ollama / LM Studio / llama.cpp / vLLM)
├── tools/
│ ├── registry.py Tool registry and base class
│ ├── file_tools.py Read / Write / Edit
│ ├── bash_tool.py Bash with command filtering and security
│ └── search_tools.py Grep / Glob
├── hooks/
│ ├── engine.py Hook execution engine (command + prompt based)
│ └── types.py 9 lifecycle event types
├── permissions/
│ └── manager.py 5-layer permission model (allow / deny / ask)
├── plugins/
│ └── loader.py Plugin discovery and loading
├── server/
│ └── app.py FastAPI server with WebSocket streaming
├── client/
│ └── cli.py Terminal client (direct + server modes)
└── main.py CLI entry point
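To make the tool layer concrete, here is a rough sketch of what a registry like tools/registry.py might look like. All names below (Tool, ToolRegistry, EchoTool) are illustrative, not ReIN's actual API:

```python
# Illustrative sketch of a tool registry and base class; names are
# hypothetical and may differ from ReIN's tools/registry.py.
from abc import ABC, abstractmethod

class Tool(ABC):
    name: str = ""
    description: str = ""

    @abstractmethod
    def run(self, **kwargs) -> str: ...

    def schema(self) -> dict:
        # Minimal stub; a real registry would derive a full JSON schema
        # from the tool's parameters.
        return {"name": self.name, "description": self.description}

class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def get(self, name: str) -> Tool:
        return self._tools[name]

    def schemas(self) -> list[dict]:
        # Schemas for every registered tool, passed to the LLM provider.
        return [t.schema() for t in self._tools.values()]

class EchoTool(Tool):
    name = "Echo"
    description = "Return its input unchanged."

    def run(self, text: str = "") -> str:
        return text

registry = ToolRegistry()
registry.register(EchoTool())
```

The registry is what lets the harness advertise tool schemas to the LLM and dispatch tool calls by name.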
Harness Pipeline
Every tool call passes through the full harness pipeline:
User Input
→ [UserPromptSubmit Hook] Validate / preprocess
→ LLM streaming response Generate text + tool calls
→ Tool call detected
→ [PreToolUse Hook] Validate / modify / block
→ [Permission Check] 5-layer allow / deny / ask
→ Tool Execution Run the tool
→ [PostToolUse Hook] React / log / feedback
→ LLM continues Feed result back
→ [Stop Hook] Validate task completion
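The pipeline above can be sketched as a Python loop. This is a simplified, self-contained illustration with stub objects; the function and hook names mirror the diagram, not ReIN's actual internals in core/harness.py:

```python
from dataclasses import dataclass

# --- minimal stubs so the sketch is runnable ---
@dataclass
class ToolCall:
    name: str
    input: dict

class StubLLM:
    def __init__(self, calls):
        self.calls, self.fed = calls, []
    def stream(self, prompt):
        return self  # pretend the response object is the LLM itself
    @property
    def tool_calls(self):
        return self.calls
    def feed(self, result):
        self.fed.append(result)

# --- the harness pipeline, simplified ---
def run_turn(llm, tools, allow, fire_hook, user_input):
    prompt = fire_hook("UserPromptSubmit", user_input)   # validate / preprocess
    response = llm.stream(prompt)                        # text + tool calls
    outcomes = []
    for call in response.tool_calls:
        call = fire_hook("PreToolUse", call)             # may modify, or return None to block
        if call is None or not allow(call):              # 5-layer permission check
            outcomes.append(("blocked", call))
            continue
        result = tools[call.name](**call.input)          # execute the tool
        fire_hook("PostToolUse", result)                 # react / log / feedback
        llm.feed(result)                                 # feed the result back to the LLM
        outcomes.append(("ok", result))
    fire_hook("Stop", response)                          # verify task completion
    return outcomes
```

The key property is that no tool call reaches execution without first passing the PreToolUse hook and the permission check.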
Hook Events
| Event | When | Purpose |
|---|---|---|
| PreToolUse | Before tool execution | Validate, modify, or block |
| PostToolUse | After tool execution | React, log, feedback |
| Stop | Before the agent stops | Verify task completion |
| UserPromptSubmit | User sends a message | Input preprocessing |
| SessionStart / SessionEnd | Session lifecycle | Init / cleanup |
| PreCompact | Before context compaction | Preserve critical info |
| Notification | Any notification | Logging, monitoring |
| SubagentStop | Subagent completes | Validate subagent output |
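Command-based hooks are external scripts invoked at these events. The sketch below assumes a Claude Code-style protocol — the event arrives as JSON on stdin, and a nonzero exit code blocks the tool call — check hooks/engine.py for ReIN's exact contract:

```python
#!/usr/bin/env python3
"""Example PreToolUse command hook: block `rm -rf` in Bash commands.

Protocol assumed here (Claude Code convention, not verified against
ReIN's engine): the harness pipes the event as JSON to stdin and
treats exit code 2 as "block".
"""
import json
import sys

def decide(event: dict) -> bool:
    """Return True to allow the tool call, False to block it."""
    if event.get("tool_name") != "Bash":
        return True
    command = event.get("tool_input", {}).get("command", "")
    return "rm -rf" not in command

if __name__ == "__main__":
    event = json.load(sys.stdin)
    if not decide(event):
        print("Blocked: destructive command", file=sys.stderr)
        sys.exit(2)  # nonzero exit blocks the tool call
```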
Permission Layers
| Layer | Source | Scope |
|---|---|---|
| 1 | managed-settings.json | Enterprise admin (MDM-deployable) |
| 2 | ~/.claude/settings.json | User global preferences |
| 3 | .claude/settings.json | Project-level settings |
| 4 | YAML frontmatter | Command / Agent tool whitelist |
| 5 | PreToolUse hook | Runtime dynamic decisions |
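One plausible way to resolve a decision across the five layers is first-explicit-rule-wins, walking from the most authoritative layer down. This precedence is an assumption for illustration, not a transcription of permissions/manager.py:

```python
# Illustrative 5-layer permission resolution; the precedence order is
# an assumption, not ReIN's actual permissions/manager.py logic.
def resolve(tool_name: str, layers: list[dict]) -> str:
    """Walk layers from most authoritative (admin) to least; the first
    layer with an explicit rule for the tool decides. Default: "ask"."""
    for rules in layers:  # e.g. [admin, user, project, command, hook]
        decision = rules.get(tool_name)
        if decision in ("allow", "deny", "ask"):
            return decision
    return "ask"

layers = [
    {"Bash": "deny"},   # layer 1: managed-settings.json (admin)
    {"Bash": "allow"},  # layer 2: ~/.claude/settings.json (user)
    {},                 # layer 3: .claude/settings.json (project)
    {"Read": "allow"},  # layer 4: YAML frontmatter whitelist
    {},                 # layer 5: PreToolUse hook decisions
]
```

Under this scheme an admin deny can never be overridden by a user allow, and anything no layer mentions falls back to asking the user.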
Usage
Direct Mode (simplest)
# Anthropic Claude
python -m rein direct
# Local LLM (Ollama)
python -m rein direct --local --local-model qwen2.5-coder:7b
# Custom system prompt
python -m rein direct --system-prompt "You are a Python expert."
Server + Client
# Terminal 1: server
python -m rein server --port 8765
# Terminal 2: client
python -m rein client --url ws://localhost:8765/ws/chat
# Local LLM server
python -m rein server --local --local-model llama3.1:8b
API
| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Health check |
| /api/tools | GET | List tools with schemas |
| /api/settings | GET | Current settings |
| /api/chat | POST | Non-streaming chat |
| /ws/chat | WebSocket | Streaming chat (full harness) |
WebSocket Protocol
// Client → Server
{"type": "message", "content": "Read main.py", "system_prompt": "..."}
// Server → Client (streamed)
{"type": "text_delta", "data": {"text": "I'll read..."}}
{"type": "tool_use", "data": {"id": "...", "name": "Read", "input": {...}}}
{"type": "tool_result", "data": {"tool_use_id": "...", "result": "..."}}
{"type": "usage", "data": {"input_tokens": 150, "output_tokens": 80}}
{"type": "turn_complete", "data": {"stop_reason": "end_turn"}}
Local LLM
ReIN supports two tool-use modes:
| Mode | How | Models |
|---|---|---|
| Native | OpenAI tool_call format | qwen2.5, llama3.1, mistral, functionary |
| Prompt-based | Tool schemas injected into the prompt; output parsed for tool_call blocks | Any model |
The mode is auto-detected from the model name; force native mode with --native-tools.
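Both pieces — name-based detection and prompt-based parsing — can be sketched briefly. The hint list and regex are illustrative; ReIN's actual detection and parsing in llm/local_llm.py may differ:

```python
import json
import re

# Heuristic native-tool detection from the model name (illustrative).
NATIVE_HINTS = ("qwen2.5", "llama3.1", "mistral", "functionary")

def supports_native_tools(model: str) -> bool:
    return any(hint in model.lower() for hint in NATIVE_HINTS)

# Prompt-based mode: extract fenced tool_call blocks from model output.
TOOL_CALL_RE = re.compile(r"```tool_call\s*\n(.*?)```", re.DOTALL)

def parse_tool_calls(text: str) -> list[dict]:
    """Parse every JSON body inside a ```tool_call fenced block."""
    return [json.loads(body) for body in TOOL_CALL_RE.findall(text)]
```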
Compatible Servers
| Server | Default URL | Install |
|---|---|---|
| Ollama | http://localhost:11434/v1 | ollama serve |
| LM Studio | http://localhost:1234/v1 | GUI |
| llama.cpp | http://localhost:8080/v1 | ./llama-server -m model.gguf |
| vLLM | http://localhost:8000/v1 | vllm serve model |
| LocalAI | http://localhost:8080/v1 | Docker |
Recommended Models
| Model | Size | Tool Use | Notes |
|---|---|---|---|
| qwen2.5-coder:7b | 4.7 GB | Native | Best coding model at this size |
| qwen2.5-coder:1.5b | 1.0 GB | Native | Fast, lightweight |
| llama3.1:8b | 4.7 GB | Native | Strong general purpose |
| deepseek-coder-v2:16b | 9.0 GB | Prompt | Excellent at code |
| codellama:7b | 3.8 GB | Prompt | Meta's code model |
Environment Variables
| Variable | Description | Default |
|---|---|---|
| ANTHROPIC_API_KEY | Anthropic API key | — |
| CLAUDE_MODEL | Override model name | claude-sonnet-4-20250514 |
| ANTHROPIC_BASE_URL | Override API URL | — |
| LOCAL_LLM_URL | Local server URL | http://localhost:11434/v1 |
| LOCAL_LLM_MODEL | Local model name | qwen2.5-coder:7b |
Dependencies
| Package | Purpose |
|---|---|
| anthropic | Anthropic Claude API |
| httpx | Async HTTP for local LLMs |
| fastapi | API server |
| uvicorn | ASGI server |
| websockets | WebSocket client |
| pyyaml | YAML parsing |
Acknowledgements
ReIN is inspired by and built upon ideas from:
- Anthropic — the Claude Code open-source plugin ecosystem and harness architecture
- Ollama — making local LLMs accessible to everyone
- FastAPI — elegant async Python web framework
- llama.cpp — efficient local model inference
- OpenAI — the tool-calling API convention adopted by local LLM servers
License