
ReIN

Harness your code.

Quick Start · Architecture · Usage · Chinese Documentation


ReIN is an open-source agentic coding runtime that implements a complete harness architecture — the control plane that orchestrates LLM calls, tool execution, hook lifecycle, permission control, and plugin systems.

It supports both Anthropic Claude (cloud) and local LLMs (Ollama, LM Studio, llama.cpp, vLLM) for fully offline operation.

Why ReIN?

Rein (n.) — a strap fastened to a bit, used to guide a horse. In software, the harness that guides an AI agent: intercepting, evaluating, executing, and extending every action it takes.

Most agentic coding tools are closed-source black boxes. ReIN opens up the full runtime:

  • See exactly how LLM tool calls are orchestrated
  • Hook into every lifecycle event (PreToolUse, PostToolUse, Stop, etc.)
  • Control permissions at 5 layers (admin → user → project → command → hook)
  • Extend with plugins (commands, agents, skills, hooks, MCP)
  • Run offline with any local model — no API key needed

Quick Start

Prerequisites

  • Python 3.11+
  • An LLM backend (choose one):
    • Anthropic API key, or
    • Ollama / LM Studio / any OpenAI-compatible local server

Install

git clone https://github.com/BDeMo/ReIN.git
cd ReIN
pip install -r requirements.txt

Run

# Cloud mode (Anthropic Claude)
export ANTHROPIC_API_KEY=sk-ant-xxx
python -m rein direct

# Fully offline (Ollama)
ollama pull qwen2.5-coder:7b
python -m rein direct --local

# Custom local server (LM Studio / llama.cpp / vLLM)
python -m rein direct --local --local-url http://localhost:1234/v1 --local-model my-model

Architecture

rein/
├── core/
│   ├── harness.py          Core orchestrator — the heart of ReIN
│   ├── config.py           Multi-layer settings hierarchy
│   └── conversation.py     Session and message management
├── llm/
│   ├── provider.py         Abstract LLM interface
│   ├── anthropic_llm.py    Anthropic Claude (streaming + tool use)
│   └── local_llm.py        Local LLM (Ollama / LM Studio / llama.cpp / vLLM)
├── tools/
│   ├── registry.py         Tool registry and base class
│   ├── file_tools.py       Read / Write / Edit
│   ├── bash_tool.py        Bash with command filtering and security
│   └── search_tools.py     Grep / Glob
├── hooks/
│   ├── engine.py           Hook execution engine (command + prompt based)
│   └── types.py            9 lifecycle event types
├── permissions/
│   └── manager.py          5-layer permission model (allow / deny / ask)
├── plugins/
│   └── loader.py           Plugin discovery and loading
├── server/
│   └── app.py              FastAPI server with WebSocket streaming
├── client/
│   └── cli.py              Terminal client (direct + server modes)
└── main.py                 CLI entry point
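
The tool layer above can be illustrated with a minimal registry sketch. All names here (`TOOLS`, `register`, `dispatch`, `ReadTool`) are hypothetical stand-ins, not the actual API in `registry.py`:

```python
# Hypothetical sketch of the registry pattern in tools/registry.py;
# ReIN's real base class and registration API may differ.
TOOLS = {}


def register(cls):
    """Class decorator: add a tool instance to the global registry by name."""
    TOOLS[cls.name] = cls()
    return cls


@register
class ReadTool:
    name = "Read"
    schema = {"path": "string"}  # JSON-schema-like input description

    def run(self, path):
        with open(path, encoding="utf-8") as f:
            return f.read()


def dispatch(name, **kwargs):
    """Look up a tool by name and execute it with the given input."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name].run(**kwargs)
```

A registry like this is what lets the harness expose tool schemas to the LLM and route tool calls back by name.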

Harness Pipeline

Every tool call passes through the full harness pipeline:

User Input
  → [UserPromptSubmit Hook]      Validate / preprocess
  → LLM streaming response       Generate text + tool calls
  → Tool call detected
    → [PreToolUse Hook]          Validate / modify / block
    → [Permission Check]         5-layer allow / deny / ask
    → Tool Execution             Run the tool
    → [PostToolUse Hook]         React / log / feedback
  → LLM continues                Feed result back
  → [Stop Hook]                  Validate task completion
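
The loop above can be sketched in a few lines of Python. This is a self-contained illustration with stand-in components; none of these class or function names are ReIN's actual API:

```python
# Minimal sketch of the harness pipeline, with stand-in components so it
# runs on its own. All names here are illustrative, not ReIN's actual API.


class Hooks:
    """Fires lifecycle events; a real engine would run command/prompt hooks."""

    def fire(self, event, payload):
        return payload  # pass-through: no hook modifies or blocks anything


class Permissions:
    """The 5-layer check collapsed to a single rule for the sketch."""

    def decide(self, call):
        return "deny" if call["name"] == "Bash" else "allow"


def run_turn(llm_respond, run_tool, hooks, permissions, user_input):
    prompt = hooks.fire("UserPromptSubmit", user_input)
    while True:
        text, tool_calls = llm_respond(prompt)  # text + tool calls
        if not tool_calls:
            hooks.fire("Stop", text)            # verify task completion
            return text
        results = []
        for call in tool_calls:
            call = hooks.fire("PreToolUse", call)     # validate / modify / block
            if permissions.decide(call) == "deny":
                results.append("permission denied")
                continue
            result = run_tool(call)                   # execute the tool
            hooks.fire("PostToolUse", result)         # react / log / feedback
            results.append(result)
        prompt = results                              # feed results back to the LLM
```

The key property of the pipeline is that every tool call, without exception, crosses the hook and permission boundary before executing.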

Hook Events

Event                      When                       Purpose
PreToolUse                 Before tool execution      Validate, modify, or block
PostToolUse                After tool execution       React, log, feedback
Stop                       Before agent stops         Verify task completion
UserPromptSubmit           User sends a message       Input preprocessing
SessionStart / SessionEnd  Session lifecycle          Init / cleanup
PreCompact                 Before context compaction  Preserve critical info
Notification               Any notification           Logging, monitoring
SubagentStop               Subagent completes         Validate subagent output
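
As one illustration, a command hook could be declared in settings. This fragment assumes ReIN follows the Claude Code hooks convention (`matcher` plus a list of `command` hooks); the script path is hypothetical:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {"type": "command", "command": "python scripts/check_cmd.py"}
        ]
      }
    ]
  }
}
```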

Permission Layers

Layer 1  managed-settings.json       Enterprise admin (MDM deployable)
Layer 2  ~/.claude/settings.json     User global preferences
Layer 3  .claude/settings.json       Project-level settings
Layer 4  YAML frontmatter            Command / Agent tool whitelist
Layer 5  PreToolUse Hook             Runtime dynamic decisions
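
For example, a project-level `.claude/settings.json` (layer 3) might declare per-tool rules. The field names here follow the Claude Code permissions convention and are an assumption about ReIN's schema:

```json
{
  "permissions": {
    "allow": ["Read", "Grep", "Glob"],
    "ask": ["Write", "Edit"],
    "deny": ["Bash(rm:*)"]
  }
}
```

Lower layers can only be overridden in the direction the higher layers permit, so an admin-level deny (layer 1) always wins.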

Usage

Direct Mode (simplest)

# Anthropic Claude
python -m rein direct

# Local LLM (Ollama)
python -m rein direct --local --local-model qwen2.5-coder:7b

# Custom system prompt
python -m rein direct --system-prompt "You are a Python expert."

Server + Client

# Terminal 1: server
python -m rein server --port 8765

# Terminal 2: client
python -m rein client --url ws://localhost:8765/ws/chat

# Local LLM server
python -m rein server --local --local-model llama3.1:8b

API

Endpoint       Method     Description
/health        GET        Health check
/api/tools     GET        List tools with schemas
/api/settings  GET        Current settings
/api/chat      POST       Non-streaming chat
/ws/chat       WebSocket  Streaming chat (full harness)

WebSocket Protocol

// Client → Server
{"type": "message", "content": "Read main.py", "system_prompt": "..."}

// Server → Client (streamed)
{"type": "text_delta",    "data": {"text": "I'll read..."}}
{"type": "tool_use",      "data": {"id": "...", "name": "Read", "input": {...}}}
{"type": "tool_result",   "data": {"tool_use_id": "...", "result": "..."}}
{"type": "usage",         "data": {"input_tokens": 150, "output_tokens": 80}}
{"type": "turn_complete", "data": {"stop_reason": "end_turn"}}
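
A minimal client for this protocol might look like the sketch below, using the `websockets` package listed under Dependencies. The event formatting follows the message shapes shown above; the rest is an assumption about how you would wire it up:

```python
import asyncio
import json

try:
    import websockets  # WebSocket client dependency listed under Dependencies
except ImportError:    # keep the sketch importable even without the package
    websockets = None


def render_event(event: dict) -> str:
    """Format one server event (shapes from the protocol above) as printable text."""
    kind, data = event["type"], event.get("data", {})
    if kind == "text_delta":
        return data["text"]
    if kind == "tool_use":
        return f"[tool_use: {data['name']}]"
    if kind == "tool_result":
        return f"[tool_result: {data['tool_use_id']}]"
    if kind == "usage":
        return f"[tokens: {data['input_tokens']} in / {data['output_tokens']} out]"
    return f"[{kind}]"


async def chat(url: str, prompt: str) -> None:
    """Send one message and print the streamed events until the turn ends."""
    async with websockets.connect(url) as ws:
        await ws.send(json.dumps({"type": "message", "content": prompt}))
        async for raw in ws:
            event = json.loads(raw)
            print(render_event(event), end="", flush=True)
            if event["type"] == "turn_complete":
                break
```

Run it with `asyncio.run(chat("ws://localhost:8765/ws/chat", "Read main.py"))` against a server started with `python -m rein server --port 8765`.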

Local LLM

ReIN supports two tool-use modes:

Mode          How                                                Models
Native        OpenAI tool_call format                            qwen2.5, llama3.1, mistral, functionary
Prompt-based  Schemas in the prompt; parses ```tool_call blocks  Any model

The mode is auto-detected from the model name; force native mode with --native-tools.
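
For prompt-based mode, the parsing step can be sketched as follows. The regex and function name are illustrative, not ReIN's actual implementation:

```python
import json
import re

# Illustrative parser for prompt-based mode: pull ```tool_call fenced JSON
# blocks out of raw model output. ReIN's actual parser may differ.
TOOL_CALL_RE = re.compile(r"```tool_call\s*\n(.*?)\n```", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed tool call embedded in the model's text."""
    calls = []
    for block in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(block))
        except json.JSONDecodeError:
            pass  # skip malformed blocks rather than failing the turn
    return calls
```

This is why prompt-based mode works with any model: it only requires the model to emit a fenced JSON block, not a structured tool-call API.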

Compatible Servers

Server     Default URL                Install
Ollama     http://localhost:11434/v1  ollama serve
LM Studio  http://localhost:1234/v1   GUI
llama.cpp  http://localhost:8080/v1   ./llama-server -m model.gguf
vLLM       http://localhost:8000/v1   vllm serve model
LocalAI    http://localhost:8080/v1   Docker

Recommended Models

Model                  Size    Tool Use  Notes
qwen2.5-coder:7b       4.7 GB  Native    Best coding model at this size
qwen2.5-coder:1.5b     1.0 GB  Native    Fast, lightweight
llama3.1:8b            4.7 GB  Native    Strong general purpose
deepseek-coder-v2:16b  9.0 GB  Prompt    Excellent at code
codellama:7b           3.8 GB  Prompt    Meta's code model

Environment Variables

Variable            Description          Default
ANTHROPIC_API_KEY   Anthropic API key    (unset)
CLAUDE_MODEL        Override model name  claude-sonnet-4-20250514
ANTHROPIC_BASE_URL  Override API URL     (unset)
LOCAL_LLM_URL       Local server URL     http://localhost:11434/v1
LOCAL_LLM_MODEL     Local model name     qwen2.5-coder:7b

Dependencies

Package     Purpose
anthropic   Anthropic Claude API
httpx       Async HTTP for local LLMs
fastapi     API server
uvicorn     ASGI server
websockets  WebSocket client
pyyaml      YAML parsing

Acknowledgements

ReIN is inspired by and built upon ideas from:

  • Anthropic — the Claude Code open-source plugin ecosystem and harness architecture
  • Ollama — making local LLMs accessible to everyone
  • FastAPI — elegant async Python web framework
  • llama.cpp — efficient local model inference
  • OpenAI — the tool-calling API convention adopted by local LLM servers

License

MIT
