
ReIN

Harness your code.

Quick Start · Architecture · Usage · Chinese Documentation


ReIN is an open-source agentic coding runtime that implements a complete harness architecture — the control plane that orchestrates LLM calls, tool execution, hook lifecycle, permission control, and plugin systems.

It supports both Anthropic Claude (cloud) and local LLMs (Ollama, LM Studio, llama.cpp, vLLM) for fully offline operation.

Why ReIN?

Rein (n.) — a strap fastened to a bit, used to guide a horse. In software, the harness that guides an AI agent: intercepting, evaluating, executing, and extending every action it takes.

Most agentic coding tools are closed-source black boxes. ReIN opens up the full runtime:

  • See exactly how LLM tool calls are orchestrated
  • Hook into every lifecycle event (PreToolUse, PostToolUse, Stop, etc.)
  • Control permissions at 5 layers (admin → user → project → command → hook)
  • Extend with plugins (commands, agents, skills, hooks, MCP)
  • Run offline with any local model — no API key needed

Quick Start

Prerequisites

  • Python 3.11+
  • An LLM backend (choose one):
    • Anthropic API key, or
    • Ollama / LM Studio / any OpenAI-compatible local server

Install

git clone https://github.com/BDeMo/ReIN.git
cd ReIN
pip install -r requirements.txt

Run

# Cloud mode (Anthropic Claude)
export ANTHROPIC_API_KEY=sk-ant-xxx
python -m rein direct

# Fully offline (Ollama)
ollama pull qwen2.5-coder:7b
python -m rein direct --local

# Custom local server (LM Studio / llama.cpp / vLLM)
python -m rein direct --local --local-url http://localhost:1234/v1 --local-model my-model

Architecture

rein/
├── core/
│   ├── harness.py          Core orchestrator — the heart of ReIN
│   ├── config.py           Multi-layer settings hierarchy
│   └── conversation.py     Session and message management
├── llm/
│   ├── provider.py         Abstract LLM interface
│   ├── anthropic_llm.py    Anthropic Claude (streaming + tool use)
│   └── local_llm.py        Local LLM (Ollama / LM Studio / llama.cpp / vLLM)
├── tools/
│   ├── registry.py         Tool registry and base class
│   ├── file_tools.py       Read / Write / Edit
│   ├── bash_tool.py        Bash with command filtering and security
│   └── search_tools.py     Grep / Glob
├── hooks/
│   ├── engine.py           Hook execution engine (command + prompt based)
│   └── types.py            9 lifecycle event types
├── permissions/
│   └── manager.py          5-layer permission model (allow / deny / ask)
├── plugins/
│   └── loader.py           Plugin discovery and loading
├── server/
│   └── app.py              FastAPI server with WebSocket streaming
├── client/
│   └── cli.py              Terminal client (direct + server modes)
└── main.py                 CLI entry point
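
The tool layer above can be illustrated with a minimal registry sketch. All names here (`TOOLS`, `register`, `dispatch`, `ReadTool`) are hypothetical stand-ins, not the actual API in `registry.py`:

```python
# Hypothetical sketch of the registry pattern in tools/registry.py;
# ReIN's real base class and registration API may differ.
TOOLS = {}


def register(cls):
    """Class decorator: add a tool instance to the global registry by name."""
    TOOLS[cls.name] = cls()
    return cls


@register
class ReadTool:
    name = "Read"
    schema = {"path": "string"}  # JSON-schema-like input description

    def run(self, path):
        with open(path, encoding="utf-8") as f:
            return f.read()


def dispatch(name, **kwargs):
    """Look up a tool by name and execute it with the given input."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name].run(**kwargs)
```

A registry like this is what lets the harness expose tool schemas to the LLM and route tool calls back by name.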

Harness Pipeline

Every tool call passes through the full harness pipeline:

User Input
  → [UserPromptSubmit Hook]      Validate / preprocess
  → LLM streaming response       Generate text + tool calls
  → Tool call detected
    → [PreToolUse Hook]          Validate / modify / block
    → [Permission Check]         5-layer allow / deny / ask
    → Tool Execution             Run the tool
    → [PostToolUse Hook]         React / log / feedback
  → LLM continues                Feed result back
  → [Stop Hook]                  Validate task completion
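
The loop above can be sketched in a few lines of Python. This is a self-contained illustration with stand-in components; none of these class or function names are ReIN's actual API:

```python
# Minimal sketch of the harness pipeline, with stand-in components so it
# runs on its own. All names here are illustrative, not ReIN's actual API.


class Hooks:
    """Fires lifecycle events; a real engine would run command/prompt hooks."""

    def fire(self, event, payload):
        return payload  # pass-through: no hook modifies or blocks anything


class Permissions:
    """The 5-layer check collapsed to a single rule for the sketch."""

    def decide(self, call):
        return "deny" if call["name"] == "Bash" else "allow"


def run_turn(llm_respond, run_tool, hooks, permissions, user_input):
    prompt = hooks.fire("UserPromptSubmit", user_input)
    while True:
        text, tool_calls = llm_respond(prompt)  # text + tool calls
        if not tool_calls:
            hooks.fire("Stop", text)            # verify task completion
            return text
        results = []
        for call in tool_calls:
            call = hooks.fire("PreToolUse", call)     # validate / modify / block
            if permissions.decide(call) == "deny":
                results.append("permission denied")
                continue
            result = run_tool(call)                   # execute the tool
            hooks.fire("PostToolUse", result)         # react / log / feedback
            results.append(result)
        prompt = results                              # feed results back to the LLM
```

The key property of the pipeline is that every tool call, without exception, crosses the hook and permission boundary before executing.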

Hook Events

Event                      When                       Purpose
PreToolUse                 Before tool execution      Validate, modify, or block
PostToolUse                After tool execution       React, log, feedback
Stop                       Before agent stops         Verify task completion
UserPromptSubmit           User sends a message       Input preprocessing
SessionStart / SessionEnd  Session lifecycle          Init / cleanup
PreCompact                 Before context compaction  Preserve critical info
Notification               Any notification           Logging, monitoring
SubagentStop               Subagent completes         Validate subagent output
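
As one illustration, a command hook could be declared in settings. This fragment assumes ReIN follows the Claude Code hooks convention (`matcher` plus a list of `command` hooks); the script path is hypothetical:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {"type": "command", "command": "python scripts/check_cmd.py"}
        ]
      }
    ]
  }
}
```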

Permission Layers

Layer 1  managed-settings.json       Enterprise admin (MDM deployable)
Layer 2  ~/.claude/settings.json     User global preferences
Layer 3  .claude/settings.json       Project-level settings
Layer 4  YAML frontmatter            Command / Agent tool whitelist
Layer 5  PreToolUse Hook             Runtime dynamic decisions
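
For example, a project-level `.claude/settings.json` (layer 3) might declare per-tool rules. The field names here follow the Claude Code permissions convention and are an assumption about ReIN's schema:

```json
{
  "permissions": {
    "allow": ["Read", "Grep", "Glob"],
    "ask": ["Write", "Edit"],
    "deny": ["Bash(rm:*)"]
  }
}
```

Lower layers can only be overridden in the direction the higher layers permit, so an admin-level deny (layer 1) always wins.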

Usage

Direct Mode (simplest)

# Anthropic Claude
python -m rein direct

# Local LLM (Ollama)
python -m rein direct --local --local-model qwen2.5-coder:7b

# Custom system prompt
python -m rein direct --system-prompt "You are a Python expert."

Server + Client

# Terminal 1: server
python -m rein server --port 8765

# Terminal 2: client
python -m rein client --url ws://localhost:8765/ws/chat

# Local LLM server
python -m rein server --local --local-model llama3.1:8b

API

Endpoint       Method     Description
/health        GET        Health check
/api/tools     GET        List tools with schemas
/api/settings  GET        Current settings
/api/chat      POST       Non-streaming chat
/ws/chat       WebSocket  Streaming chat (full harness)

WebSocket Protocol

// Client → Server
{"type": "message", "content": "Read main.py", "system_prompt": "..."}

// Server → Client (streamed)
{"type": "text_delta",    "data": {"text": "I'll read..."}}
{"type": "tool_use",      "data": {"id": "...", "name": "Read", "input": {...}}}
{"type": "tool_result",   "data": {"tool_use_id": "...", "result": "..."}}
{"type": "usage",         "data": {"input_tokens": 150, "output_tokens": 80}}
{"type": "turn_complete", "data": {"stop_reason": "end_turn"}}
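
A minimal client for this protocol might look like the sketch below, using the `websockets` package listed under Dependencies. The event formatting follows the message shapes shown above; the rest is an assumption about how you would wire it up:

```python
import asyncio
import json

try:
    import websockets  # WebSocket client dependency listed under Dependencies
except ImportError:    # keep the sketch importable even without the package
    websockets = None


def render_event(event: dict) -> str:
    """Format one server event (shapes from the protocol above) as printable text."""
    kind, data = event["type"], event.get("data", {})
    if kind == "text_delta":
        return data["text"]
    if kind == "tool_use":
        return f"[tool_use: {data['name']}]"
    if kind == "tool_result":
        return f"[tool_result: {data['tool_use_id']}]"
    if kind == "usage":
        return f"[tokens: {data['input_tokens']} in / {data['output_tokens']} out]"
    return f"[{kind}]"


async def chat(url: str, prompt: str) -> None:
    """Send one message and print the streamed events until the turn ends."""
    async with websockets.connect(url) as ws:
        await ws.send(json.dumps({"type": "message", "content": prompt}))
        async for raw in ws:
            event = json.loads(raw)
            print(render_event(event), end="", flush=True)
            if event["type"] == "turn_complete":
                break
```

Run it with `asyncio.run(chat("ws://localhost:8765/ws/chat", "Read main.py"))` against a server started with `python -m rein server --port 8765`.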

Local LLM

ReIN supports two tool-use modes:

Mode          How                                                Models
Native        OpenAI tool_call format                            qwen2.5, llama3.1, mistral, functionary
Prompt-based  Schemas in the prompt; parses ```tool_call blocks  Any model

The mode is auto-detected from the model name; force native mode with --native-tools.
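
For prompt-based mode, the parsing step can be sketched as follows. The regex and function name are illustrative, not ReIN's actual implementation:

```python
import json
import re

# Illustrative parser for prompt-based mode: pull ```tool_call fenced JSON
# blocks out of raw model output. ReIN's actual parser may differ.
TOOL_CALL_RE = re.compile(r"```tool_call\s*\n(.*?)\n```", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed tool call embedded in the model's text."""
    calls = []
    for block in TOOL_CALL_RE.findall(text):
        try:
            calls.append(json.loads(block))
        except json.JSONDecodeError:
            pass  # skip malformed blocks rather than failing the turn
    return calls
```

This is why prompt-based mode works with any model: it only requires the model to emit a fenced JSON block, not a structured tool-call API.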

Compatible Servers

Server     Default URL                Install
Ollama     http://localhost:11434/v1  ollama serve
LM Studio  http://localhost:1234/v1   GUI
llama.cpp  http://localhost:8080/v1   ./llama-server -m model.gguf
vLLM       http://localhost:8000/v1   vllm serve model
LocalAI    http://localhost:8080/v1   Docker

Recommended Models

Model                  Size    Tool Use  Notes
qwen2.5-coder:7b       4.7 GB  Native    Best coding model at this size
qwen2.5-coder:1.5b     1.0 GB  Native    Fast, lightweight
llama3.1:8b            4.7 GB  Native    Strong general purpose
deepseek-coder-v2:16b  9.0 GB  Prompt    Excellent at code
codellama:7b           3.8 GB  Prompt    Meta's code model

Environment Variables

Variable            Description          Default
ANTHROPIC_API_KEY   Anthropic API key    (unset)
CLAUDE_MODEL        Override model name  claude-sonnet-4-20250514
ANTHROPIC_BASE_URL  Override API URL     (unset)
LOCAL_LLM_URL       Local server URL     http://localhost:11434/v1
LOCAL_LLM_MODEL     Local model name     qwen2.5-coder:7b

Dependencies

Package     Purpose
anthropic   Anthropic Claude API
httpx       Async HTTP for local LLMs
fastapi     API server
uvicorn     ASGI server
websockets  WebSocket client
pyyaml      YAML parsing

Acknowledgements

ReIN is inspired by and built upon ideas from:

  • Anthropic — the Claude Code open-source plugin ecosystem and harness architecture
  • Ollama — making local LLMs accessible to everyone
  • FastAPI — elegant async Python web framework
  • llama.cpp — efficient local model inference
  • OpenAI — the tool-calling API convention adopted by local LLM servers

License

MIT
