
Git-native AI agent co-harness — MCP server, static guards, and agentic evaluator for LLM-assisted coding


GitReins

Git-Native Agent Co-Harness — static guards + agentic evaluator for AI-assisted code


GitReins lives inside your git repository as a quality harness. It provides MCP tools for task lifecycle management, an agentic evaluator that judges code completeness against task definitions, and git hooks that ensure nothing bypasses the quality gates.

v0.7.0 — Verdict persistence, task dependencies, smart init, diff/full test modes, ~410 tests pass.


Quick Start

pip install gitreins
cd /path/to/your-project
gitreins install        # creates .gitreins/config.yaml + pre-commit hook
gitreins init           # smart init — detects language, size, optimal config

How It Works

  1. Create tasks — Define criteria via CLI or MCP tools
  2. Work with your AI agent — Claude, Hermes, Codex, or Pi generates the code
  3. Complete tasks — gitreins task complete <id> triggers automatic evaluation
  4. Tier 1: Static guards — secrets, build, lint, tests (configurable)
  5. Tier 2: Agentic evaluator — LLM loop reads files, runs tests, delivers per-criterion PASS/FAIL
  6. Verdicts persisted — stored in .gitreins/history/, browsable via gitreins report
  7. Commit through harness — pre-commit hook runs guards, blocks if checks fail
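
The two-tier flow above can be sketched roughly as follows. This is an illustrative outline, not GitReins' actual code: `run_pipeline`, the guard callables, and the `evaluate` callback are hypothetical names, and the sketch assumes Tier 2 is skipped whenever a Tier 1 guard fails, which the steps above do not state explicitly.

```python
from typing import Callable, Optional

def run_pipeline(
    guards: dict[str, Callable[[], bool]],
    evaluate: Callable[[], dict[str, bool]],
) -> dict:
    """Run cheap static guards first, then the per-criterion evaluator."""
    # Tier 1: every guard returns True (pass) or False (fail).
    tier1 = {name: check() for name, check in guards.items()}
    if not all(tier1.values()):
        # Assumption: a failing guard short-circuits the LLM evaluation.
        return {"tier1": tier1, "tier2": None, "passed": False}

    # Tier 2: one PASS/FAIL entry per task criterion.
    tier2: Optional[dict[str, bool]] = evaluate()
    return {"tier1": tier1, "tier2": tier2, "passed": all(tier2.values())}


if __name__ == "__main__":
    result = run_pipeline(
        {"secrets": lambda: True, "lint": lambda: True},
        lambda: {"POST /api/users creates a user": True},
    )
    print(result["passed"])
```

Running the inexpensive checks first means a secrets or lint failure never costs an LLM call.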

Commands

gitreins install                      # Install hooks + config
gitreins init                         # Smart init (language, size, optimal config)
gitreins guard                        # Run Tier 1 static checks
gitreins report [-n N] [--interactive]  # Browse verdict history
gitreins task create <id> <title> [criteria...] [--depends-on ...]
gitreins task start <id>
gitreins task complete <id> [--force]
gitreins task list [--status pending|in_progress|complete]
gitreins task delete <id>
gitreins judge <id>                   # Evaluate a task
gitreins commit <message>             # Commit with guard checks
gitreins mcp-server                   # Run MCP stdio server (for AI agents)

Test Modes: full vs diff

GitReins supports two strategies for when tests run on commit, controlled by test_mode in .gitreins/config.yaml.

test_mode: "full" (default for new projects)

The entire test suite runs on every commit. Safe and thorough.

Best for:

  • New projects with a small, fast test suite
  • Projects where all tests pass reliably
  • When you want maximum safety on every commit

Tradeoff: Slow on large projects. Pre-existing failures in untouched code block unrelated commits.

guards:
  test_mode: "full"

test_mode: "diff" (recommended for mature projects)

Runs only the tests for the files you actually changed. Source files are mapped to test files by basename:

Changed file               Test run
engine/guard_manager.py    tests/test_guard_manager.py
gitreins/cli.py            tests/test_cli.py
gitreins_mcp/server.py     tests/test_mcp_server.py
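
A minimal sketch of a plain basename mapping, assuming tests live in a `tests/` directory and follow the `test_<basename>.py` convention. The function name `map_to_test_file` is hypothetical. This simple rule reproduces the first two rows of the table; the third row (`server.py` to `test_mcp_server.py`) suggests the real mapping has extra logic that this sketch does not attempt to model.

```python
from pathlib import PurePosixPath

def map_to_test_file(changed_path: str, tests_dir: str = "tests") -> str:
    """Map a source path to a conventional test path using only its basename."""
    stem = PurePosixPath(changed_path).stem  # "engine/guard_manager.py" -> "guard_manager"
    return f"{tests_dir}/test_{stem}.py"


if __name__ == "__main__":
    print(map_to_test_file("engine/guard_manager.py"))  # tests/test_guard_manager.py
    print(map_to_test_file("gitreins/cli.py"))          # tests/test_cli.py
```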

Best for:

  • Projects with 5+ packages where full suite is slow
  • Projects with pre-existing test failures in untouched code
  • When you want fast feedback on the code you actually changed

Safety nets — diff mode falls back to full suite when:

  • pyproject.toml, .gitreins/config.yaml, Makefile, or setup.cfg changed
  • A test file itself changed (always included, plus its source-mapped siblings)
  • Changed files don't map to any known test files (unknown file = safety)
  • No staged files at all
  • Test command isn't pytest (custom runners can't be narrowed)
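
The fallback rules above could be expressed along these lines. This is a sketch under stated assumptions rather than the project's implementation: `select_tests` and `CONFIG_FILES` are hypothetical names, the mapping is the plain `tests/test_<basename>.py` convention, and a changed test file is interpreted as being added to the targeted set. A return value of `None` stands for "run the full suite".

```python
from pathlib import PurePosixPath
from typing import Optional

# Files whose modification forces the full suite (taken from the list above).
CONFIG_FILES = {"pyproject.toml", ".gitreins/config.yaml", "Makefile", "setup.cfg"}

def select_tests(
    staged: list[str],
    known_tests: set[str],
    test_command: str = "pytest -x --tb=short",
) -> Optional[list[str]]:
    """Return the test files to run, or None to fall back to the full suite."""
    if not staged:
        return None  # nothing staged
    if test_command.split()[:1] != ["pytest"]:
        return None  # custom runners cannot be narrowed
    selected: set[str] = set()
    for path in staged:
        if path in CONFIG_FILES:
            return None  # config change
        name = PurePosixPath(path).name
        if name.startswith("test_") and name.endswith(".py"):
            selected.add(path)  # assumption: a changed test file is run directly
            continue
        candidate = f"tests/test_{PurePosixPath(path).stem}.py"
        if candidate not in known_tests:
            return None  # unknown file, so play it safe
        selected.add(candidate)
    return sorted(selected)


if __name__ == "__main__":
    known = {"tests/test_cli.py", "tests/test_guard_manager.py"}
    print(select_tests(["gitreins/cli.py"], known))
    print(select_tests(["pyproject.toml", "gitreins/cli.py"], known))
```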

Tradeoff: Less safety on cross-cutting changes. Config changes always trigger full suite.

guards:
  test_mode: "diff"

Which mode should I use?

Project state                          Recommended mode
Brand new, <5 packages                 full
Mature, 5+ packages, tests pass        diff
Mature, pre-existing test failures     diff
Refactoring across packages            full (temporarily)
CI / PR checks                         full (safety over speed)

Output examples

Full mode:

Tier 1 Guards: PASS  (test mode: full)
  ✓ secrets — clean
  ✓ lint — ok
  ✓ tests — passed

Diff mode (targeted):

Tier 1 Guards: PASS  (test mode: diff, 3 test file(s))
  ✓ secrets — clean
  ✓ tests — passed

Diff mode (safety trigger — full suite):

Tier 1 Guards: PASS  (test mode: diff, full suite — safety trigger)
  ✓ secrets — clean
  ✓ tests — passed

Verdict History

Every gitreins task complete and gitreins judge saves a verdict to .gitreins/history/. Configure in .gitreins/config.yaml:

history:
  enabled: true              # false = don't save verdicts
  storage: "git"             # "git" = auto-commit to gitreins branch
                             # "filesystem" = write files only, no git commits
  max_verdicts: 1000         # auto-prune old entries
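
To illustrate what auto-pruning with `max_verdicts` might look like, here is a hedged sketch. It assumes one JSON file per verdict with names that sort chronologically. The on-disk layout of `.gitreins/history/` is not documented here, so both that layout and the `prune_verdicts` helper are hypothetical.

```python
from pathlib import Path

def prune_verdicts(history_dir: Path, max_verdicts: int) -> list[Path]:
    """Delete the oldest verdict files so at most max_verdicts remain."""
    # Assumption: file names sort oldest-first (e.g. timestamp-prefixed).
    files = sorted(history_dir.glob("*.json"))
    excess = files[: max(0, len(files) - max_verdicts)]
    for old in excess:
        old.unlink()
    return excess


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        history = Path(tmp)
        for day in range(1, 6):
            (history / f"2024-01-0{day}.json").write_text("{}")
        removed = prune_verdicts(history, max_verdicts=3)
        print([p.name for p in removed])
```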

Browse history:

gitreins report              # last 10 evaluations
gitreins report -n 20        # last 20
gitreins report --interactive  # TUI with arrow-key navigation (requires textual)

Task Dependencies

Tasks can depend on other tasks. Evaluation is blocked until dependencies pass:

gitreins task create build "Project builds" \
  "CGO_ENABLED=0 go build ./cmd/server exits 0"

gitreins task create api-crud "CRUD endpoints" --depends-on build \
  "POST /api/users creates a user" \
  "GET /api/users lists users"

gitreins task complete api-crud
# → "Cannot complete 'api-crud' — depends on: build"

gitreins task complete build      # complete the dependency first
gitreins task complete api-crud   # now this works

# Or force-skip dependency checks:
gitreins task complete api-crud --force
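
The dependency gate shown above can be modelled in a few lines. This is a sketch, not the project's code: the `Task` dataclass and both helper functions are hypothetical, and the status values are borrowed from the `--status` options of `gitreins task list`.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    id: str
    status: str = "pending"  # pending | in_progress | complete
    depends_on: list[str] = field(default_factory=list)

def blocking_dependencies(tasks: dict[str, Task], task_id: str) -> list[str]:
    """Return the dependencies of task_id that are not yet complete."""
    return [dep for dep in tasks[task_id].depends_on if tasks[dep].status != "complete"]

def complete_task(tasks: dict[str, Task], task_id: str, force: bool = False) -> bool:
    """Mark a task complete unless unfinished dependencies block it."""
    if blocking_dependencies(tasks, task_id) and not force:
        return False  # blocked, mirroring the "Cannot complete" message above
    tasks[task_id].status = "complete"
    return True


if __name__ == "__main__":
    tasks = {
        "build": Task("build"),
        "api-crud": Task("api-crud", depends_on=["build"]),
    }
    print(complete_task(tasks, "api-crud"))  # blocked by "build"
    print(complete_task(tasks, "build"))
    print(complete_task(tasks, "api-crud"))
```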

Configuration

Full .gitreins/config.yaml reference:

# ── Global defaults ──────────────────────────────────
defaults:
  model: deepseek-v4-flash
  max_iterations: 100
  check_for_updates: true

# ── Tier 1 guards ────────────────────────────────────
guards:
  secrets: true
  lint: true
  tests: true
  test_mode: "full"          # "full" or "diff"
  test_command: "pytest -x --tb=short"

  # Go projects (auto-detected via go.mod):
  go:
    build: true
    lint: true
    tests: true

# ── Tier 2 evaluator caps ────────────────────────────
evaluator:
  max_iterations: 25         # LLM reasoning turns
  max_time: "5m"             # wall clock cap
  max_input_tokens: "200k"
  max_output_tokens: "50k"
  tool_call_weight: 0.1      # tool calls cost 0.1 iterations

# ── Verdict history ──────────────────────────────────
history:
  enabled: true
  storage: "git"
  max_verdicts: 1000
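
As a rough illustration of how the Tier 2 caps might be interpreted, the sketch below parses values such as "5m" and "200k" and applies `tool_call_weight`. The helper names are hypothetical, and the accounting rule (each LLM turn costs 1 iteration, each tool call costs an extra `tool_call_weight`) is an assumption based on the config comments, not a confirmed description of the evaluator.

```python
import re

def parse_count(value: str) -> int:
    """Parse token caps such as '200k' or '50k' into integers."""
    match = re.fullmatch(r"(\d+)([kKmM]?)", value.strip())
    if match is None:
        raise ValueError(f"unrecognised count: {value!r}")
    scale = {"": 1, "k": 1_000, "m": 1_000_000}[match.group(2).lower()]
    return int(match.group(1)) * scale

def parse_duration(value: str) -> int:
    """Parse wall-clock caps such as '5m' into seconds."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    return int(match.group(1)) * {"s": 1, "m": 60, "h": 3600}[match.group(2)]

def iterations_used(llm_turns: int, tool_calls: int, tool_call_weight: float = 0.1) -> float:
    """Weighted iteration count compared against evaluator.max_iterations."""
    return llm_turns + tool_calls * tool_call_weight


if __name__ == "__main__":
    print(parse_duration("5m"))     # 300
    print(parse_count("200k"))      # 200000
    print(iterations_used(10, 20) <= 25)
```

Under this reading, an evaluation with 10 reasoning turns and 20 tool calls would consume about 12 of the 25 allowed iterations.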

Tech Stack

  • Language: Python 3.10+
  • Dependencies: mcp, pyyaml, requests, packaging (4 packages)
  • MCP Transport: stdio (26 tools)
  • Config: YAML in .gitreins/ directory
  • Evaluator Default Model: DeepSeek V4 Flash (~$0.01/eval)
  • Test suite: ~410 tests, real LLM integration tests included

Architecture & Docs

Document                    What it covers
Full Architecture           System design and data flow
Component Map               Module inventory with paths and line counts
Agentic Evaluator Design    How the evaluator loop works

License

MIT
