Gatehouse

Error-driven code schema enforcement for LLMs writing Python.

Gatehouse validates Python files against structural rules and blocks non-compliant code before it runs. It's designed for agentic coding environments where LLMs write code — Cursor, Claude Code, Windsurf, Aider, or raw API prompts.

LLMs are unreliable at following instructions but extremely reliable at fixing errors. Gatehouse exploits this by turning your coding standards into deterministic error messages with exact fix instructions.

LLM writes code → Gatehouse blocks it → error says exactly what to fix
→ LLM fixes → Gatehouse checks again → compliant code runs

Install

pip install gatehouse

Requires Python 3.9+. The only dependency is pyyaml.


Quick Start

1. Set up the shim

Add to your shell profile (~/.bashrc or ~/.zshrc):

export GATE_HOME="$(pip show gatehouse | grep Location | cut -d' ' -f2)"
alias python="$GATE_HOME/python_gate"

Or if you installed from source:

export GATE_HOME="/path/to/gatehouse"
alias python="$GATE_HOME/python_gate"

2. Initialize a project

cd my-project
gatehouse init --schema production

This creates a single .gate_schema.yaml file in your project. Nothing else is added.

3. Write code normally

python src/train.py

If the code violates any rules, Gatehouse blocks execution and prints errors with fix instructions. If it passes, it runs normally.


What It Does

Given this code:

import torch

def train():
    learning_rate = 0.001
    for epoch in range(10):
        print(f"Epoch {epoch}")

Gatehouse blocks it (output abridged; two of the six violations are shown):

  File "src/train.py", line 1
    import torch
  StructureError: Missing standard file header
  Fix: Add the following as the first lines of src/train.py:

        # ============================================================================
        # FILE: train.py
        # LOCATION: src/
        # PIPELINE POSITION: <describe where this fits>
        # PURPOSE: <one-line description>
        # ============================================================================

  File "src/train.py", line 4
        learning_rate = 0.001
                        ^^^^^
  HardcodedValueError: numeric literal `0.001` on line 4
  Fix: Move to a HYPERPARAMETERS block as LEARNING_RATE = 0.001

  Schema: production-ready-python (v1.0.0)
  Violations: 6 blocking, 0 warnings
  Execution: BLOCKED

Every error includes the file, line number, the offending code, what's wrong, and exactly how to fix it. The LLM reads these, fixes the code, and tries again.
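
Because the error format is regular, a harness can turn it into structured records before handing it back to the model. The following is a minimal sketch that assumes the output looks exactly like the sample above. `parse_violations` and the `Violation` fields are illustrative names, not part of Gatehouse's API.

```python
import re
from dataclasses import dataclass

# Line shapes assumed from the sample output above.
LOCATION_RE = re.compile(r'^\s*File "(?P<file>[^"]+)", line (?P<line>\d+)\s*$')
ERROR_RE = re.compile(r"^\s*(?P<kind>\w+(?:Error|Warning)): (?P<message>.+)$")
FIX_RE = re.compile(r"^\s*Fix: (?P<fix>.+)$")


@dataclass
class Violation:
    file: str
    line: int
    kind: str = ""
    message: str = ""
    fix: str = ""  # first line of the fix instruction only


def parse_violations(output: str) -> list:
    """Group 'File ...', 'XError: ...' and 'Fix: ...' lines into records."""
    violations = []
    current = None
    for raw in output.splitlines():
        location = LOCATION_RE.match(raw)
        if location:
            # A new 'File "...", line N' header starts a new violation.
            current = Violation(location["file"], int(location["line"]))
            violations.append(current)
            continue
        if current is None:
            continue
        error = ERROR_RE.match(raw)
        fix = FIX_RE.match(raw)
        if error and not current.kind:
            current.kind, current.message = error["kind"], error["message"]
        elif fix and not current.fix:
            current.fix = fix["fix"]
    return violations
```

Multi-line fix instructions (such as the header template) are deliberately not captured here; only the first `Fix:` line is kept.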


How It Works

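Gatehouse intercepts python calls through a bash shim: the alias from Quick Start points python at python_gate, which validates the target file before handing off to the real interpreter. The LLM has no reason to work around the shim because it never sees it; it only sees the errors.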
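
The real shim is a bash script. The Python sketch below only illustrates the control flow described here: validate first, and hand off to the real interpreter only if nothing blocks. `gated_run` and its `validate` callback are hypothetical names, and the exit code 1 for a blocked run is an assumption.

```python
import subprocess
import sys
from typing import Callable, Sequence

# validate(path) returns (ok, error_text). A real setup would call the
# Gatehouse engine; it is injected here so the flow can be shown in isolation.
Validator = Callable[[str], tuple]


def gated_run(argv: Sequence[str], validate: Validator) -> int:
    """Run `python <argv...>` only if the target script passes validation."""
    if argv and argv[0].endswith(".py"):
        ok, errors = validate(argv[0])
        if not ok:
            # Blocked: show the errors and never start the interpreter.
            print(errors, file=sys.stderr)
            return 1  # assumed exit code for a blocked run
    # Passed (or nothing to validate, e.g. `python -c ...`): run normally.
    return subprocess.run([sys.executable, *argv]).returncode
```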


Schemas

Schemas are rule sets. Pick one when initializing a project:

Schema        Rules                          Use Case
production    10 rules, mostly blocking      Production source code
exploration   2 rules, warnings only         Scratch scripts, experiments
api           Production + API route rules   FastAPI / Flask services
minimal       1 rule, warning only           Just catch hardcoded values

gatehouse init --schema exploration

Rules

Each rule is a single YAML file. The production schema includes:

Rule                   Checks For                                Severity
file-header            Standard header block at top of file      Block
module-docstring       Module docstring with required sections   Block
no-hardcoded-values    Magic numbers buried in code              Block
function-docstrings    Docstring on every function               Block
main-guard             if __name__ == "__main__": guard          Block
hyperparameter-block   UPPER_SNAKE_CASE constants                Block
max-file-length        File length under 1000 lines              Block
rich-progress          Progress tracking on loops                Warn
imports-present        Import statements exist                   Warn

Block = code cannot run. Warn = warning shown, code still runs.
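
The Block/Warn distinction reduces to one rule: execution is blocked if at least one violation has blocking severity. A minimal sketch, where the dict shape mirrors the `violations` entries shown under Telemetry and `summarize` is an illustrative name rather than an engine function:

```python
from collections import Counter


def summarize(violations):
    """Count violations by severity and decide whether execution is blocked."""
    counts = Counter(v["severity"] for v in violations)
    return {
        "blocking": counts["block"],
        "warnings": counts["warn"],
        # A single blocking violation is enough to stop the run.
        "blocked": counts["block"] > 0,
    }
```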


Customization

Override rules per project

Edit .gate_schema.yaml in your project root:

schema: "production"

rule_overrides:
  "main-guard":
    severity: "off"              # Disable a rule
  "function-docstrings":
    severity: "warn"             # Downgrade from block to warn
  "max-file-length":
    params:
      max_lines: 500             # Make stricter

overrides:
  "scripts/":
    schema: "exploration"        # Different schema for a folder
  "tests/":
    schema: null                 # No checking
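
How the engine resolves these settings for a given file is not spelled out here. One plausible reading is longest-prefix matching on the `overrides` keys, falling back to the top-level `schema`. The sketch below implements that reading on an already-parsed config dict; `schema_for` is a hypothetical helper and the prefix-matching rule is an assumption.

```python
from pathlib import PurePosixPath
from typing import Optional

# The YAML above, as it would look once parsed into Python objects.
CONFIG = {
    "schema": "production",
    "overrides": {
        "scripts/": {"schema": "exploration"},
        "tests/": {"schema": None},
    },
}


def schema_for(path: str, config: dict) -> Optional[str]:
    """Return the schema name for `path`, or None if it is not checked."""
    rel = PurePosixPath(path).as_posix()
    overrides = config.get("overrides", {})
    # Assumption: the longest matching folder prefix wins.
    matches = [prefix for prefix in overrides if rel.startswith(prefix)]
    if not matches:
        return config.get("schema")
    override = overrides[max(matches, key=len)]
    if "schema" in override:
        return override["schema"]
    return config.get("schema")
```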

Create custom rules

Using the CLI (interactive, no YAML knowledge needed):

gatehouse new-rule

Or manually — create a YAML file in the rules/ directory:

# rules/no-todo-comments.yaml
name: "No TODO Comments"
description: "Disallow TODO comments in production code"

check:
  type: "pattern_exists"
  pattern: "# TODO"
  location: "anywhere"

error:
  message: "StyleWarning: TODO comment found on line {line}"
  fix: "Resolve the TODO or move it to a tracking issue"

defaults:
  severity: "warn"
  enabled: true

Then reference it in a schema:

rules:
  - id: "no-todo-comments"
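
To make the rule file's fields concrete, here is a sketch of how a `pattern_exists` check like this one could be evaluated. It is not the engine's implementation: it assumes the pattern is a regular expression, that every matching line counts as a finding (as the error message suggests), and that `{line}` is filled with `str.format`. `apply_pattern_rule` is a hypothetical name.

```python
import re

# The rule file above, as it would look once parsed into Python objects.
RULE = {
    "name": "No TODO Comments",
    "check": {"type": "pattern_exists", "pattern": "# TODO", "location": "anywhere"},
    "error": {
        "message": "StyleWarning: TODO comment found on line {line}",
        "fix": "Resolve the TODO or move it to a tracking issue",
    },
    "defaults": {"severity": "warn", "enabled": True},
}


def apply_pattern_rule(rule: dict, source: str) -> list:
    """Return one finding per source line that matches the rule's pattern."""
    check = rule["check"]
    if check["type"] != "pattern_exists":
        raise ValueError("this sketch only handles pattern_exists checks")
    if not rule["defaults"].get("enabled", True):
        return []
    pattern = re.compile(check["pattern"])
    findings = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        if pattern.search(text):
            findings.append({
                "line": lineno,
                "severity": rule["defaults"]["severity"],
                "message": rule["error"]["message"].format(line=lineno),
                "fix": rule["error"]["fix"],
            })
    return findings
```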

Built-in check types

Type                          What It Does
pattern_exists                Match a regex or string at a location
ast_node_exists               Check for an AST node (docstring, import, class)
ast_check                     Parameterized AST checks (all functions have docstrings, etc.)
token_scan                    Tokenizer-level scan (hardcoded literals, log calls)
uppercase_assignments_exist   Module-level constant detection
docstring_contains            Required sections in docstrings
file_metric                   Line count, function count, import count thresholds
custom                        Inline Python expression or external plugin file
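
As a rough illustration of what two of these check types involve, the sketch below uses only the standard library: `tokenize` for a token-level scan of numeric literals and `ast` for a docstring check. The function names are hypothetical and the engine's own checks may work differently; a real hardcoded-value rule would likely need exemptions that this scan does not have.

```python
import ast
import io
import tokenize


def numeric_literals(source: str) -> list:
    """token_scan-style check: (line, text) for every numeric literal."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [(tok.start[0], tok.string) for tok in tokens if tok.type == tokenize.NUMBER]


def functions_missing_docstrings(source: str) -> list:
    """ast_check-style check: (line, name) for functions without a docstring."""
    tree = ast.parse(source)
    return [
        (node.lineno, node.name)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and ast.get_docstring(node) is None
    ]
```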

Standalone Usage

Use Gatehouse with any LLM API in a validation loop:

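import subprocess

# Placeholder: in practice this is the Python source returned by your LLM call.
code_from_llm = 'print("hello")\n'

# Send the candidate code to the engine on stdin instead of writing it to disk first.
result = subprocess.run(
    ["python3", "-m", "gate_engine", "--stdin",
     "--schema", ".gate_schema.yaml",
     "--filename", "src/train.py"],
    input=code_from_llm,
    capture_output=True, text=True
)

if result.returncode == 0:
    # Code passed: save it
    with open("src/train.py", "w") as f:
        f.write(code_from_llm)
else:
    # Code failed: feed errors back to the LLM
    errors = result.stderr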

See examples/standalone_usage.py for a complete working example.
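
The snippet above covers a single validation pass. A full loop might look like the sketch below, which retries until the code passes or an attempt budget runs out. `fix_loop`, `gatehouse_check`, and the `generate` callback (your LLM call) are hypothetical names; only the `gate_engine` command line is taken from the example above.

```python
import subprocess
from typing import Callable, Optional


def gatehouse_check(code: str, filename: str = "src/train.py",
                    schema: str = ".gate_schema.yaml") -> tuple:
    """Validate `code` with the command shown above; returns (ok, errors)."""
    result = subprocess.run(
        ["python3", "-m", "gate_engine", "--stdin",
         "--schema", schema, "--filename", filename],
        input=code, capture_output=True, text=True,
    )
    return result.returncode == 0, result.stderr


def fix_loop(generate: Callable[[Optional[str]], str],
             check: Callable[[str], tuple],
             max_attempts: int = 3) -> Optional[str]:
    """Call generate(feedback) until check(code) passes or attempts run out."""
    feedback = None  # the first attempt has no errors to react to
    for _ in range(max_attempts):
        code = generate(feedback)
        ok, errors = check(code)
        if ok:
            return code
        feedback = errors  # hand the error text back to the model
    return None  # still non-compliant after max_attempts
```

Passing the checker in as a parameter keeps the loop testable without Gatehouse installed; in real use, `check` would be `gatehouse_check`.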


CLI Reference

gatehouse init --schema <name>          # Initialize a project
gatehouse list-rules                    # List all available rules
gatehouse list-rules --schema <name>    # List rules in a specific schema
gatehouse new-rule                      # Create a rule interactively
gatehouse test-rule <rule> <file>       # Test a rule against a file
gatehouse disable-rule <rule> --schema <name>  # Disable a rule in a schema

Telemetry

Every scan is logged to logs/gate/violations.jsonl in your project directory. Each line is a single JSON object (shown here pretty-printed for readability):

{
  "timestamp": "2026-02-14T14:57:00Z",
  "file": "src/train.py",
  "schema": "production",
  "status": "rejected",
  "violations": [{"rule": "file-header", "severity": "block", "line": 1}],
  "passed_rules": ["max-file-length", "imports-present"],
  "scan_ms": 7
}

Useful for tracking which rules get violated most, measuring LLM compliance over time, and generating fine-tuning data.
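
For example, finding the most frequently violated rules takes a few lines. This sketch assumes each log line has the shape shown above; `violation_counts` is an illustrative helper, not part of Gatehouse.

```python
import json
from collections import Counter
from typing import Iterable


def violation_counts(lines: Iterable[str]) -> Counter:
    """Count how often each rule appears in the `violations` lists."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        for violation in record.get("violations", []):
            counts[violation["rule"]] += 1
    return counts


# Typical use against the log file:
#     with open("logs/gate/violations.jsonl") as log:
#         print(violation_counts(log).most_common(5))
```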


Architecture

gatehouse/
├── gate_engine.py          # Core engine — fixed runtime, never changes for rule changes
├── python_gate             # Bash shim — intercepts python calls at OS level
├── cli/                    # Interactive CLI for rule management
├── rules/                  # One YAML file per rule (12 built-in)
├── schemas/                # Schema manifests — assemble rules into sets
├── plugins/                # Custom check plugins (Python files)
└── examples/               # Example configs and usage scripts

The engine is fixed. All behavior comes from YAML rule files and schema manifests. Adding a rule means adding a YAML file. Removing a rule means deleting a line from a schema. The engine code never changes.


License

MIT
