AI Code Vet / Auto-Fixer - Local CLI that vets AI-generated code in a Docker sandbox with Ollama-powered test generation and auto-fixing

codevet

AI Code Vet / Auto-Fixer — Catch the bugs AI coding tools miss.

AI tools like Claude Code, Cursor, and Copilot generate code that's almost right, which is the worst kind of wrong. Subtle bugs, missed edge cases, security holes, and logic drift can take longer to debug than writing the code from scratch.

codevet fixes this. One command vets any AI-generated file: it auto-generates targeted tests, runs them in a hardened Docker sandbox, auto-fixes failures with up to 3 LLM iterations, and gives you a confidence score. Everything runs locally — your code never leaves your machine.

What you get
Install for dummies (step by step)
First run
Configuration (codevet.yaml)
Picking the right model
CLI reference
How it works
Security model
Credits
Development
License

What you get

$ codevet fix app.py

Vetting app.py...
Found issues: 6 failed, 0 errors. Auto-fixing...
Iteration 1: 4/6 passing
Iteration 2: 6/6 passing — fix succeeded.

CodeVet Results for app.py
Model: qwen2.5-coder:14b | Duration: 47.3s

Tests: 6 passed, 0 failed (6 total)
Fix: Success in 2 iteration(s)

┌─ Confidence ─────────────────────────┐
│  A  92/100  (Excellent confidence)   │
└──────────────────────────────────────┘

┌─ Diff ───────────────────────────────┐
│ - def authenticate(user, pw):        │
│ -     query = f"SELECT * FROM ..."   │
│ + def authenticate(user, pw):        │
│ +     query = "SELECT * FROM ..."    │
│ +     cursor.execute(query, (user,)) │
└──────────────────────────────────────┘

Install for dummies (step by step)

Codevet has three external dependencies: Python, Docker, and Ollama. The steps below assume you have none of them installed. If you already have one, just skip that section.

1. Install Python 3.11+

Platform	Instructions
Windows	Download from python.org. During install, tick "Add python.exe to PATH".
macOS	`brew install python@3.11` (install Homebrew from brew.sh first if needed).
Linux	`sudo apt install python3.11 python3.11-venv` (Debian/Ubuntu) or `sudo dnf install python3.11` (Fedora).

Verify:

python --version          # should print 3.11.x or later

2. Install Docker

Codevet runs all generated tests inside a Docker container so they cannot touch your real machine. You need Docker installed and running.

Platform	Instructions
Windows	Install Docker Desktop. Launch it once and wait for the whale icon to turn solid in the system tray.
macOS	Install Docker Desktop, or `brew install --cask docker` then launch the app.
Linux	Follow the official guide for your distro: docs.docker.com/engine/install. Add yourself to the `docker` group so you don't need `sudo`: `sudo usermod -aG docker $USER` then log out and back in.

Verify:

docker version            # should print Server: Docker Engine ...
docker ps                 # should NOT print 'Cannot connect'

3. Install Ollama and pull a model

Ollama runs the LLM locally. Codevet talks to it over http://127.0.0.1:11434.

Platform	Instructions
Windows / macOS	Download the installer from ollama.com/download and run it. Ollama starts as a background service automatically.
Linux	`curl -fsSL https://ollama.com/install.sh | sh`

Pull a model. The default is qwen2.5-coder:7b (4.7 GB, needs ~8 GB free RAM):

ollama pull qwen2.5-coder:7b

Verify:

ollama list               # should list qwen2.5-coder:7b

Don't know which model to pull? See Picking the right model below. The short answer: 7b for 16 GB laptops, 14b for desktops.

4. Install codevet

We recommend uv, which is faster than pip and handles the virtualenv for you. If you prefer pip, the same commands work without uv run.

# Clone the repo
git clone https://github.com/stevesolun/codevet.git
cd codevet

# Option A: with uv (recommended)
uv sync                                # installs codevet + deps in .venv
uv run codevet --help                  # verify it works

# Option B: with pip
python -m venv .venv
source .venv/bin/activate              # Windows: .venv\Scripts\activate
pip install -e .
codevet --help

That's it. Codevet is installed.

First run

Create a buggy Python file to test against:

# buggy.py
def authenticate(username, password):
    query = f"SELECT * FROM users WHERE name='{username}' AND pass='{password}'"
    return query

def get_element(lst, index):
    return lst[index + 1]

def calculate_average(numbers):
    return sum(numbers) / len(numbers)

Run codevet:

uv run codevet fix buggy.py

What happens:

Preflight check — codevet downloads llmfit (see Credits) and verifies that your hardware can run the configured model. If the model is too large for your RAM, codevet stops with an error before trying to load it.
Test generation — Ollama generates 6-12 targeted pytest cases (SQL injection probes, edge cases, performance traps).
Sandbox execution — Tests run inside the Docker sandbox. Codevet reports how many failed.
Auto-fix loop — If any tests failed, codevet asks the LLM for a fix and re-runs the tests. Up to 3 iterations.
Confidence score — A weighted blend of test pass-rate and LLM self-critique, graded A through F.
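
The sandbox and auto-fix steps above amount to a bounded retry loop. The sketch below is only an illustration of that idea, not codevet's actual implementation: `run_tests` and `propose_fix` are hypothetical callables standing in for the Docker sandbox run and the Ollama fix request, and the "keep a patch only if it does not regress" rule is an assumption of this sketch.

```python
from typing import Callable, Tuple

MAX_ITERATIONS = 3  # the documented hard cap on the auto-fix loop


def fix_loop(
    source: str,
    run_tests: Callable[[str], Tuple[int, int]],  # hypothetical: returns (passed, total)
    propose_fix: Callable[[str], str],            # hypothetical: returns patched source
    max_iterations: int = MAX_ITERATIONS,
) -> Tuple[str, int, int, int]:
    """Return (final_source, passed, total, iterations_used)."""
    passed, total = run_tests(source)
    iterations = 0
    # Never exceed the hard cap, even if the caller asks for more.
    limit = min(max_iterations, MAX_ITERATIONS)
    while passed < total and iterations < limit:
        iterations += 1
        candidate = propose_fix(source)
        cand_passed, cand_total = run_tests(candidate)
        # Assumption of this sketch: keep a candidate only if it does not regress.
        if cand_passed >= passed:
            source, passed, total = candidate, cand_passed, cand_total
    return source, passed, total, iterations
```

With stub callables, the loop stops as soon as every test passes, or after three iterations, whichever comes first.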

First run is slow. The Docker sandbox image (codevet-sandbox:0.1.0) is built locally on first invocation. Subsequent runs reuse the cached image.

Configuration (`codevet.yaml`)

Codevet ships with a fully commented example config at the repo root: codevet.yaml.example. Copy it next to your project to customize:

cp codevet.yaml.example codevet.yaml

…or generate a fresh copy anywhere with:

uv run codevet init-config              # writes ./codevet.yaml
uv run codevet init-config --user       # writes ~/.config/codevet/config.yaml

Where codevet looks for config

First match wins:

Priority	Location	When to use
1	`--config <path>` CLI flag	One-off override
2	`./codevet.yaml`	Project-specific settings
3	`~/.config/codevet/config.yaml`	User-global default
4	Built-in defaults	No config file at all
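
The lookup order in the table can be sketched as a first-match search. This is a minimal illustration of the documented priorities, not codevet's own code; the function name `find_config` and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_config(
    cli_path: Optional[str] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[Path]:
    """Return the config file that applies, or None for built-in defaults."""
    cwd = cwd or Path.cwd()
    home = home or Path.home()
    if cli_path is not None:
        # Priority 1: an explicit --config path wins outright.
        return Path(cli_path)
    candidates = [
        cwd / "codevet.yaml",                          # priority 2: project file
        home / ".config" / "codevet" / "config.yaml",  # priority 3: user-global file
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # priority 4: fall back to built-in defaults
```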

What you can configure

Every tunable lives in codevet.yaml.example. Highlights:

Setting	Default	What it does
`model`	`qwen2.5-coder:7b`	Ollama model for test gen + auto-fix
`image`	`python:3.11-slim`	Base Docker image for the sandbox
`timeout_seconds`	`120`	Per-sandbox-run hard timeout (clamped to [5, 1800])
`max_iterations`	`3`	Auto-fix loop cap (hard-capped at 3)
`mem_limit`	`256m`	Container memory cap
`pids_limit`	`64`	Max processes inside the container
`confidence_pass_weight`	`0.7`	Weight for the test pass-rate component of the confidence score
`confidence_critique_weight`	`0.3`	Weight for the LLM self-critique component (must sum to 1.0 with the above)

CLI flags always override config file values.
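
To make the constraints in the table concrete, here is a small sketch of how the timeout clamp, the iteration cap, and the weighted confidence blend could be expressed. The function names are hypothetical, and the assumption that both score inputs are fractions in [0, 1] is this sketch's own; only the defaults and limits come from the table.

```python
import math


def clamp_timeout(seconds: int) -> int:
    """Clamp the sandbox timeout to the documented range [5, 1800]."""
    return max(5, min(1800, seconds))


def cap_iterations(requested: int) -> int:
    """The auto-fix loop is hard-capped at 3 iterations."""
    return max(0, min(3, requested))


def confidence_score(
    pass_rate: float,
    critique: float,
    pass_weight: float = 0.7,
    critique_weight: float = 0.3,
) -> float:
    """Blend two values in [0, 1] into a 0-100 score."""
    if not math.isclose(pass_weight + critique_weight, 1.0):
        raise ValueError("weights must sum to 1.0")
    for value in (pass_rate, critique):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be in [0, 1]")
    return 100.0 * (pass_weight * pass_rate + critique_weight * critique)
```

Under these assumptions, a run where every test passes and the self-critique comes in at 0.8 would score about 94.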

Picking the right model

The model is the single biggest factor in fix quality. By default, codevet runs a hardware-fit preflight before loading any model: if the model won't fit your RAM, you get a clear error before anything tries to load. You can bypass the check with --skip-preflight, but this is not recommended.

Recommended models, by hardware tier:

Hardware	Model	Size	Fix quality	Notes
8 GB RAM, no GPU	`qwen2.5-coder:1.5b`	1.0 GB	⭐⭐	Fast, weak on security bugs
16 GB RAM, no GPU	`qwen2.5-coder:7b`	4.7 GB	⭐⭐⭐	Default — balanced
32 GB RAM, any GPU	`qwen2.5-coder:14b`	9.0 GB	⭐⭐⭐⭐	Best fix quality for most users
64 GB RAM + 16 GB+ GPU	`qwen2.5-coder:32b`	19 GB	⭐⭐⭐⭐⭐	Maximum quality, slow on CPU

Check any model against your hardware first:

uv run codevet preflight qwen2.5-coder:14b

CPU vs GPU timing (real numbers)

On a 16 GB laptop with no GPU acceleration:

Phase	qwen2.5-coder:7b	qwen2.5-coder:14b
Test generation (single Ollama call)	~4 min	~8 min
Each fix iteration	~5-7 min	~10-15 min
Sandbox execution	<2 sec	<2 sec
Full pipeline (gen + 3 iters + critique)	~25 min	~45 min

If you have a GPU with sufficient VRAM, expect these times to be roughly 5-10x shorter.
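
As a rough cross-check of the "full pipeline" row, the total is approximately one test-generation call plus three fix iterations. The helper below is a back-of-the-envelope sketch with a hypothetical name; it leaves out the critique step, whose duration the table does not list, and treats the GPU speedup as a simple divisor.

```python
from typing import Tuple


def estimate_minutes(
    gen: float,
    fix_low: float,
    fix_high: float,
    iterations: int = 3,
    speedup: float = 1.0,
) -> Tuple[float, float]:
    """Return a (low, high) estimate in minutes, excluding the critique step."""
    low = (gen + iterations * fix_low) / speedup
    high = (gen + iterations * fix_high) / speedup
    return low, high


# qwen2.5-coder:7b on CPU: ~4 min generation, ~5-7 min per fix iteration
print(estimate_minutes(4, 5, 7))    # (19.0, 25.0)
# qwen2.5-coder:14b on CPU: ~8 min generation, ~10-15 min per fix iteration
print(estimate_minutes(8, 10, 15))  # (38.0, 53.0)
```

Both ranges are consistent with the ~25 min and ~45 min figures in the table.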

CLI reference

codevet fix <file.py> [OPTIONS]
codevet init-config [OPTIONS]
codevet preflight <model>

`codevet fix`

Flag	Description	Default
`--config`, `-c`	Path to a `codevet.yaml` file (overrides discovery)	—
`--model`, `-m`	Override the configured Ollama model	from config
`--timeout`, `-t`	Override the sandbox timeout in seconds	from config
`--max-iterations`	Override the fix-loop cap	from config
`--image`	Override the Docker base image	from config
`--json`, `-j`	Emit machine-readable JSON instead of pretty output	`false`
`--diff`, `-d`	Read a unified diff from a file instead of a Python file	—
`--verbose`, `-v`	Show INFO-level phase logging	`false`
`--skip-preflight`	Skip the llmfit hardware-fit check (not recommended)	`false`
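
If you drive codevet from a script, the flags above can be assembled into an argument list. This is a minimal sketch: `build_fix_command` is a hypothetical helper, not part of codevet, and it only uses flags listed in the table.

```python
from typing import List, Optional


def build_fix_command(
    path: str,
    model: Optional[str] = None,
    timeout: Optional[int] = None,
    json_output: bool = False,
    skip_preflight: bool = False,
) -> List[str]:
    """Build an argv list for `codevet fix` from the documented flags."""
    cmd = ["codevet", "fix", path]
    if model is not None:
        cmd += ["--model", model]
    if timeout is not None:
        cmd += ["--timeout", str(timeout)]
    if json_output:
        cmd.append("--json")
    if skip_preflight:
        cmd.append("--skip-preflight")
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`. The structure of the `--json` output is not described here, so inspect it before relying on specific fields.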

`codevet init-config`

Flag	Description	Default
`--user`	Write to `~/.config/codevet/config.yaml` instead of `./codevet.yaml`	`false`
`--force`, `-f`	Overwrite an existing file	`false`

`codevet preflight`

Verify a model will fit your hardware before configuring it. Powered by llmfit. See Credits.

codevet preflight qwen2.5-coder:14b

How it works

codevet fix app.py
       │
       ▼
┌────────────┐  ┌──────────┐  ┌────────────┐  ┌──────────┐  ┌─────────┐
│ preflight  │─▶│ vetter   │─▶│ sandbox    │─▶│ fixer    │─▶│ scorer  │
│ llmfit     │  │ Ollama   │  │ Docker     │  │ 3-iter   │  │ 0-100   │
│ hw check   │  │ test gen │  │ isolation  │  │ AST guard│  │ + grade │
└────────────┘  └──────────┘  └────────────┘  └──────────┘  └─────────┘

10 modules: cli · config · preflight · vetter · sandbox · fixer · scorer · models · prompts · utils

The auto-fix loop validates every LLM patch against Python's ast module before executing it — invalid syntax (e.g. markdown-fenced output) is caught and the iteration is skipped instead of crashing the run.
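
The syntax guard described above can be approximated with a single `ast.parse` call. This is a minimal sketch of the technique, with a hypothetical function name, and is not taken from codevet's source.

```python
import ast


def is_valid_python(patch: str) -> bool:
    """Return True if the text parses as Python source, False otherwise."""
    try:
        ast.parse(patch)
    except (SyntaxError, ValueError):
        # SyntaxError covers malformed code such as markdown fences;
        # ValueError can be raised for source containing null bytes.
        return False
    return True
```

A patch that still carries markdown fences fails to parse because a backtick is not a valid Python token, so that iteration can be skipped rather than executed.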

Security model

The Docker sandbox is the security boundary. Every container runs with:

--read-only filesystem
--network none — no internet access at all
--tmpfs /tmp:size=100m — ephemeral 100 MB scratch space
--security-opt no-new-privileges — drops the ability to escalate
--cap-drop=ALL — all Linux capabilities removed
--user 1001 — non-root execution
--memory 256m — memory cap
--pids-limit 64 — process count cap
Hard wall-clock timeout with forced container kill

Your project directory is mounted read-only. Generated tests cannot modify your source files.
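
For reference, the flags listed above could be combined into a single `docker run` invocation roughly as follows. This is a sketch under assumptions: the `--rm` flag, the `/workspace` mount point, and the pytest command line are illustrative choices, not confirmed details of how codevet launches its container. The wall-clock timeout is not a `docker run` flag, so it is left to the caller.

```python
from typing import List


def sandbox_args(project_dir: str, image: str = "codevet-sandbox:0.1.0") -> List[str]:
    """Build a docker run argv list using the documented hardening flags."""
    return [
        "docker", "run", "--rm",                 # --rm is an assumption of this sketch
        "--read-only",                           # read-only root filesystem
        "--network", "none",                     # no network access
        "--tmpfs", "/tmp:size=100m",             # ephemeral 100 MB scratch space
        "--security-opt", "no-new-privileges",   # block privilege escalation
        "--cap-drop=ALL",                        # drop all Linux capabilities
        "--user", "1001",                        # non-root execution
        "--memory", "256m",                      # memory cap
        "--pids-limit", "64",                    # process count cap
        "--volume", f"{project_dir}:/workspace:ro",  # hypothetical read-only mount
        image,
        # Hypothetical test command; the cache plugin is disabled because
        # the mounted project directory is not writable.
        "python", "-m", "pytest", "-q", "-p", "no:cacheprovider", "/workspace",
    ]
```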

Your code is analyzed locally. Nothing is uploaded. Ollama runs on your machine. The only outbound network calls codevet itself makes go to the llmfit upstream on GitHub: a version lookup and the download of the llmfit binary from its release page (cached locally afterwards).

llmfit binary trust model

On first run, codevet downloads the llmfit prebuilt binary from AlexsJones/llmfit releases — the official upstream repository. The binary is MIT-licensed and is never redistributed in this repo.

What codevet does to help you verify it:

The SHA-256 of the downloaded archive is computed and logged at INFO level:

INFO  llmfit archive SHA-256: <hex digest>
INFO  Verify at: https://github.com/AlexsJones/llmfit/releases (download checksum for <asset>)

Run with --verbose to see these lines in your terminal.
You can manually compare the logged digest against the release page on GitHub.
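
If you would rather recompute the digest yourself than rely on the log line, the standard library is enough. The sketch below hashes a file in chunks; the function name is hypothetical, and you need to supply the path to the downloaded archive yourself, since its cache location is not stated here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the returned string against the checksum published on the release page; the two should match character for character.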

To skip the download entirely, use --skip-preflight:

codevet fix app.py --skip-preflight

The check (and binary download) is skipped completely. You lose the hardware-fit warning, but nothing else changes.

Credits

llmfit

Codevet's preflight hardware-fit check is powered by llmfit, an MIT-licensed Rust CLI by Alex Jones that detects host hardware (RAM, CPU, GPU) and scores LLM models against it.

Repo: https://github.com/AlexsJones/llmfit
Author: Alex Jones (@AlexsJones)
License: MIT

Codevet downloads the official prebuilt llmfit binary from upstream releases on first use and caches it locally. The binary is never redistributed in this repository — every install pulls it fresh and stays automatically up-to-date with the latest version (24-hour cache on the version lookup).

Without llmfit, codevet would have no good way to tell users in advance whether their chosen model will actually run on their machine. Huge thanks to Alex for building llmfit and making it MIT-licensed.

If you find llmfit useful, please star the upstream repo.

For full third-party credits, see NOTICE.md.

Development

# Clone and setup
git clone https://github.com/stevesolun/codevet.git
cd codevet
uv sync --dev

# Run the test suite (230 tests)
uv run pytest -q

# Type check
uv run mypy codevet/ --strict

# Lint
uv run ruff check .

Test counts by suite:

Suite	Count
Unit tests	195
Integration tests	25
End-to-end (real Docker + Ollama)	10
Total	230

License

MIT — see LICENSE.

Project details

Release 0.1.0, uploaded Apr 8, 2026.

Download files

File	Type	Size	Tags	Uploaded via	Trusted Publishing
codevet-0.1.0.tar.gz	Source distribution	74.7 kB	Source	twine/6.2.0 CPython/3.11.9	No
codevet-0.1.0-py3-none-any.whl	Built distribution	43.9 kB	Python 3	twine/6.2.0 CPython/3.11.9	No

File hashes

Hashes for codevet-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`a523fd24750288e3293f1b40e47692cd124bc17793ea272acc3bfd934e84a58d`
MD5	`4a0a1c8647b1d90b6d58162494e7aa99`
BLAKE2b-256	`1e0d7094c5a7e3df74983ed86d57fcdcfac0362f4889460b03c319f4974fa737`

Hashes for codevet-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`10c296afbfbe2e4b2610deb344ffa8ffd6f716bda1d7571856824a087b25e198`
MD5	`a0e07bee4695d3fae4d66f47ac714d44`
BLAKE2b-256	`688257c77cd08d12fc8ebcb6b441814eb3b718b1de4ed2fc6b9d1de700c371d5`
