Local RAG-based code review CLI. No API keys. Runs fully on your machine.

codereview

A local, privacy-first code review CLI tool powered by RAG and a local LLM. No API keys. No data leaves your machine.

pip install codereview-local
codereview your_file.py

How it works

Most code review tools send your code to a remote API. This one runs entirely on your machine.

It uses a RAG (Retrieval-Augmented Generation) pipeline to select the most relevant parts of your code before sending them to a local LLM for review. Because only the retrieved chunks reach the model, large codebases can be reviewed without overflowing its context window.

your code
    │
    ▼
tree-sitter parses into functions/classes
    │
    ▼
sentence-transformers converts chunks to vectors
    │
    ▼
ChromaDB stores all vectors in memory
    │
    ▼
semantic queries retrieve the most relevant chunks
("security vulnerabilities", "missing error handling", ...)
    │
    ▼
local LLM reviews only what matters
    │
    ▼
actionable feedback printed to terminal
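
The embed-and-retrieve steps in the diagram can be illustrated with a minimal sketch. It is not the tool's code: a bag-of-words count vector stands in for the sentence-transformers embedding, a plain list stands in for ChromaDB, and the names `embed`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "def run_query(user_id): return db.execute('select * from users where id=' + user_id)",
    "def add(a, b): return a + b",
    "def read_file(path): return open(path).read()",
]

# Only the best-matching chunk would be passed on to the LLM.
top = retrieve("open file path without closing", chunks, k=1)
print(top)
```

A real embedding model matches on meaning rather than shared words, so it can surface a chunk even when the query and the code use different vocabulary.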

Features

  • Fully local — runs on your machine, no API keys, no data sent anywhere
  • RAG pipeline — semantic retrieval finds the most relevant code across your entire project
  • AST-based chunking — splits by functions and classes using tree-sitter, not arbitrary character counts
  • Multi-query retrieval — five semantic queries cast different nets across your codebase
  • Any file type — works on Python, JavaScript, JSX, and anything else
  • Directory support — review an entire project at once
  • Streaming output — see the review as it generates, token by token
  • GPU accelerated — embedding model uses CUDA automatically if available
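
To make the AST-based chunking idea concrete, here is a rough sketch that splits a file at function and class boundaries. It uses Python's standard-library `ast` module as a stand-in for tree-sitter, so it only handles Python source; the function name `chunk_python` and the chunk fields are assumptions made for illustration.

```python
import ast


def chunk_python(source: str) -> list[dict]:
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                # Exact source text of the definition, body included.
                "text": ast.get_source_segment(source, node),
            })
    return chunks


SAMPLE = '''\
import os

def divide(a, b):
    return a / b

class Repo:
    def load(self, path):
        return open(path).read()
'''

chunks = chunk_python(SAMPLE)
for c in chunks:
    print(c["name"], c["start_line"], c["end_line"])
```

Keeping the line range with each chunk is what would let a review refer back to specific lines, and splitting on definitions keeps each chunk a complete unit instead of cutting a function in half.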

Requirements

  • Python 3.10+
  • Ollama installed and running
  • A coding model pulled in Ollama
ollama pull qwen3-coder:latest
# or a smaller/faster option:
ollama pull deepseek-coder:6.7b

Installation

pip install codereview-local

Or from source:

git clone https://github.com/Muhammad-NSQ/codereview
cd codereview
pip install -e .

Usage

Review a single file:

codereview path/to/file.py

Review an entire directory:

codereview path/to/project/

Use a different model:

codereview path/to/file.py --model deepseek-coder:6.7b

Example output

$ codereview app/auth.py

📂 Indexing app/auth.py...
   3 chunks indexed
🔎 Running semantic retrieval...
🤖 Reviewing with LLM...

## Critical Security Issues

**SQL Injection Vulnerability**
- Line 3: Direct string concatenation in SQL query
- Fix: Use parameterized queries: db.query("SELECT * FROM users WHERE id = ?", (id,))

**Hardcoded Credentials**
- Line 2: Database password exposed in plain text
- Fix: Use environment variables or a secrets manager

## Runtime Errors

**Division by Zero**
- Line 12: No check for b == 0 before division
- Fix: Add validation: if b == 0: raise ValueError("Cannot divide by zero")

## Bad Practices

**Resource Leak**
- Line 7: File handle opened but never closed
- Fix: Use context manager: with open(path) as f:

Tech stack

| Component   | Library               | Purpose                         |
|-------------|-----------------------|---------------------------------|
| CLI         | Typer                 | Command line interface          |
| AST parsing | tree-sitter           | Split code by functions/classes |
| Embeddings  | sentence-transformers | Convert code to vectors         |
| Vector DB   | ChromaDB              | Store and search embeddings     |
| LLM         | Ollama                | Local language model inference  |
| HTTP        | requests              | Talk to Ollama API              |
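
As an illustration of how the HTTP and LLM layers could fit together for streaming output, here is a hedged sketch. It assumes Ollama's `/api/generate` endpoint with `"stream": true`, which returns one JSON object per line; whether codereview uses this endpoint or another is not stated here. The helper names `build_payload`, `iter_tokens`, and `stream_review` are hypothetical.

```python
import json
from typing import Iterable, Iterator


def build_payload(model: str, prompt: str) -> dict:
    """Request body for a streaming generation call."""
    return {"model": model, "prompt": prompt, "stream": True}


def iter_tokens(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from newline-delimited JSON events."""
    for raw in lines:
        if not raw.strip():
            continue  # skip keep-alive blank lines
        event = json.loads(raw)
        yield event.get("response", "")
        if event.get("done"):
            break


def stream_review(model: str, prompt: str,
                  url: str = "http://localhost:11434/api/generate") -> None:
    """Print a review as it is generated (requires a running Ollama)."""
    import requests  # imported lazily so the parser above runs without it

    with requests.post(url, json=build_payload(model, prompt),
                       stream=True, timeout=300) as resp:
        resp.raise_for_status()
        for token in iter_tokens(resp.iter_lines()):
            print(token, end="", flush=True)


# Simulated stream, so the parsing logic can be tried without a server.
fake_stream = [
    b'{"response": "Line 3: ", "done": false}',
    b"",
    b'{"response": "SQL injection", "done": false}',
    b'{"response": "", "done": true}',
]
text = "".join(iter_tokens(fake_stream))
print(text)
```

Separating the line parser from the network call keeps the streaming logic testable without a model loaded.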

Why RAG for code review?

The naive approach is to dump the entire file into the LLM, which breaks down on large codebases: a 2000-line file with 80 functions can exceed the context window of a smaller local model.

The RAG approach is to index everything, retrieve only what is relevant, and send a focused context to the LLM. Five semantic queries target different problem categories:

  • Security vulnerabilities and injection attacks
  • Missing error handling and uncaught exceptions
  • Resource leaks and connection management
  • Bad practices and code smells
  • Input validation and type safety
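
The multi-query idea can be sketched as follows: run each query separately, keep the top matches per query, then merge the results so a chunk found by several queries appears only once. This is an illustrative sketch, not the tool's retriever. A word-overlap score stands in for embedding similarity, and `score`, `multi_query_retrieve`, and the chunk ids are made up for the example.

```python
import re

QUERIES = [
    "security vulnerabilities and injection attacks",
    "missing error handling and uncaught exceptions",
    "resource leaks and connection management",
    "bad practices and code smells",
    "input validation and type safety",
]


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))


def score(query: str, chunk: str) -> float:
    """Toy relevance: fraction of query words that appear in the chunk."""
    q = tokens(query)
    return len(q & tokens(chunk)) / len(q)


def multi_query_retrieve(chunks: dict[str, str], queries: list[str], k: int = 2) -> list[str]:
    """Union of each query's top-k chunk ids, deduplicated, best score first."""
    best: dict[str, float] = {}
    for q in queries:
        ranked = sorted(chunks, key=lambda cid: score(q, chunks[cid]), reverse=True)[:k]
        for cid in ranked:
            s = score(q, chunks[cid])
            if s > 0:  # drop chunks that do not match this query at all
                best[cid] = max(best.get(cid, 0.0), s)
    return sorted(best, key=best.get, reverse=True)


chunks = {
    "db.py::find_user": "def find_user(uid):  # injection risk\n"
                        "    return db.query('SELECT * FROM users WHERE id=' + uid)",
    "io.py::load": "def load(path):  # resource opened, connection never closed\n"
                   "    return open(path).read()",
    "math.py::add": "def add(a, b):\n    return a + b",
}

selected = multi_query_retrieve(chunks, QUERIES)
print(selected)
```

Chunks that match none of the queries, such as the `add` function here, never reach the LLM, which is what keeps the prompt small.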

Chunks from all files share one ChromaDB collection, so they compete for retrieval across your entire codebase rather than file by file.
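
A small sketch of what a shared index means in practice: chunks from every file go into one flat list with file-qualified ids, and ranking runs over the whole list at once. This is only an approximation of the idea. A plain list stands in for the ChromaDB collection, blank-line splitting stands in for tree-sitter, keyword counting stands in for semantic similarity, and `build_index` and `top_k` are hypothetical names.

```python
import tempfile
from pathlib import Path


def build_index(root: Path, suffixes=(".py", ".js", ".jsx")) -> list[dict]:
    """Collect chunks from every matching file under root into one flat list."""
    index = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            rel = path.relative_to(root).as_posix()
            text = path.read_text(encoding="utf-8", errors="replace")
            # Stand-in chunker: split on blank lines.
            blocks = [b for b in text.split("\n\n") if b.strip()]
            for i, block in enumerate(blocks):
                index.append({"id": f"{rel}::{i}", "file": rel, "text": block})
    return index


def top_k(index: list[dict], keyword: str, k: int) -> list[str]:
    """Rank all chunks together by how often a keyword appears (toy scoring)."""
    scored = [(c["text"].count(keyword), c["id"]) for c in index]
    scored = [s for s in scored if s[0] > 0]
    return [cid for _, cid in sorted(scored, key=lambda s: -s[0])[:k]]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("def f():\n    open('x')\n\ndef g():\n    return 1\n")
    (root / "b.py").write_text("def h():\n    open('y'); open('z')\n")
    index = build_index(root)
    hits = top_k(index, "open(", k=2)

print([c["id"] for c in index])
print(hits)
```

Because ranking is global, a file with several strong matches can take more of the retrieved slots than a file with none, instead of every file getting a fixed share.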


Project structure

codereview/
├── codereview/
│   ├── __init__.py
│   ├── chunker.py      # tree-sitter AST parsing
│   ├── embedder.py     # sentence-transformers embeddings
│   ├── retriever.py    # ChromaDB storage and retrieval
│   ├── reviewer.py     # Ollama LLM integration
│   └── cli.py          # Typer CLI and pipeline orchestration
├── main.py
└── setup.py

Author

Muhammad — GitHub
