AI code review agent with personality — security-focused MCP server and CI action, built on Agno

Grippy Code Review

Open-source AI code review agent. Your model, your infrastructure, your rules.

Grippy reviews pull requests using any OpenAI-compatible model — GPT, Claude, or a local LLM running on your own hardware. It indexes your codebase into a vector store for context-aware analysis, then posts structured findings with scores, verdicts, and escalation paths. It also happens to be a grumpy security auditor who secretly respects good code.

Why Grippy?

Your model, your infrastructure. Bring your own model. No SaaS dependency, no per-seat fees. Run GPT-5 through OpenAI, Claude through a compatible proxy, or a local model via Ollama or LM Studio.
Codebase-aware, not diff-blind. Grippy embeds your repository into a LanceDB hybrid search index (vector + full-text) and searches it during review. It understands the code around the diff, not just the diff itself. Most OSS alternatives paywall this behind a hosted tier.
Cross-PR memory, not amnesia. Grippy builds a knowledge graph of your codebase — tracking files, reviews, findings, and import dependencies across every PR. It knows which modules are blast-radius risks, which files have recurring findings, and which authors have patterns worth watching. Tools like CodeRabbit, Greptile, and Qodo charge $20–38/seat/month for comparable cross-PR context. Here, it's free and open-source.
Structured output, not just comments. Every review produces typed findings with severity, confidence, and category. A score out of 100. A verdict (PASS / FAIL / PROVISIONAL). Escalation targets for findings that need human attention.
Security-first, not security-added. Grippy is a security auditor that also reviews code, not the other way around. Dedicated audit modes go deeper than a general-purpose linter.
Deterministic rules, not just LLM guesses. A built-in rule engine runs 10 security rules against every diff before the LLM sees it. Rule findings come from deterministic checks on the diff rather than model output, so they are reproducible rather than hallucinated, and the profile gate can fail CI on ERROR or CRITICAL hits (WARN or higher under strict-security), independent of model output.
MCP server, too. Use Grippy as a local diff auditor from Claude Code, Cursor, or Claude Desktop via the Model Context Protocol.
It has opinions. Grippy is a grumpy security auditor persona, not a faceless bot. Good code gets grudging respect. Bad code gets disappointment. The personality keeps reviews readable and honest.

What it looks like

An inline finding on a PR diff:

CRITICAL | security | confidence: 95

SQL injection via string interpolation

query = f"SELECT * FROM users WHERE id = {user_id}" constructs a SQL query from unsanitized input. Use parameterized queries.

grippy_note: I've seen production databases get wiped by less. Parameterize it or I'm telling the security team.

A review summary posted as a PR comment:

Score: 45/100 | Verdict: FAIL | Complexity: STANDARD

3 findings (1 critical, 1 high, 1 medium) | 1 escalation to security-team

"I've reviewed thousands of PRs. This one made me reach for a packet of antacids."
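
The shape of these two outputs can be sketched as plain data types. This is an illustrative model only: the class, field, and method names (`Finding`, `ReviewSummary`, `header`, `headline`) are assumptions made for the example, not Grippy's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Finding:
    """One typed finding (hypothetical field names)."""

    severity: str    # e.g. "CRITICAL", "HIGH", "MEDIUM"
    category: str    # e.g. "security"
    confidence: int  # 0-100
    title: str

    def header(self) -> str:
        # Renders the first line of an inline finding.
        return f"{self.severity} | {self.category} | confidence: {self.confidence}"


@dataclass
class ReviewSummary:
    """Review-level result (hypothetical field names)."""

    score: int       # 0-100
    verdict: str     # "PASS", "FAIL", or "PROVISIONAL"
    complexity: str  # e.g. "STANDARD"
    findings: list[Finding] = field(default_factory=list)

    def headline(self) -> str:
        # Renders the first line of the summary comment.
        return (
            f"Score: {self.score}/100 | Verdict: {self.verdict} "
            f"| Complexity: {self.complexity}"
        )
```

With a finding built from the example above, `header()` yields the `CRITICAL | security | confidence: 95` line and `headline()` yields the score line of the summary.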

Quick start

GitHub Actions (OpenAI)

Add .github/workflows/grippy-review.yml to your repo:

name: Grippy Review

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  contents: read
  pull-requests: write

jobs:
  review:
    name: Grippy Code Review
    runs-on: ubuntu-latest
    steps:
      - uses: step-security/harden-runner@a90bcbc6539c36a85cdfeb73f7e2f433735f215b  # v2.15.0
        with:
          egress-policy: audit

      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6

      - uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405  # v6
        with:
          python-version: '3.12'

      - name: Install Grippy
        run: pip install "grippy-mcp"

      # Cache the vector index to avoid re-indexing on every push
      - name: Cache Grippy data
        uses: actions/cache@cdf6c1fa76f9f475f3d7449005a359c84ca0f306  # v5
        with:
          path: ./grippy-data
          key: grippy-${{ github.event.pull_request.number }}-${{ github.sha }}
          restore-keys: grippy-${{ github.event.pull_request.number }}-

      - name: Run review
        id: review
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITHUB_EVENT_PATH: ${{ github.event_path }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
          GRIPPY_MODEL_ID: gpt-4.1
        run: grippy

Want LLM-only review without the rule engine? Set GRIPPY_PROFILE: general. For stricter gating (fail on WARN+), use strict-security. See examples/ for more workflow variants.

GitHub Actions (self-hosted LLM)

Grippy works with any OpenAI-compatible API endpoint, including Ollama, LM Studio, and vLLM. We recommend Devstral-Small 24B at Q4 quantization or higher — tested during development for structured output compliance and review quality. See the Self-Hosted LLM Guide for full setup instructions.

Local development

# OpenAI (default, included in base install)
pip install "grippy-mcp"

# Anthropic
pip install "grippy-mcp[anthropic]"

# Google (Gemini)
pip install "grippy-mcp[google]"

# Groq
pip install "grippy-mcp[groq]"

# Mistral
pip install "grippy-mcp[mistral]"

# Or with uv
uv add "grippy-mcp[anthropic]"

MCP Server

Quick start (zero install)

uvx grippy-mcp serve

Or install globally:

pip install grippy-mcp
grippy serve

Grippy runs as an MCP server for local git diff auditing — no GitHub Actions required.

Two tools:

| Tool | What it does | LLM required? |
|---|---|---|
| `scan_diff` | Deterministic security rules | No |
| `audit_diff` | Full AI-powered code review | Yes |

Scope options (both tools):

- `"staged"`: staged changes (`git diff --cached`)
- `"commit:<ref>"`: a specific commit (e.g. `"commit:HEAD"`)
- `"range:<base>..<head>"`: a commit range (e.g. `"range:main..HEAD"`)
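
As a rough sketch of how a scope string could map onto a git invocation: only the `staged` to `git diff --cached` mapping comes from the list above. The function name `scope_to_git_args`, the `git show` form used for commits, and the rejection of refs starting with `-` are assumptions for illustration, not Grippy's implementation.

```python
def scope_to_git_args(scope: str) -> list[str]:
    """Translate a scope string into a git command line (illustrative only)."""

    def check_ref(ref: str) -> str:
        # Assumption: reject empty refs and refs that git could read as options.
        if not ref or ref.startswith("-"):
            raise ValueError(f"invalid ref in scope: {scope!r}")
        return ref

    if scope == "staged":
        return ["git", "diff", "--cached"]
    if scope.startswith("commit:"):
        ref = check_ref(scope[len("commit:"):])
        # Assumption: show only the patch introduced by that commit.
        return ["git", "show", "--format=", "--patch", ref]
    if scope.startswith("range:"):
        base, sep, head = scope[len("range:"):].partition("..")
        if not sep:
            raise ValueError(f"range scope needs <base>..<head>: {scope!r}")
        return ["git", "diff", f"{check_ref(base)}..{check_ref(head)}"]
    raise ValueError(f"unknown scope: {scope!r}")
```

Returning an argument list rather than a shell string keeps the ref out of shell parsing if the result is later passed to `subprocess.run`.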

Install into your MCP client:

python -m grippy install-mcp          # registers uvx grippy-mcp in client configs
python -m grippy install-mcp --dev    # dev mode: uses uv run --directory

The installer detects Claude Code, Claude Desktop, and Cursor, then writes the server config with your chosen LLM transport and API keys.

Run the server directly:

python -m grippy serve

MCP tools return dense, structured JSON designed for AI agent consumption — no personality or ASCII art.

Configuration

Grippy is configured entirely through environment variables.

| Variable | Purpose | Default |
|---|---|---|
| `GRIPPY_TRANSPORT` | API transport: `openai`, `anthropic`, `google`, `groq`, `mistral`, or `local` | `local` |
| `GRIPPY_MODEL_ID` | Model identifier | `devstral-small-2-24b-instruct-2512` |
| `GRIPPY_BASE_URL` | API endpoint for local transport | `http://localhost:1234/v1` |
| `GRIPPY_EMBEDDING_MODEL` | Embedding model name | `text-embedding-qwen3-embedding-4b` |
| `GRIPPY_API_KEY` | API key for non-OpenAI endpoints | `lm-studio` |
| `GRIPPY_DATA_DIR` | Persistence directory | `./grippy-data` |
| `GRIPPY_TIMEOUT` | Review timeout in seconds (0 = none) | `300` |
| `GRIPPY_PROFILE` | Security profile: `security`, `strict-security`, `general` | `security` |
| `GRIPPY_MODE` | Review mode override | `pr_review` |
| `OPENAI_API_KEY` | OpenAI API key (sets transport to `openai`) | (none) |
| `GITHUB_TOKEN` | GitHub API token (set automatically by Actions) | (none) |
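
A minimal sketch of how these variables and defaults might be resolved in Python. The `GrippyConfig` class and `load_config` function are hypothetical names, and the precedence shown (an explicit `GRIPPY_TRANSPORT` wins, otherwise a set `OPENAI_API_KEY` selects `openai`) is an assumption; the table only says that `OPENAI_API_KEY` sets the transport.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GrippyConfig:
    """Resolved settings (hypothetical container, not Grippy's own class)."""

    transport: str
    model_id: str
    base_url: str
    embedding_model: str
    api_key: str
    data_dir: str
    timeout: int
    profile: str
    mode: str


def load_config(env: Mapping[str, str] = os.environ) -> GrippyConfig:
    # Assumed precedence: explicit GRIPPY_TRANSPORT, then OPENAI_API_KEY, then "local".
    default_transport = "openai" if env.get("OPENAI_API_KEY") else "local"
    return GrippyConfig(
        transport=env.get("GRIPPY_TRANSPORT", default_transport),
        model_id=env.get("GRIPPY_MODEL_ID", "devstral-small-2-24b-instruct-2512"),
        base_url=env.get("GRIPPY_BASE_URL", "http://localhost:1234/v1"),
        embedding_model=env.get(
            "GRIPPY_EMBEDDING_MODEL", "text-embedding-qwen3-embedding-4b"
        ),
        api_key=env.get("GRIPPY_API_KEY", "lm-studio"),
        data_dir=env.get("GRIPPY_DATA_DIR", "./grippy-data"),
        timeout=int(env.get("GRIPPY_TIMEOUT", "300")),  # 0 means no timeout
        profile=env.get("GRIPPY_PROFILE", "security"),
        mode=env.get("GRIPPY_MODE", "pr_review"),
    )
```

Passing a plain dict instead of `os.environ` makes the resolution easy to check in isolation, for example `load_config({"OPENAI_API_KEY": "sk-test"}).transport`.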

Cross-vendor model selection

If your codebase is co-developed with an AI coding assistant, we strongly recommend running Grippy on a model from a different vendor than the one that wrote the code. Different model families have different training data, different biases, and different blind spots. A reviewer that shares the same priors as the author is more likely to miss the same classes of bugs. Using a cross-vendor model — for example, reviewing GPT-authored code with Claude, or Claude-authored code with GPT — gives you a genuinely independent audit rather than an echo chamber.

Security profiles

Grippy ships with the deterministic rule engine on by default (security profile). Ten rules scan every diff for secrets, dangerous sinks, workflow permission escalation, path traversal, unsanitized LLM output, risky CI scripts, SQL injection, weak cryptography, hardcoded credentials, and insecure deserialization, all before the LLM sees anything. Because these findings come from deterministic checks rather than model output, they are reproducible, not hallucinated.

Switch profiles with the GRIPPY_PROFILE environment variable or the --profile CLI flag. If both are set, the CLI flag takes priority.

| Profile | What happens | Gate behavior | When to use |
|---|---|---|---|
| `security` (default) | Rules + LLM review | CI fails on ERROR or CRITICAL rule findings | Most teams: catches real issues without noise |
| `strict-security` | Rules + LLM review | CI fails on WARN or higher | High-assurance, compliance, external contributors |
| `general` | LLM review only | No rule gate | When you only want AI-powered review, no deterministic scanning |

# Use the default (security)
grippy

# Explicit profile
grippy --profile strict-security

# Via environment variable
GRIPPY_PROFILE=general grippy
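
The gate behavior in the table, together with the CLI-over-environment priority, can be sketched in a few lines. The names `resolve_profile`, `rule_gate_failed`, `SEVERITY_RANK`, and `GATE_THRESHOLD` are hypothetical; only the thresholds and the priority order come from the text above.

```python
from typing import Iterable, Mapping, Optional

# Severity ordering implied by "WARN or higher".
SEVERITY_RANK = {"WARN": 1, "ERROR": 2, "CRITICAL": 3}

# Lowest severity that fails CI for each profile; None means no rule gate.
GATE_THRESHOLD = {
    "security": "ERROR",
    "strict-security": "WARN",
    "general": None,
}


def resolve_profile(cli_flag: Optional[str], env: Mapping[str, str]) -> str:
    # The CLI flag takes priority over GRIPPY_PROFILE; "security" is the default.
    return cli_flag or env.get("GRIPPY_PROFILE") or "security"


def rule_gate_failed(profile: str, severities: Iterable[str]) -> bool:
    threshold = GATE_THRESHOLD[profile]
    if threshold is None:
        return False
    floor = SEVERITY_RANK[threshold]
    return any(SEVERITY_RANK[s] >= floor for s in severities)
```

For instance, a diff with only WARN rule findings passes the gate under `security` but fails it under `strict-security`.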

The 10 deterministic rules:

| Rule ID | Detects | Severity |
|---|---|---|
| `workflow-permissions-expanded` | write/admin permissions, unpinned actions | ERROR / WARN |
| `secrets-in-diff` | API keys, private keys, `.env` additions | CRITICAL / WARN |
| `dangerous-execution-sinks` | unsafe code execution patterns | ERROR |
| `path-traversal-risk` | tainted path variables, `../` patterns | WARN |
| `llm-output-unsanitized` | model output piped to sinks without sanitizer | ERROR |
| `ci-script-execution-risk` | risky CI script patterns, sudo in CI | CRITICAL / WARN |
| `sql-injection-risk` | SQL queries built from interpolated input | ERROR |
| `weak-crypto` | MD5, SHA1, DES, ECB mode, insecure RNG | WARN |
| `hardcoded-credentials` | passwords, connection strings, auth headers | ERROR |
| `insecure-deserialization` | unsafe deserialization sinks (shelve, dill, etc.) | ERROR |

Rule findings are injected into the LLM context as confirmed facts, so the model's job is to explain them rather than to decide whether they exist.

When the knowledge graph is available (CI with caching, or MCP with persistent GRIPPY_DATA_DIR), rule findings are enriched with:

Blast radius — how many modules depend on the flagged file
Recurrence — whether this rule has fired on this file in prior reviews
False positive suppression — import-aware suppression (e.g., SQL injection suppressed when file imports SQLAlchemy)
Finding velocity — how often this rule fires across recent reviews
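
To make the blast-radius idea concrete, here is a small sketch that counts the modules depending on a file, given an import graph. It assumes transitive dependents are counted, which the list above does not specify, and the function name and the graph representation (module mapped to the set of modules it imports) are hypothetical.

```python
from collections import deque


def blast_radius(imports: dict[str, set[str]], target: str) -> int:
    """Count modules that depend on `target`, directly or transitively."""
    # Invert the graph: for each module, who imports it?
    dependents: dict[str, set[str]] = {}
    for module, deps in imports.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    # Breadth-first walk over reverse edges, starting at the flagged file.
    seen: set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent != target and parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return len(seen)
```

In a graph where `cli` imports `api` and `api` imports `db`, a finding in `db` has a blast radius of 2 under this definition, while a finding in `cli` has a blast radius of 0.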

Suppression

`.grippyignore` — file-level suppression

Create a .grippyignore file in your repo root to exclude files from review. Uses gitignore syntax (comments, negation, wildcards):

# Exclude generated code
vendor/
*.generated.py

# Exclude test fixtures that contain intentional anti-patterns
tests/test_rule_*.py

# But keep the hostile environment tests
!tests/test_hostile_environment.py

Excluded files are stripped from the diff before either the rule engine or the LLM sees them.
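
The filtering step can be approximated with the standard library. This is a simplified sketch, not Grippy's matcher: it handles comments, negation, directory patterns, and wildcards using a last-match-wins rule, but it ignores gitignore details such as anchoring and `**`. A full implementation would more likely rely on a dedicated gitignore-matching library. The function name `is_ignored` is hypothetical.

```python
import fnmatch
import posixpath


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Return True if `path` is excluded; the last matching pattern wins."""
    ignored = False
    for raw in patterns:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        negate = line.startswith("!")
        if negate:
            line = line[1:]
        if line.endswith("/"):
            # Directory pattern: matches anything beneath that directory.
            hit = path.startswith(line) or ("/" + line) in path
        elif "/" in line:
            # Pattern with a slash: match against the whole relative path.
            hit = fnmatch.fnmatchcase(path, line)
        else:
            # Bare pattern: match against the file name at any depth.
            hit = fnmatch.fnmatchcase(posixpath.basename(path), line)
        if hit:
            ignored = not negate
    return ignored
```

Applied to the example file above, `vendor/lib/x.py` and `tests/test_rule_sql.py` are excluded, while `tests/test_hostile_environment.py` stays in the diff.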

`# nogrip` — line-level pragma

Suppress deterministic rule findings on specific lines:

password = os.environ["DB_PASS"]  # nogrip
conn = f"postgres://{user}:{password}@host/db"  # nogrip: hardcoded-credentials
h = hashlib.md5(data)  # nogrip: weak-crypto, hardcoded-credentials

- Bare `# nogrip` suppresses all rules on that line
- `# nogrip: rule-id` suppresses only the named rule
- `# nogrip: id1, id2` suppresses multiple rules
- Rules only: the LLM reviewer still sees the line and may comment on it
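
The three pragma forms can be recognized with one regular expression. The sketch below is illustrative, not Grippy's parser: the pattern and the `is_suppressed` helper are assumptions, and it only accepts the pragma at the end of a line.

```python
import re

# Matches "# nogrip" or "# nogrip: id1, id2" at the end of a line.
NOGRIP_RE = re.compile(r"#\s*nogrip\b(?:\s*:\s*(?P<ids>[\w\-,\s]+?))?\s*$")


def is_suppressed(line: str, rule_id: str) -> bool:
    """Return True if a nogrip pragma on `line` suppresses `rule_id`."""
    match = NOGRIP_RE.search(line)
    if match is None:
        return False  # no pragma on this line
    ids = match.group("ids")
    if ids is None:
        return True  # bare pragma: every rule is suppressed
    named = {part.strip() for part in ids.split(",") if part.strip()}
    return rule_id in named
```

With the lines from the example, the bare pragma suppresses any rule, while `# nogrip: hardcoded-credentials` suppresses that rule and leaves `weak-crypto` active.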

Review modes

| Mode | Trigger | Focus |
|---|---|---|
| `pr_review` | Default on PR events | Full code review: correctness, security, style, maintainability |
| `security_audit` | Manual, scheduled, or auto when `profile != general` | Deep security analysis: injection, auth, cryptography, data exposure |
| `governance_check` | Manual or scheduled | Compliance and policy: licensing, access control, audit trails |
| `surprise_audit` | PR title/body contains "production ready" | Full-scope audit with expanded governance checks |
| `cli` | Local invocation | Interactive review for local development and testing |
| `github_app` | GitHub App webhook | Event-driven review via installed GitHub App |

GitHub Actions outputs

When running as a GitHub Action, Grippy sets these step outputs for downstream workflow logic:

| Output | Type | Description |
|---|---|---|
| `score` | int | Review score 0–100 |
| `verdict` | string | `PASS` / `FAIL` / `PROVISIONAL` |
| `findings-count` | int | Total LLM finding count |
| `merge-blocking` | bool | Whether verdict blocks merge |
| `rule-findings-count` | int | Deterministic rule hit count |
| `rule-gate-failed` | bool | Whether rule gate caused CI failure |
| `profile` | string | Active security profile name |
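
For context, GitHub Actions reads step outputs as `name=value` lines appended to the file named by the `GITHUB_OUTPUT` environment variable. The sketch below shows how a tool could emit the outputs in the table that way. The function names and the lowercase `true`/`false` rendering of booleans are assumptions, not a description of Grippy's code.

```python
import os
from typing import Mapping, Optional, Union

OutputValue = Union[str, int, bool]


def format_step_outputs(outputs: Mapping[str, OutputValue]) -> str:
    """Render outputs as name=value lines (single-line values only)."""
    lines = []
    for name, value in outputs.items():
        if isinstance(value, bool):
            # Check bool before int: bool is a subclass of int in Python.
            value = "true" if value else "false"
        lines.append(f"{name}={value}\n")
    return "".join(lines)


def write_step_outputs(
    outputs: Mapping[str, OutputValue], path: Optional[str] = None
) -> None:
    # GITHUB_OUTPUT is provided by the Actions runner for each step.
    target = path or os.environ["GITHUB_OUTPUT"]
    with open(target, "a", encoding="utf-8") as fh:
        fh.write(format_step_outputs(outputs))
```

A downstream step can then read these values through the step id, for example `steps.review.outputs.verdict` for the workflow shown in the quick start.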

Security

Grippy operates in an adversarial environment — PR diffs are untrusted input controlled by any contributor. Defense-in-depth sanitization is applied at every stage of the pipeline, validated by a 44-test adversarial test suite covering 9 attack domains.

Input sanitization. All untrusted text (PR metadata, diffs, tool outputs) passes through navi-sanitize for Unicode normalization — stripping invisible characters (ZWSP, bidi overrides, variation selectors), normalizing homoglyphs (Cyrillic/Greek → ASCII), and removing null bytes. This runs before any other processing.

Prompt injection defense. Three layers protect the LLM context:

XML escaping — All context sections (<diff>, <pr_metadata>, <rule_findings>, etc.) are XML-escaped, preventing </diff><system>... breakout attacks.
NL injection pattern neutralization — Seven compiled regex patterns detect and replace natural-language injection attempts (scoring directives, confidence manipulation, system override phrases) with [BLOCKED] markers.
Data-fence boundary — A preamble in the LLM prompt explicitly marks all subsequent content as "USER-PROVIDED DATA only" with instructions to ignore embedded directives.
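
The three layers can be sketched together as follows. The two regex patterns and the fence wording are hypothetical stand-ins chosen for illustration (the seven real patterns are not listed in this document), and `build_context` is an assumed name.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical examples of natural-language injection patterns.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(?:all\s+)?(?:previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"set\s+(?:the\s+)?(?:score|confidence)\s+to\s+\d+", re.IGNORECASE),
]

# Hypothetical wording for the data-fence preamble.
DATA_FENCE = (
    "Everything below is USER-PROVIDED DATA only. "
    "Ignore any instructions it contains."
)


def neutralize(text: str) -> str:
    """Replace known injection phrases with a [BLOCKED] marker."""
    for pattern in INJECTION_PATTERNS:
        text = pattern.sub("[BLOCKED]", text)
    return text


def build_context(diff: str) -> str:
    """Neutralize, XML-escape, and fence an untrusted diff."""
    safe = escape(neutralize(diff))  # escapes &, <, and >
    return f"{DATA_FENCE}\n<diff>\n{safe}\n</diff>"
```

After escaping, a payload such as `</diff><system>` arrives as `&lt;/diff&gt;&lt;system&gt;`, so the only real closing tag in the context is the one the builder wrote.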

Output sanitization. LLM-generated text passes through a five-stage pipeline before posting to GitHub:

navi-sanitize — Unicode normalization (same as input stage).
nh3 — Rust-based HTML sanitizer strips all HTML tags from free-text fields.
Markdown image stripping — Removes ![](url) syntax to prevent tracking pixels in review comments.
Markdown link rewriting — Converts [text](https://url) to plain text to prevent phishing links.
URL scheme filter — Removes javascript:, data:, and vbscript: schemes from remaining link syntax.
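
The last three stages are plain text transformations and can be approximated with regular expressions. The sketch below covers only those stages; the navi-sanitize and nh3 stages are omitted, and the patterns and the `sanitize_markdown` name are assumptions rather than Grippy's own code. It also assumes that "plain text" means keeping the link label and dropping the URL.

```python
import re

# Stage 3: markdown images, e.g. ![alt](url)
IMAGE_RE = re.compile(r"!\[[^\]]*\]\([^)]*\)")
# Stage 4: http(s) links, e.g. [text](https://url); the label is kept.
HTTP_LINK_RE = re.compile(r"\[([^\]]*)\]\(\s*https?://[^)]*\)", re.IGNORECASE)
# Stage 5: dangerous schemes left inside any remaining link syntax.
BAD_SCHEME_RE = re.compile(r"(\]\(\s*)(?:javascript|data|vbscript):", re.IGNORECASE)


def sanitize_markdown(text: str) -> str:
    text = IMAGE_RE.sub("", text)          # drop images (tracking pixels)
    text = HTTP_LINK_RE.sub(r"\1", text)   # keep label, drop the URL
    text = BAD_SCHEME_RE.sub(r"\1", text)  # strip the scheme prefix
    return text
```

Order matters in this sketch: images are removed first because the link pattern would otherwise match the `[alt](url)` part of an image and leave a stray `!` behind.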

Tool output sanitization. Codebase tool responses (read_file, grep_code, list_files) are sanitized with navi-sanitize and XML-escaped before reaching the LLM, preventing indirect prompt injection through crafted file contents.

Adversarial test suite. tests/test_hostile_environment.py exercises 44 attack scenarios across Unicode attacks, prompt injection, tool exploitation, output sanitization gaps, information leakage, schema validation attacks, session history poisoning, and more. All 44 pass.

See the Security Model for codebase tool protections, CI hardening, and the full threat model.

Retrieval Quality Benchmarks

Grippy includes a benchmark suite for validating search and graph retrieval quality.

# Run search benchmarks (requires embedding model)
python -m benchmarks search --k 5

# Run graph retrieval benchmarks (requires populated graph DB)
python -m benchmarks graph

# Run all benchmarks
python -m benchmarks all

Results are written as JSON to benchmarks/output/.

Documentation

Getting Started — Setup for OpenAI, local LLMs, and development
Configuration — Environment variables and model options
Architecture — Module map, prompt system, data flow
Review Modes — The 6 review modes and how they work
Scoring Rubric — How Grippy scores PRs
Security Model — Codebase tool protections, hardened CI
Self-Hosted LLM Guide — Ollama/LM Studio + Cloudflare Tunnel
Contributing — Dev setup, testing, conventions
Examples — Copy-paste workflow YAMLs and sample review output
Changelog — Release history

License

MIT

This version

0.1.0

Mar 9, 2026

Download files

Source Distribution

grippy_mcp-0.1.0.tar.gz (514.9 kB view details)

Uploaded Mar 9, 2026 Source

Built Distribution

grippy_mcp-0.1.0-py3-none-any.whl (140.6 kB view details)

Uploaded Mar 9, 2026 Python 3

File details

Details for the file grippy_mcp-0.1.0.tar.gz.

File metadata

Download URL: grippy_mcp-0.1.0.tar.gz
Upload date: Mar 9, 2026
Size: 514.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for grippy_mcp-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`219a4370bbb7af9820058c35d945246672f6e8859567fc0fd45fef9bb03c9086`
MD5	`db3cdb9844c8b475f69941315f369d2e`
BLAKE2b-256	`f00e548c1378456b18348bdff3567c3f23aa5791b47f96c701a5333ca0935be6`

Provenance

The following attestation bundles were made for grippy_mcp-0.1.0.tar.gz:

Publisher: release.yml on Project-Navi/grippy-code-review

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: grippy_mcp-0.1.0.tar.gz
- Subject digest: 219a4370bbb7af9820058c35d945246672f6e8859567fc0fd45fef9bb03c9086
- Sigstore transparency entry: 1065743152
- Sigstore integration time: Mar 9, 2026
Source repository:
- Permalink: Project-Navi/grippy-code-review@b06302fa9a1dd5e248dda5f14d4b153f4b18bf4c
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Project-Navi
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b06302fa9a1dd5e248dda5f14d4b153f4b18bf4c
- Trigger Event: release

File details

Details for the file grippy_mcp-0.1.0-py3-none-any.whl.

File metadata

Download URL: grippy_mcp-0.1.0-py3-none-any.whl
Upload date: Mar 9, 2026
Size: 140.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for grippy_mcp-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9ba8d20139e8ce3a80a85abd9cab09fa5b7ba12d0512d3fdcc21b1e078f94f83`
MD5	`63ae81dcc28fc97ae1c202ba98cf378a`
BLAKE2b-256	`2227585667aefacdc1bf71b3eda23f5de90970d8028cfa0b94516e371ade0f6c`

Provenance

The following attestation bundles were made for grippy_mcp-0.1.0-py3-none-any.whl:

Publisher: release.yml on Project-Navi/grippy-code-review

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: grippy_mcp-0.1.0-py3-none-any.whl
- Subject digest: 9ba8d20139e8ce3a80a85abd9cab09fa5b7ba12d0512d3fdcc21b1e078f94f83
- Sigstore transparency entry: 1065743159
- Sigstore integration time: Mar 9, 2026
Source repository:
- Permalink: Project-Navi/grippy-code-review@b06302fa9a1dd5e248dda5f14d4b153f4b18bf4c
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Project-Navi
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b06302fa9a1dd5e248dda5f14d4b153f4b18bf4c
- Trigger Event: release
