Pre-ship risk critic (CLI + Python library) — surfaces breaking risk scenarios before they reach production

Gremlin

Pre-ship risk critic — surfaces what could break before it reaches production

Feed Gremlin a feature spec, a PR diff, or a plain-English description, and it critiques the input for blind spots using 107 curated "what if?" patterns across 14 domains, applied by Claude.

pip install gremlin-critic
gremlin review "checkout flow with Stripe"
🔴 CRITICAL (95%) — Webhook Race Condition
   What if the Stripe webhook arrives before the order record is committed?
   Impact: Payment captured but order not created.

🟠 HIGH (87%) — Double Submit on Payment Button
   What if the user clicks "Pay Now" twice rapidly?
   Impact: Potential duplicate charges.

Three ways to use it

1. CLI

# Review a feature
gremlin review "checkout flow"

# With context (diff, file, or string)
git diff | gremlin review "my changes" --context -
gremlin review "auth system" --context @src/auth/login.py

# Deep analysis, lower confidence threshold
gremlin review "payment refunds" --depth deep --threshold 60

# Learn from incidents
gremlin learn "Nav showed Login after auth" --domain auth --source prod

2. GitHub Action

Add to any repo — Gremlin posts a risk report on every PR automatically.

# .github/workflows/gremlin-review.yml
name: Gremlin Risk Review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install gremlin-critic
      - run: git diff origin/${{ github.base_ref }}...HEAD > /tmp/pr-diff.txt
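      - run: |
          python3 .github/scripts/gremlin_analyze.py \
            "$PR_TITLE" \
            /tmp/pr-diff.txt /tmp/gremlin-report.json
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          # Pass the title via env so untrusted PR text is never interpolated into the shell script
          PR_TITLE: ${{ github.event.pull_request.title }}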
      - uses: actions/github-script@v7
        with:
          script: |
            const data = JSON.parse(require('fs').readFileSync('/tmp/gremlin-report.json','utf8'));
            const risks = data.risks || [];
            const s = data.summary || {};
            const body = risks.length === 0
              ? '## Gremlin Risk Review\n\nNo risks above threshold.'
              : `## Gremlin Risk Review\n\n**${risks.length} risk(s)** — 🔴 ${s.critical||0} critical · 🟠 ${s.high||0} high · 🟡 ${s.medium||0} medium\n\n` +
                risks.map(r => `### ${r.severity}: ${r.title||r.scenario}\n**Confidence:** ${r.confidence}%\n\n${r.impact}`).join('\n\n---\n\n');
            await github.rest.issues.createComment({issue_number: context.issue.number, owner: context.repo.owner, repo: context.repo.repo, body});

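Set ANTHROPIC_API_KEY as a repository secret (Settings → Secrets → Actions). The workflow calls .github/scripts/gremlin_analyze.py, so that script must exist in your repository; see the full script used in this repo.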
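
The workflow invokes a helper script with three positional arguments: the PR title, the diff path, and the report path. The repository's actual script is not reproduced here. What follows is only a minimal sketch of what such a script might look like, built from the Gremlin().analyze(...) and result.to_json() calls shown in the Python API section. The names write_report and analyzer are hypothetical, and it is an assumption that to_json() produces the risks and summary keys that the comment step reads.

```python
import sys
from pathlib import Path


def write_report(title, diff_path, report_path, analyzer=None):
    """Analyze a PR title plus its diff and write the JSON report to disk."""
    if analyzer is None:
        # Imported lazily so the helper can be exercised without the package.
        from gremlin import Gremlin
        analyzer = Gremlin()
    # Diffs can contain odd bytes; replace anything that is not valid UTF-8.
    diff = Path(diff_path).read_text(encoding="utf-8", errors="replace")
    result = analyzer.analyze(title, context=diff)
    # Assumption: to_json() returns a JSON string, as the Python API section states.
    Path(report_path).write_text(result.to_json(), encoding="utf-8")
    return result


# Only act as a script when called with exactly three arguments:
#   gremlin_analyze.py TITLE DIFF_PATH REPORT_PATH
if __name__ == "__main__" and len(sys.argv) == 4:
    write_report(sys.argv[1], sys.argv[2], sys.argv[3])
```

Accepting an analyzer argument keeps the file handling testable with a stub, without an API key.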

3. Python API

import sys

from gremlin import Gremlin

g = Gremlin()
result = g.analyze("checkout flow", context="Using Stripe + Next.js")

# Check severity
if result.has_critical_risks():
    print(f"{result.critical_count} critical risks found")

# Output formats
result.to_json()         # JSON string
result.to_junit()        # JUnit XML for CI
result.format_for_llm()  # Concise format for agents

# Async (await only works inside an async function, e.g. one run via asyncio.run)
result = await g.analyze_async("payment processing")

# Block CI on critical risks
if result.has_critical_risks():
    sys.exit(1)
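
The async call and the CI gate can be combined to review several scopes in one run. This is a rough sketch, not part of the library: review_all and exit_code are hypothetical helpers, and it assumes that a single Gremlin instance tolerates concurrent analyze_async calls, which the source does not confirm.

```python
import asyncio


async def review_all(analyzer, scopes):
    """Run analyze_async for every scope concurrently; return scope -> result."""
    results = await asyncio.gather(*(analyzer.analyze_async(s) for s in scopes))
    return dict(zip(scopes, results))


def exit_code(results):
    """Return 1 if any result reports critical risks, otherwise 0."""
    return 1 if any(r.has_critical_risks() for r in results.values()) else 0


# Usage (needs gremlin-critic installed and an API key configured):
#   from gremlin import Gremlin
#   results = asyncio.run(review_all(Gremlin(), ["checkout flow", "payment refunds"]))
#   raise SystemExit(exit_code(results))
```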

Risk Dashboard

Live visualization of Gremlin results applied to open-source projects — abhi10.github.io/gremlin

  • Heatmap · severity donut · domain bar chart · filterable risk table
  • Applied to celery, pydantic, and more

Pattern Domains

107 patterns across 14 domains. Universal patterns run on every analysis; domain patterns are triggered by keyword match:

Domain          Keywords
payments        checkout, stripe, billing, refund
auth            login, session, token, oauth
database        query, migration, transaction
concurrency     async, queue, race, lock
infrastructure  deploy, config, cert, secret
file_upload     upload, image, file, cdn
api             endpoint, rate limit, webhook
+ 7 more        ...
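
To illustrate how keyword-triggered domains could work, here is a minimal sketch using the seven domains listed in the table. Gremlin's real matching logic is not described in this document, so the case-insensitive substring match below is an assumption, and match_domains is a hypothetical name.

```python
# Keywords copied from the table; the remaining seven domains are not listed.
DOMAIN_KEYWORDS = {
    "payments": ["checkout", "stripe", "billing", "refund"],
    "auth": ["login", "session", "token", "oauth"],
    "database": ["query", "migration", "transaction"],
    "concurrency": ["async", "queue", "race", "lock"],
    "infrastructure": ["deploy", "config", "cert", "secret"],
    "file_upload": ["upload", "image", "file", "cdn"],
    "api": ["endpoint", "rate limit", "webhook"],
}


def match_domains(scope):
    """Return the domains whose keywords appear in the scope text."""
    text = scope.lower()
    return [
        domain
        for domain, keywords in DOMAIN_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


print(match_domains("checkout flow with Stripe"))   # ['payments']
print(match_domains("Stripe webhook retry queue"))  # ['payments', 'concurrency', 'api']
```

Plain substring matching over-triggers (for example, "block" contains "lock"), so a real implementation would more likely match on word boundaries.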

Custom patterns

# .gremlin/patterns.yaml — auto-loaded per project
domain_specific:
  image_processing:
    keywords: [image, resize, cdn]
    patterns:
      - "What if EXIF rotation is ignored during resize?"
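
If you prefer to generate this file rather than write it by hand, the layout above can be scaffolded with the standard library alone. This is a sketch under assumptions: render_patterns_yaml and write_patterns_file are hypothetical helpers, not part of Gremlin, and the only schema assumed is the one shown in the example above.

```python
import json
from pathlib import Path


def render_patterns_yaml(domain, keywords, patterns):
    """Render one custom domain in the layout shown above."""
    lines = [
        "domain_specific:",
        f"  {domain}:",
        # A JSON list is also a valid YAML flow sequence.
        f"    keywords: {json.dumps(list(keywords))}",
        "    patterns:",
    ]
    # json.dumps yields double-quoted strings, which are valid YAML scalars.
    lines += [f"      - {json.dumps(p)}" for p in patterns]
    return "\n".join(lines) + "\n"


def write_patterns_file(project_root, domain, keywords, patterns):
    """Write .gremlin/patterns.yaml under project_root (overwrites any existing file)."""
    path = Path(project_root) / ".gremlin" / "patterns.yaml"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_patterns_yaml(domain, keywords, patterns), encoding="utf-8")
    return path
```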

Performance

Across 54 real-world test cases, Gremlin tied with baseline Claude Sonnet 90.7% of the time: the patterns match raw LLM quality while adding domain-specific coverage.

Metric          Result
Win / Tie Rate  98.1% (7.4% wins + 90.7% ties)
Gremlin Wins    7.4% (patterns caught risks Claude missed)
Pattern Count   107 across 14 domains
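
As a quick sanity check on these figures, the percentages are consistent with 4 wins and 49 ties out of 54 cases. Those raw counts are inferred from the rounding and are not stated in the source, so treat them as an assumption.

```python
CASES = 54
WINS, TIES = 4, 49  # inferred from the published percentages, not stated in the source


def pct(count, total=CASES):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


print(pct(WINS), pct(TIES), pct(WINS + TIES))  # 7.4 90.7 98.1
```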

Installation

pip install gremlin-critic
export ANTHROPIC_API_KEY=sk-ant-...

Supports: Anthropic (default) · OpenAI · Ollama (local, no API key needed)

g = Gremlin(provider="ollama", model="llama3")  # fully local
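
One way to pick a provider at runtime is to fall back to the local Ollama setup when no Anthropic key is present. This is a sketch, not library behaviour: gremlin_kwargs is a hypothetical helper, and it assumes an Ollama server with the llama3 model is already available locally.

```python
import os


def gremlin_kwargs(env=None):
    """Choose constructor arguments for Gremlin based on the environment."""
    env = os.environ if env is None else env
    if env.get("ANTHROPIC_API_KEY"):
        return {}  # no arguments: use the default provider (Anthropic)
    # No key found: use the fully local configuration shown above.
    return {"provider": "ollama", "model": "llama3"}


# Usage (needs gremlin-critic installed):
#   from gremlin import Gremlin
#   g = Gremlin(**gremlin_kwargs())
```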

For development:

git clone https://github.com/abhi10/gremlin.git
cd gremlin
pip install -e ".[dev]"
pytest

Commands

Command                                          Description
gremlin review "scope"                           Analyze a feature for risks
gremlin review "scope" --context @file           With file context
git diff | gremlin review "changes" --context -  With diff via stdin
gremlin patterns list                            Show all pattern domains
gremlin patterns show payments                   Show patterns for a domain
gremlin learn "incident" --domain auth           Learn from incidents

review options: --depth quick|deep · --threshold 0-100 · --output rich|md|json · --validate
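
For scripts that drive the CLI rather than the Python API, the review options above can be assembled programmatically. This is a minimal sketch: build_review_command and run_review are hypothetical helpers, only flags listed in this document are used, and nothing is assumed about the CLI's exit codes or its JSON schema.

```python
import subprocess


def build_review_command(scope, depth=None, threshold=None, context=None, output="json"):
    """Build the argument list for `gremlin review` from the documented options."""
    cmd = ["gremlin", "review", scope]
    if depth is not None:
        if depth not in ("quick", "deep"):
            raise ValueError("depth must be 'quick' or 'deep'")
        cmd += ["--depth", depth]
    if threshold is not None:
        if not 0 <= threshold <= 100:
            raise ValueError("threshold must be between 0 and 100")
        cmd += ["--threshold", str(threshold)]
    if context is not None:
        cmd += ["--context", context]  # "@path" for a file, "-" for stdin
    cmd += ["--output", output]
    return cmd


def run_review(scope, **options):
    """Run the CLI and return (returncode, stdout) without interpreting either."""
    done = subprocess.run(
        build_review_command(scope, **options),
        capture_output=True,
        text=True,
        check=False,
    )
    return done.returncode, done.stdout
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the scope contains spaces or quotes.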


License

MIT · Powered by Claude · Inspired by exploratory testing principles from James Bach and James Whittaker
