VulnHawk

AI-powered code security scanner that finds vulnerabilities
Semgrep and CodeQL miss.

VulnHawk uses AI to understand your code's business logic - not just pattern matching.
It spots missing auth checks, IDOR flaws, and logic bugs that rule-based tools can't detect.


What Makes VulnHawk Different

Traditional scanners (Semgrep, CodeQL, Bandit) use pattern matching and AST rules. They're great at finding known patterns, but they can't understand intent.

VulnHawk analyzes your code with AI and cross-references how different parts of your codebase handle security. If 12 endpoints check authorization but one doesn't, VulnHawk catches it.
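To make the "one endpoint that doesn't" case concrete, here is a minimal Python sketch of that kind of flaw. Everything in it (the `Invoice` type, the in-memory store, both handler names) is hypothetical and exists only for illustration. Both handlers are syntactically unremarkable, which is why a fixed pattern has little to match on; the problem only shows up when the two are compared.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    id: int
    owner_id: int
    total: float


# Toy in-memory store standing in for a database (hypothetical data).
INVOICES = {
    1: Invoice(id=1, owner_id=100, total=49.0),
    2: Invoice(id=2, owner_id=200, total=99.0),
}


def get_invoice(current_user_id: int, invoice_id: int) -> Invoice:
    """Typical handler: verifies ownership before returning the record."""
    invoice = INVOICES[invoice_id]
    if invoice.owner_id != current_user_id:
        raise PermissionError("not your invoice")
    return invoice


def export_invoice(current_user_id: int, invoice_id: int) -> Invoice:
    """Outlier handler: same lookup, but the ownership check is missing (IDOR)."""
    return INVOICES[invoice_id]
```

With this store, user 100 is refused by `get_invoice(100, 2)` but can read user 200's invoice through `export_invoice(100, 2)`.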

| Feature | Semgrep / CodeQL | VulnHawk |
|---|---|---|
| Detection method | AST pattern matching | AI code understanding |
| Business logic bugs | Cannot detect | Detects missing auth, IDOR, logic flaws |
| Cross-file analysis | Requires custom rules | Automatic - compares similar code patterns |
| Setup | Write rules, configure | Zero config - works immediately |
| Finding descriptions | Rule IDs and templates | Natural language with attack scenarios |
| Fix suggestions | Generic recommendations | Context-specific code fixes |

Quick Start

pip install vulnhawk

Set your LLM API key:

export ANTHROPIC_API_KEY=sk-ant-...    # Claude (default)
# or
export OPENAI_API_KEY=sk-...           # OpenAI
# or just run Ollama locally           # Free, no API key needed

Scan your code:

vulnhawk scan ./src

That's it. No config files and no rules to write.

Usage

Basic scan

vulnhawk scan ./src

Focused scanning

# Only check authentication and authorization
vulnhawk scan ./src --mode auth

# Only check for injection vulnerabilities
vulnhawk scan ./api --mode injection

# Only look for hardcoded secrets
vulnhawk scan . --mode secrets

Output formats

# JSON output
vulnhawk scan ./src -o json -f results.json

# SARIF for GitHub Code Scanning
vulnhawk scan ./src -o sarif -f results.sarif

# Markdown report
vulnhawk scan ./src -o markdown -f report.md
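If you want to post-process the SARIF report yourself, a short Python sketch like the one below can tally results by level. It assumes VulnHawk's output follows the standard SARIF 2.1.0 layout (`runs[].results[].level`); the function name `summarize_sarif` is made up for this example and is not part of VulnHawk.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_sarif(path: str) -> Counter:
    """Count results per SARIF level across all runs in the file."""
    sarif = json.loads(Path(path).read_text(encoding="utf-8"))
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # SARIF treats an absent level as "warning" unless the rule's
            # default configuration says otherwise; this sketch ignores
            # rule defaults and simply falls back to "warning".
            counts[result.get("level", "warning")] += 1
    return counts
```

For example, `summarize_sarif("results.sarif")` returns a `Counter` keyed by level names such as `"error"` or `"warning"`, which a CI script could use to decide whether to fail a build.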

Different LLM backends

# Claude (default, best results)
vulnhawk scan ./src -b claude

# OpenAI
vulnhawk scan ./src -b openai -m gpt-4o

# Ollama (free, local, private)
vulnhawk scan ./src -b ollama -m llama3.1

Filter by severity

# Only critical and high
vulnhawk scan ./src --severity high

# Everything including info
vulnhawk scan ./src --severity info
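The comments above suggest that `--severity` acts as a minimum threshold rather than an exact match. A rough Python sketch of that reading follows. The level ordering is an assumption (in particular, `low` is not named anywhere in this README), and the finding dictionaries with a `"severity"` key are a hypothetical shape, not VulnHawk's actual data model.

```python
# Assumed ordering, most severe first. "low" is an assumption.
SEVERITY_ORDER = ["critical", "high", "medium", "low", "info"]


def at_or_above(findings: list[dict], threshold: str) -> list[dict]:
    """Keep findings whose severity is at least as severe as the threshold."""
    cutoff = SEVERITY_ORDER.index(threshold)
    return [
        finding
        for finding in findings
        if SEVERITY_ORDER.index(finding["severity"]) <= cutoff
    ]
```

Under this reading, a threshold of `high` keeps only critical and high findings, while `info` keeps everything.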

Preview what will be scanned

vulnhawk info ./src

GitHub Action

Add VulnHawk to your CI/CD pipeline:

name: Security Scan
on: [pull_request]

permissions:
  security-events: write

jobs:
  vulnhawk:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: momenbasel/vulnhawk@main
        with:
          target: '.'
          api-key: ${{ secrets.ANTHROPIC_API_KEY }}
          severity: 'medium'
          fail-on-findings: 'true'

Results automatically appear in GitHub's Security tab via SARIF upload.

Scan Modes

| Mode | What it checks |
|---|---|
| full | Everything (default) |
| auth | Authentication bypass, missing auth checks, session flaws, JWT issues |
| injection | SQLi, command injection, SSTI, NoSQL injection, XSS |
| secrets | Hardcoded API keys, passwords, tokens, connection strings |
| config | Debug mode, verbose errors, permissive CORS, insecure cookies |
| crypto | Weak hashing, hardcoded keys, insecure random, deprecated algorithms |

Supported Languages

  • Python
  • JavaScript / TypeScript
  • Go
  • More coming soon (Java, Ruby, PHP, Rust)

How It Works

  1. Discover - Walks your codebase, respects .gitignore and .vulnhawkignore
  2. Chunk - Splits code into logical pieces (functions, classes, routes) with surrounding context
  3. Enrich - For each chunk, includes import context and related code from elsewhere in the codebase (this is the key differentiator - it shows the AI how other parts handle auth, validation, etc.)
  4. Analyze - Sends enriched chunks to the LLM with security-focused analysis prompts
  5. Validate - Cross-references findings, removes duplicates, assigns confidence scores
  6. Report - Formats results with code snippets, attack scenarios, and fix suggestions
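As a rough idea of what the Chunk step could look like for Python files, the sketch below uses the standard library's `ast` module to split a source string into functions and classes. This is an illustration under assumptions, not VulnHawk's actual chunker: the function name and the dictionary layout are invented here, and it leaves out the surrounding context that step 2 mentions.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Split Python source into function and class chunks with line ranges."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(
                {
                    "name": node.name,
                    "kind": type(node).__name__,
                    # lineno points at the def/class line, so decorators
                    # above it are not part of the chunk in this sketch.
                    "start_line": node.lineno,
                    "end_line": node.end_lineno,
                    "code": ast.get_source_segment(source, node),
                }
            )
    # ast.walk does not promise source order, so sort explicitly.
    chunks.sort(key=lambda chunk: chunk["start_line"])
    return chunks
```

Methods appear both inside their class chunk and as chunks of their own, which is one possible way to give the model both the narrow and the wide view.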

The enrichment step is what makes VulnHawk fundamentally different. By showing the AI how similar endpoints in your codebase handle security, it can spot the one that doesn't.
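The enrichment idea can be pictured as plain prompt assembly: the chunk under review is sent together with comparable code from the same codebase. The Python sketch below is hypothetical throughout. The function name, the section labels, and the instruction wording are all assumptions, and VulnHawk's real prompts are not documented here.

```python
def build_enriched_prompt(target: str, peers: list[str], imports: list[str]) -> str:
    """Assemble one prompt holding the target chunk plus peer code for comparison."""
    sections = [
        "Review the TARGET for security flaws. Compare it with the PEERS, "
        "which come from the same codebase, and flag any check the peers "
        "perform that the target omits.",
        "## IMPORTS\n" + "\n".join(imports),
    ]
    # Number the peers so a finding can refer back to a specific one.
    for index, peer in enumerate(peers, start=1):
        sections.append(f"## PEER {index}\n{peer}")
    # The target goes last so it is the final thing the model reads.
    sections.append("## TARGET\n" + target)
    return "\n\n".join(sections)
```

Placing sibling handlers next to the target is what lets a model notice that an authorization check present in the peers is absent from the target.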

Configuration

.vulnhawkignore

Create a .vulnhawkignore file to exclude paths (same syntax as .gitignore):

# Skip generated code
generated/
*.gen.go

# Skip vendor dependencies
vendor/
third_party/
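To show how patterns like these are typically interpreted, here is a deliberately simplified Python sketch built on the standard library's `fnmatch`. It handles only the two pattern shapes used above (a directory name with a trailing slash, and a glob on a file or directory name) and skips the rest of the `.gitignore` syntax, such as negation with `!`, anchored patterns, and `**`. The helper names are hypothetical, and VulnHawk's own matcher may behave differently.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def load_patterns(text: str) -> list[str]:
    """Return the non-empty, non-comment lines of an ignore file."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Check a relative file path against simplified ignore patterns."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: matches when any parent directory has that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif any(fnmatch(part, pattern) for part in parts):
            # Glob pattern: matches any single path component.
            return True
    return False
```

With the patterns from the example file, `api/types.gen.go` and `pkg/vendor/lib/x.go` are ignored while `api/handlers.go` is not.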

Environment Variables

| Variable | Description |
|---|---|
| ANTHROPIC_API_KEY | API key for Claude backend |
| OPENAI_API_KEY | API key for OpenAI backend |
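Before kicking off a scan from a script, it can help to check which hosted backends have a key available. The Python sketch below does only that. `configured_backends` is a hypothetical helper, not part of VulnHawk; the backend names mirror the `-b` values shown earlier, and Ollama is left out because it needs no key.

```python
import os
from typing import Mapping, Optional


def configured_backends(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """List hosted backends whose API key is set to a non-empty value."""
    env = os.environ if env is None else env
    backends = []
    if env.get("ANTHROPIC_API_KEY"):
        backends.append("claude")
    if env.get("OPENAI_API_KEY"):
        backends.append("openai")
    return backends
```

An empty list means neither key is set, in which case the Ollama backend is the remaining option.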

FAQ

How much does it cost to run?
It depends on codebase size and LLM backend. A typical scan of a medium project (~100 files) costs about $0.50-$2.00 with Claude. Use Ollama for free local scanning.

Will it find everything?
No. No security tool catches everything. VulnHawk is best at finding business logic bugs, missing authorization, and context-dependent vulnerabilities that pattern-matching tools miss. Use it alongside (not instead of) Semgrep/CodeQL.

Is my code sent to an external API?
Yes, with the Claude and OpenAI backends: code chunks are sent to the configured LLM provider (Anthropic or OpenAI). Use the Ollama backend for fully local, private scanning.

Does it support monorepos?
Yes. Point it at any directory and it will scan all supported files recursively.

Contributing

Contributions are welcome. See CONTRIBUTING.md for guidelines.

# Development setup
git clone https://github.com/momenbasel/vulnhawk.git
cd vulnhawk
uv venv .venv && source .venv/bin/activate
uv pip install -e ".[dev]"
pytest

License

MIT - see LICENSE


