A comprehensive security scanner and RAG-based vulnerability analyzer
CyberSec Scanner
A comprehensive, modular security scanning toolkit for detecting secrets, vulnerabilities, and misconfigurations in Git repositories, web applications, and browser extensions. Features multi-scanner architecture, RAG-powered analysis, and both SDK and CLI interfaces.
Use Responsibly: This tool is for authorized security testing only. Always obtain proper permission before scanning applications you don't own.
Table of Contents
- Features
- Architecture
- Installation
- Quick Start
- MITM Proxy Setup
- Configuration
- Usage
- Scanner Modules
- Output Format
- Advanced Usage
- Troubleshooting
Features
Multi-Scanner Architecture
- Git Scanner: Detect secrets in commit history using efficient pickaxe search
- Web Crawler: Discover exposed endpoints, analyze JavaScript files and source maps
- Browser Scanner: Inspect localStorage, sessionStorage, cookies via Playwright
- Network Scanner: Real-time HTTPS traffic inspection with MITM proxy
RAG-Powered Analysis
- Knowledge Graph: NetworkX-based relationship mapping between findings, files, and vulnerabilities
- Semantic Search: Vector-based retrieval for similar security patterns
- LLM Integration: Natural language queries powered by Ollama (Gemma, Llama, etc.)
- CWE Enrichment: Automatic mapping to Common Weakness Enumeration
Detection Coverage
- 58+ Built-in Patterns: AWS, OpenAI, Stripe, GitHub, Azure, Google Cloud, databases, and more
- Entropy Analysis: High-entropy string detection for unknown secrets
- Custom Patterns: Extensible regex-based pattern system via patterns.env
- Contextual Severity: Smart severity assignment based on exposure context
Flexible Usage
- CLI Application: Full-featured command-line interface with 10 commands
- Python SDK: Use scanners independently or together in your own code
- YAML Configuration: Simple config files replace long CLI arguments
- Modular Design: Import only what you need, lazy loading for optional dependencies
Installation
From PyPI (Recommended)
pip install cybersec-scanner
From Source
git clone https://github.com/AnubhavChoudhery/cybersec-scanner.git
cd cybersec-scanner
pip install -e .
Optional Dependencies
# For MITM proxy (HTTPS traffic inspection)
pip install cybersec-scanner[mitm]
# For browser runtime inspection (Playwright)
pip install cybersec-scanner[browser]
# For vector search (RAG features)
pip install cybersec-scanner[vector]
# Install everything
pip install cybersec-scanner[all]
# For development
pip install cybersec-scanner[dev]
Note: The base installation includes Git scanning, web crawling, and RAG analysis. MITM and browser features require additional dependencies.
System Requirements
- Python 3.11 or higher
- Git (for git history scanning)
- mitmproxy 10.0+ (for HTTPS inspection - optional)
- Playwright (for browser inspection - optional)
Quick Start
Prerequisites for RAG Queries
To use the query command with LLM-powered analysis, you need Ollama:
# Install Ollama (https://ollama.com)
# Linux/Mac:
curl -fsSL https://ollama.com/install.sh | sh
# Windows: Download from https://ollama.com
# Pull the default model
ollama pull gemma3:1b
# Start Ollama (keep running in background)
ollama serve
Complete Scan-to-Query Workflow
# 1. Install the scanner
pip install cybersec-scanner
# 2. Download patterns file (required for detection)
curl -o patterns.env https://raw.githubusercontent.com/AnubhavChoudhery/cybersec-scanner/main/patterns.env
# 3. Scan your project (Git history)
cybersec-scanner scan --git --root . --output audit_report.json --enable-rag
# 4. Query the findings
cybersec-scanner query "What secrets were found?" --audit audit_report.json
# 5. Save response to file
cybersec-scanner query "Summarize critical findings" --output summary.txt
CLI Usage
# Show all available commands
cybersec-scanner --help
# Initialize configuration file
cybersec-scanner init-config
# Scan a Git repository
cybersec-scanner scan-git /path/to/repo --max-commits 50
# Scan a web application
cybersec-scanner scan-web http://localhost:8000 --max-pages 100
# Scan MITM traffic logs
cybersec-scanner scan-mitm mitm_traffic.ndjson
# Full multi-scanner workflow
cybersec-scanner scan \
--git \
--web \
--mitm \
--runtime \
--root . \
--target http://localhost:8000 \
--max-commits 50 \
--mitm-traffic mitm_traffic.ndjson \
--output audit_report.json \
--enable-rag
# Query findings with RAG
cybersec-scanner query "What API keys were found?" --audit audit_report.json
# Build knowledge graph from existing report
cybersec-scanner build-graph audit_report.json
# Check version
cybersec-scanner version
MITM Proxy Workflow
The scanner provides interactive MITM inspection that captures real HTTP/HTTPS traffic. Traffic files are automatically shared between scanner and backend via a temp directory.
1. Add MITM injection to your backend (one-time setup):
# backend/app/main.py - MUST BE FIRST IMPORT
from cybersec_scanner.scanners.inject_mitm_proxy import inject_mitm_proxy_advanced
# No path needed - uses shared temp location automatically
inject_mitm_proxy_advanced()
# Now import your framework
from fastapi import FastAPI
# ... rest of your code
2. Run the scanner with MITM enabled:
# No --mitm-traffic flag needed - auto-discovers shared file
cybersec-scanner scan --mitm --output audit_report.json
3. Start your backend when prompted:
The scanner will start the MITM proxy and wait for you to start your backend and exercise the app.
# In another terminal
uvicorn backend.app.main:app --reload
4. Test your application - make requests, test endpoints
5. Press Ctrl+C in the scanner terminal when done
The scanner will parse all captured traffic and generate the audit report.
Traffic File Location:
- Windows: C:\Users\<user>\AppData\Local\Temp\cybersec_scanner\mitm_traffic.ndjson
- Linux/Mac: /tmp/cybersec_scanner/mitm_traffic.ndjson
Advanced: You can override with --mitm-traffic /custom/path.ndjson if needed.
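The two locations above are what Python's standard temp-directory lookup commonly returns on those platforms. As a minimal sketch, assuming the scanner derives the path from tempfile.gettempdir() (the function name default_traffic_path is hypothetical and not part of the package), the shared file could be located like this:

```python
import tempfile
from pathlib import Path


def default_traffic_path() -> Path:
    """Build the shared traffic-log path under the OS temp directory.

    Hypothetical helper: it mirrors the documented locations and is not
    an API exported by cybersec_scanner.
    """
    return Path(tempfile.gettempdir()) / "cybersec_scanner" / "mitm_traffic.ndjson"


print(default_traffic_path())
```

On macOS the temp directory may be a per-user folder rather than /tmp, so printing the path is a quick way to confirm where to look.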
Python SDK Usage
from cybersec_scanner import scan_git, scan_web, scan_all
# Scan a Git repository
findings = scan_git("/path/to/repo", max_commits=100)
print(f"Found {len(findings)} secrets in Git history")
# Scan a web application
web_findings = scan_web("http://localhost:8000", max_pages=300)
# Full scan with custom config
config = {
    "git": {
        "enabled": True,
        "repositories": ["/path/to/repo"],
        "max_commits": 100
    },
    "web": {
        "enabled": True,
        "target": "http://localhost:8000",
        "max_pages": 300
    },
    "output": {
        "file": "security_report.json"
    }
}
results = scan_all(config)
CLI Command Reference
Available Commands
| Command | Description |
|---|---|
| scan | Run comprehensive scan with multiple scanners |
| scan-git | Scan Git repository for committed secrets |
| scan-web | Scan web application endpoints |
| scan-mitm | Parse MITM traffic logs |
| query | Query findings using RAG/LLM |
| build-graph | Build knowledge graph from audit report |
| init-config | Create default YAML configuration |
| version | Show version information |
| install-cert | Install mitmproxy CA certificate |
| start-proxy | Start MITM proxy daemon |
Scan Command Options
cybersec-scanner scan [OPTIONS]
Scanner Flags:
- --git: Enable Git history scanner
- --web: Enable web application scanner
- --mitm: Enable MITM traffic analysis
- --runtime: Enable browser runtime inspector (Playwright)
Scanner Configuration:
- --root PATH: Root directory for Git scan (default: .)
- --target URL: Target URL for web scan
- --max-commits N: Maximum Git commits to scan (default: 50)
- --mitm-traffic PATH: Path to MITM traffic NDJSON file
Output Options:
- --output PATH, -o PATH: Output audit report file (default: audit_report.json)
- --config PATH, -c PATH: Load settings from YAML config file
- --enable-rag: Build knowledge graph after scan for RAG queries
Example - Full Scan:
cybersec-scanner scan \
--git \
--web \
--mitm \
--root ~/myproject \
--target http://localhost:8000 \
--max-commits 100 \
--mitm-traffic mitm_traffic.ndjson \
--output security_audit.json \
--enable-rag
Query Command
cybersec-scanner query "your question" [OPTIONS]
Options:
- --audit PATH: Audit report to build graph from (if graph doesn't exist)
- --graph PATH: Existing knowledge graph file (default: rag/graph.gpickle)
- --model NAME: Ollama model to use (default: gemma3:1b)
- --top-k N: Number of findings to retrieve (default: 5)
- --output PATH, -o PATH: Save LLM response to file
Example:
# Query with existing graph
cybersec-scanner query "What AWS credentials were found?"
# Build graph and query
cybersec-scanner query "List all high severity findings" --audit audit_report.json
# Use different model
cybersec-scanner query "Explain the security risks" --model llama3:8b
# Save response to file
cybersec-scanner query "Summarize critical findings" --output security_summary.txt
Individual Scanner Commands
Git Scanner:
cybersec-scanner scan-git [REPO_PATH] [OPTIONS]
# Options:
# --max-commits N Max commits to scan (default: 50)
# --output PATH Output JSON file
# Example:
cybersec-scanner scan-git . --max-commits 100 --output git_findings.json
Web Scanner:
cybersec-scanner scan-web URL [OPTIONS]
# Options:
# --max-pages N Max pages to crawl (default: 50)
# --output PATH Output JSON file
# Example:
cybersec-scanner scan-web http://localhost:3000 --max-pages 200
MITM Scanner:
cybersec-scanner scan-mitm TRAFFIC_FILE [OPTIONS]
# Options:
# --output PATH Output JSON file
# Example:
cybersec-scanner scan-mitm mitm_traffic.ndjson --output mitm_findings.json
Configuration File
Create a YAML config to avoid long command lines:
cybersec-scanner init-config --output my-config.yaml
Example my-config.yaml:
scanner:
  git:
    enabled: true
    root: "."
    max_commits: 100
  web:
    enabled: true
    target: "http://localhost:8000"
    max_pages: 300
  mitm:
    enabled: true
    traffic_file: "mitm_traffic.ndjson"
  runtime:
    enabled: false
output:
  file: "audit_report.json"
rag:
  enabled: true
  model: "gemma3:1b"
Usage:
cybersec-scanner scan --config my-config.yaml
Utility Commands
Build Knowledge Graph:
cybersec-scanner build-graph AUDIT_FILE [OPTIONS]
# Options:
# --output PATH, -o Output graph file (default: rag/graph.gpickle)
# Example:
cybersec-scanner build-graph audit_report.json --output my_graph.gpickle
Initialize Config File:
cybersec-scanner init-config [OPTIONS]
# Options:
# --output PATH, -o Output config file path (default: cybersec-config.yaml)
# Example:
cybersec-scanner init-config --output my-config.yaml
Show Version:
cybersec-scanner version
Install MITM Certificate:
cybersec-scanner install-cert [OPTIONS]
# Options:
# --port PORT MITM proxy port (informational, default: 8082)
# --no-download Skip HTTP download, use local cert only
# Example:
cybersec-scanner install-cert --port 8082
Start MITM Proxy:
cybersec-scanner start-proxy [OPTIONS]
# Options:
# --port PORT Proxy listen port (default: 8082)
# --traffic-file PATH Traffic log file path (default: temp dir auto-shared)
# Example:
cybersec-scanner start-proxy --port 9000 --traffic-file ./my_traffic.ndjson
📋 Required Files
IMPORTANT: Before running scans, you need the following file in your working directory (the directory where you run the scanner):
1. patterns.env (REQUIRED)
This file contains regex patterns for detecting secrets. Copy it from the repository root:
# If you installed from source
cp patterns.env /your/project/directory/
# If you installed from PyPI, download from GitHub
curl -o patterns.env https://raw.githubusercontent.com/AnubhavChoudhery/cybersec-scanner/main/patterns.env
The file includes 58+ detection patterns for major providers:
AWS_ACCESS_KEY_ID=AKIA[0-9A-Z]{16}
OPENAI_API_KEY=sk-[a-zA-Z0-9]{20,}
STRIPE_SECRET_KEY=sk_live_[0-9a-zA-Z]{24,}
GITHUB_TOKEN=ghp_[0-9a-zA-Z]{36}
# ... and 54 more patterns
Security Note: This file is excluded from git by default to avoid triggering security scanners. Never commit actual secrets to this file!
Chrome_Ext/
├── local_check.py # Main orchestrator
├── config.py # Configuration and patterns
├── utils.py # Utility functions
├── patterns.env # Secret detection patterns (user-configured)
├── inject_mitm_proxy.py # MITM proxy injection module
├── install_mitm_cert.py # Certificate installation helper
├── scanners/
│ ├── git_scanner.py # Git history analysis
│ ├── web_crawler.py # HTTP endpoint scanning
│ ├── browser_scanner.py # Playwright runtime inspection
│ └── network_scanner.py # MITM proxy traffic analysis
└── audit_report.json # Output report (generated)
Installation
System Requirements
- Python 3.8 or higher
- Git (for git history scanning)
- mitmproxy 10.0+ (for HTTPS inspection)
- Modern web browser (for Playwright scanner)
Required Dependencies
pip install -r requirements.txt
If requirements.txt is not available, install manually:
pip install requests colorama
Optional Dependencies
For HTTPS Traffic Inspection
# Install mitmproxy
pip install mitmproxy
# Verify installation
mitmdump --version
For Browser Runtime Inspection
pip install playwright
python -m playwright install
For Network Packet Capture (Advanced)
pip install scapy
# Windows: Install Npcap from https://npcap.com/
# Linux/Mac: May require libpcap
Quick Start
Initial Setup
- Clone or download the repository
- Set up pattern file (REQUIRED before first run)
# Copy the patterns file template
cp patterns.env.example patterns.env
# The file includes 58+ detection patterns for major providers
# Edit patterns.env to customize or add patterns (optional)
- Verify setup
python -c "from config import KNOWN_PATTERNS; print(f'Loaded {len(KNOWN_PATTERNS)} patterns')"
Expected output: Loaded 58 patterns (or similar)
Basic Usage
# Scan with default settings
python local_check.py --target http://localhost:8000 --root /path/to/project
# Generate audit report
cat audit_report.json
MITM Proxy Setup
The MITM (Man-in-the-Middle) proxy feature allows inspection of HTTPS traffic in real-time, including request/response headers and bodies.
Prerequisites
- Install mitmproxy
pip install mitmproxy
# Verify installation
mitmdump --version
- Copy required files to your backend
# From the Chrome_Ext directory
cp inject_mitm_proxy.py /path/to/your/backend/app/
cp patterns.env /path/to/your/backend/app/
Backend Integration
CRITICAL: Add the import statement as the VERY FIRST LINE of your main application file. This is not optional - the import MUST come before any other imports (Flask, FastAPI, Django, etc.) for the MITM proxy to properly intercept HTTP libraries.
The inject_mitm_proxy module automatically:
- Starts a proxy server on port 8082 (configurable via MITM_PROXY_PORT)
- Patches HTTP libraries (requests, httpx, urllib, urllib3, aiohttp) to route through proxy
- Inspects all outbound HTTP/HTTPS traffic for security issues
- Logs traffic to mitm_traffic.ndjson in the same directory
- Bypasses specific domains (AWS, OAuth, AI providers) to prevent authentication issues
For FastAPI:
# backend/app/main.py
import inject_mitm_proxy # MUST BE FIRST IMPORT (before FastAPI, before everything!)
from fastapi import FastAPI # This comes AFTER inject_mitm_proxy
from fastapi.middleware.cors import CORSMiddleware
# ... rest of your imports
app = FastAPI()
# ... rest of your code
For Flask:
# backend/app.py
import inject_mitm_proxy # MUST BE FIRST IMPORT (before Flask, before everything!)
from flask import Flask # This comes AFTER inject_mitm_proxy
from flask_cors import CORS
# ... rest of your imports
app = Flask(__name__)
# ... rest of your code
For Django:
# backend/manage.py or wsgi.py
import inject_mitm_proxy # MUST BE FIRST IMPORT (before Django, before everything!)
import os # This comes AFTER inject_mitm_proxy
from django.core.wsgi import get_wsgi_application
# ... rest of Django setup
Why FIRST import matters: The module patches HTTP libraries at import time. If Flask/FastAPI/Django import first, their HTTP clients won't be patched, and traffic won't be intercepted.
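The ordering rule follows from how Python binds names at import time. The toy below is only a sketch of that general mechanism, not the module's actual patching code: fake_http, get, and install_proxy_patch are made-up stand-ins for a real HTTP library and for the patch step.

```python
import types

# Stand-in for an HTTP library; "fake_http" and "get" are made-up names.
fake_http = types.ModuleType("fake_http")
fake_http.get = lambda url: f"direct:{url}"

# A module that ran "from fake_http import get" BEFORE the patch keeps
# a reference to the original function.
early_get = fake_http.get


def install_proxy_patch(module):
    """Replace module.get with a wrapper that marks calls as proxied."""
    original = module.get

    def proxied_get(url):
        return "proxied:" + original(url)

    module.get = proxied_get


install_proxy_patch(fake_http)

# Code that looks the attribute up AFTER the patch sees the wrapper.
late_get = fake_http.get

print(early_get("http://example.test"))  # direct:http://example.test
print(late_get("http://example.test"))   # proxied:direct:http://example.test
```

The early reference never goes through the wrapper, which is the situation a framework ends up in if it imports its HTTP client before the patch runs.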
Running with MITM Proxy
- Start your backend application
# No environment variables needed - proxy is always enabled
# Just start your backend normally
uvicorn app.main:app --reload # FastAPI example
You should see:
[MITM] Proxy active on http://127.0.0.1:8082
[MITM] Bypass mode: AWS, OAuth, AI providers, payments, CDNs
[MITM] Patched libraries: requests, httpx, urllib, urllib3, aiohttp
- Run the security scanner
# In a new terminal, run the scanner with MITM enabled
python local_check.py \
--target http://localhost:8000 \
--enable-mitm \
--mitm-port 8082
- Interact with your application (make HTTP requests, use API endpoints, etc.)
- Stop the scanner (Ctrl+C) to generate the audit report
- Review results
# View audit report
cat audit_report.json
# View traffic log (raw NDJSON)
cat mitm_traffic.ndjson
MITM Proxy Detection Capabilities
The MITM proxy inspects both requests and responses for security issues:
Request-Side Detection:
- Credentials embedded in URLs (user:pass@domain)
- API keys in query parameters (?api_key=xxx)
- Basic Authentication headers (base64 credentials)
- API keys in Authorization headers (with context awareness)
- Plaintext passwords in request bodies (excludes bcrypt/argon2 hashes)
- Secrets matching any of the 58+ patterns
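The first two request-side checks can be illustrated with the standard library alone. This is a rough sketch under stated assumptions: the parameter names in SENSITIVE_PARAMS and the finding labels are hypothetical, and the real proxy may use different rules.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical parameter names treated as sensitive; the real list is
# not documented here.
SENSITIVE_PARAMS = {"api_key", "apikey", "token", "access_token"}


def inspect_url(url: str) -> list[str]:
    """Return labels for URL-level issues found in a single request URL."""
    findings = []
    parts = urlsplit(url)

    # user:pass@domain style credentials end up in .username / .password
    if parts.username or parts.password:
        findings.append("credentials_in_url")

    # ?api_key=xxx style secrets show up as query parameter names
    for name in parse_qs(parts.query, keep_blank_values=True):
        if name.lower() in SENSITIVE_PARAMS:
            findings.append(f"api_key_in_query:{name}")

    return findings


print(inspect_url("https://user:pass@example.com/"))
print(inspect_url("https://example.com/v1?api_key=xxx&page=2"))
```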
Response-Side Detection:
- Secrets leaked in response headers
- API keys in response bodies (JSON, HTML, JavaScript)
- Credentials in error messages
- Database connection strings in stack traces
- Debug information containing sensitive data
Severity Levels:
- CRITICAL: API keys in URLs, credentials over HTTP, plaintext passwords
- HIGH: API keys in headers over HTTPS (with expected auth disclaimer)
- INFO: Normal traffic logging (not a security issue)
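One way to read the severity levels is as a small decision function over the finding type and the URL scheme. The sketch below assumes that reading; apart from api_key_in_header, which appears in the sample report, the kind names are hypothetical, and treating a header key over plain HTTP as CRITICAL is an interpretation of "credentials over HTTP".

```python
def assign_severity(kind: str, scheme: str) -> str:
    """Map a finding kind plus URL scheme to a severity label.

    Illustrative only: the kind names and the exact rules are assumptions
    based on the severity list, not the scanner's implementation.
    """
    if kind in {"api_key_in_url", "plaintext_password"}:
        return "CRITICAL"
    if kind == "api_key_in_header":
        # Assumed: a key sent over plain HTTP counts as exposed credentials.
        return "CRITICAL" if scheme == "http" else "HIGH"
    return "INFO"


print(assign_severity("api_key_in_header", "https"))  # HIGH
print(assign_severity("api_key_in_url", "https"))     # CRITICAL
```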
MITM Proxy Configuration
The inject_mitm_proxy.py module works automatically when imported. The only optional configuration is:
# Set custom MITM proxy port (default: 8082)
export MITM_PROXY_PORT=9000
No other environment variables needed - the proxy runs in full mode by default with intelligent domain bypass.
Domain Bypass Configuration
By default, the following domains bypass the MITM proxy to prevent authentication and SSL issues:
OAuth Providers:
- accounts.google.com, oauth2.googleapis.com, login.microsoftonline.com
AI Providers:
- api.openai.com, openai.com
- api.anthropic.com, anthropic.com
- api.groq.com, groq.com
- api.mistral.ai, mistral.ai
- api-inference.huggingface.co, huggingface.co
- api.cohere.ai, replicate.com, together.xyz, anyscale.com, perplexity.ai
AWS Services:
- All *.amazonaws.com domains
- API Gateway, Lambda, S3, CloudFront
Payment Providers:
- stripe.com, paypal.com
CDNs:
- cloudflare.com, cloudfront.net
Localhost:
- 127.0.0.1, localhost
To modify bypass rules, edit the BYPASS_DOMAINS and AWS_SUFFIXES sets in inject_mitm_proxy.py.
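Before editing those sets, it can help to picture how a host might be tested against them. The sketch below is an assumption about the matching logic, not a copy of inject_mitm_proxy.py: the set contents are a small excerpt of the documented defaults, and both the subdomain matching and the leading-dot suffix format are guesses.

```python
# Small excerpts of the documented defaults; the real sets live in
# inject_mitm_proxy.py and may be shaped differently.
BYPASS_DOMAINS = {"accounts.google.com", "api.openai.com", "stripe.com",
                  "localhost", "127.0.0.1"}
AWS_SUFFIXES = {".amazonaws.com"}


def should_bypass(host: str) -> bool:
    """Return True if the host should skip the proxy (assumed rules)."""
    host = host.lower().rstrip(".")
    if host in BYPASS_DOMAINS:
        return True
    # Assumed: subdomains of a listed domain are bypassed as well.
    if any(host.endswith("." + domain) for domain in BYPASS_DOMAINS):
        return True
    return any(host.endswith(suffix) for suffix in AWS_SUFFIXES)


print(should_bypass("s3.us-east-1.amazonaws.com"))  # True
print(should_bypass("api.example.com"))             # False
```

Matching on "." + domain rather than a bare suffix avoids bypassing look-alike hosts such as notstripe.com.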
Uninstalling MITM Proxy
To remove MITM proxy from your backend:
- Remove or comment out the import:
# import inject_mitm_proxy # Disabled
- Restart your backend application
The proxy is only active when the module is imported.
Configuration
Pattern File (patterns.env)
The patterns.env file contains regular expressions for detecting secrets. This file is excluded from version control to prevent triggering GitHub security alerts.
Format:
PATTERN_NAME=regex_pattern
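A file in this format can be read with a few lines of Python. The following is a minimal sketch, not the loader in config.py: parse_patterns is a hypothetical name, and skipping blank lines and lines starting with # is an assumption.

```python
import re


def parse_patterns(text: str) -> dict[str, re.Pattern]:
    """Parse NAME=regex lines into compiled patterns (hypothetical helper)."""
    patterns = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumed: blank lines and comments are ignored
        # Split on the first "=" only, since a regex may contain "=" itself.
        name, sep, regex = line.partition("=")
        if sep:
            patterns[name.strip()] = re.compile(regex.strip())
    return patterns


sample = """
# two entries in the documented format
AWS_ACCESS_KEY_ID=AKIA[0-9A-Z]{16}
MY_CUSTOM_KEY=mykey_[0-9a-f]{32}
"""
patterns = parse_patterns(sample)
print(sorted(patterns))
```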
Adding custom patterns:
# Edit patterns.env
nano patterns.env
# Add your pattern
MY_CUSTOM_KEY=mykey_[0-9a-f]{32}
# Reload the scanner
python local_check.py --target http://localhost:8000
Configuration File (config.py)
Entropy Threshold:
ENTROPY_THRESHOLD = 3.5 # Shannon entropy for randomness detection
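To make the threshold concrete, here is a minimal sketch of Shannon entropy measured in bits per character. The helper names are illustrative, and the scanner's own calculation (for example, which substrings it tests) may differ.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 3.5  # same value as the config.py setting above


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return abs(-sum((n / total) * math.log2(n / total) for n in counts.values()))


def looks_random(candidate: str, threshold: float = ENTROPY_THRESHOLD) -> bool:
    """Flag strings whose per-character entropy reaches the threshold."""
    return shannon_entropy(candidate) >= threshold


print(shannon_entropy("abcd"))           # 2.0
print(looks_random("0123456789abcdef"))  # True (entropy is 4.0)
print(looks_random("aaaaaaaaaaaaaaaa"))  # False (entropy is 0.0)
```

A string of 16 distinct characters scores exactly 4.0 bits, which shows why raising the threshold to 4.0 filters out many shorter or more repetitive strings.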
File Exclusions:
EXCLUDE_SUFFIXES = {
'.png', '.jpg', '.jpeg', '.gif', '.bmp', '.ico',
'.zip', '.tar', '.gz', '.pdf', '.exe', '.dll'
}
Probe Paths (for web crawler):
PROBE_PATHS = [
'/.env', '/.env.local', '/.env.production',
'/.git/config', '/.git/HEAD',
'/config.php.bak', '/backup.sql'
]
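Each probe path is presumably combined with the target base URL before it is requested. A small sketch of that step, using a subset of the list above and a hypothetical helper name:

```python
from urllib.parse import urljoin

# Subset of the PROBE_PATHS list shown above.
PROBE_PATHS = ["/.env", "/.git/config", "/backup.sql"]


def build_probe_urls(base: str, paths=PROBE_PATHS) -> list[str]:
    """Resolve each probe path against the base URL (hypothetical helper)."""
    return [urljoin(base, path) for path in paths]


for url in build_probe_urls("http://localhost:8000"):
    print(url)
```

Because the paths start with a slash, urljoin resolves them against the host root even when the base URL contains a sub-path such as http://localhost:8000/app/.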
Usage
Command-Line Options
python local_check.py [OPTIONS]
Core Options:
| Option | Type | Default | Description |
|---|---|---|---|
| --target, -t | URL | http://localhost:8000 | Target application URL |
| --root, -r | Path | . | Repository root for static analysis |
| --out, -o | Path | audit_report.json | Output report filename |
Scanner Options:
| Option | Type | Default | Description |
|---|---|---|---|
| --depth | Integer | 300 | Maximum pages to crawl |
| --enable-playwright | Flag | False | Enable browser runtime inspection |
| --enable-pcap | Flag | False | Enable packet capture (requires root) |
| --pcap-timeout | Integer | 12 | Packet capture duration (seconds) |
MITM Proxy Options:
| Option | Type | Default | Description |
|---|---|---|---|
| --enable-mitm | Flag | False | Enable MITM proxy for HTTPS inspection |
| --mitm-port | Integer | 8082 | MITM proxy port |
| --mitm-duration | Integer | 0 | Auto-stop after N seconds (0 = manual) |
| --mitm-traffic | Path | Auto-detect | Custom path to traffic NDJSON file |
Usage Examples
Basic scan:
python local_check.py --target http://localhost:8000 --root /path/to/project
Full scan with all features:
python local_check.py \
--target http://localhost:3000 \
--root ~/myapp \
--enable-playwright \
--enable-mitm \
--depth 500 \
--out security_report.json
MITM-only scan (skip static/git):
python local_check.py \
--target http://localhost:8000 \
--enable-mitm \
--mitm-duration 30
Custom traffic log location:
python local_check.py \
--target http://localhost:8000 \
--enable-mitm \
--mitm-traffic /custom/path/to/traffic.ndjson
Scanner Modules
1. Git Scanner (scanners/git_scanner.py)
Analyzes git commit history for leaked secrets using efficient pickaxe search.
Features:
- Searches git history for known secret patterns
- Uses git log -S<term> for 100x faster scanning than naive approaches
- Examines up to 100 commits by default (configurable)
- Scans added lines in diffs for pattern matches
Configuration:
scan_git_history(root, max_commits=100)
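For readers unfamiliar with pickaxe search, the sketch below shows how git log -S could be invoked from Python. It is not the code in git_scanner.py: both function names are hypothetical, and the choice of --max-count and --format=%H is an assumption about how results might be limited and parsed.

```python
import subprocess


def build_pickaxe_command(term: str, max_commits: int = 100) -> list[str]:
    """Build a git log pickaxe command that prints matching commit hashes."""
    return ["git", "log", f"-S{term}", f"--max-count={max_commits}", "--format=%H"]


def commits_touching(term: str, repo: str = ".", max_commits: int = 100) -> list[str]:
    """Run the pickaxe search in repo and return the matching commit hashes."""
    result = subprocess.run(
        build_pickaxe_command(term, max_commits),
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. when repo is not a repository
    )
    return result.stdout.split()


print(build_pickaxe_command("AKIA", 5))
```

Note that -S matches a literal string whose occurrence count changes in a commit; regex terms need git's --pickaxe-regex option, and --max-count caps the number of commits git outputs.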
2. Web Crawler (scanners/web_crawler.py)
Crawls web application endpoints to discover exposed sensitive paths and analyze client-side code.
Features:
- Discovers exposed .env, .git/config, backup files
- Analyzes JavaScript files for hardcoded secrets
- Extracts and scans source maps
- Checks HTTP headers and cookies for leaked secrets
- Detects catch-all responses (false positives)
- Multi-threaded crawling with process pool for regex scanning
Configuration:
crawler = LocalCrawler(
base="http://localhost:8000",
timeout=6,
max_pages=300,
workers=8,
max_js_size=500_000 # Skip large JS bundles
)
3. Browser Scanner (scanners/browser_scanner.py)
Uses Playwright to inspect browser runtime state and client-side storage.
Features:
- Extracts localStorage contents
- Extracts sessionStorage contents
- Retrieves all cookies
- Checks global variables (window.__ENV, window.config, window.API_KEY)
Requirements:
pip install playwright
python -m playwright install
Usage:
playwright_inspect("http://localhost:8000")
4. Network Scanner (scanners/network_scanner.py)
Runs mitmproxy addon for deep packet inspection (Layer 2).
Features:
- Intercepts HTTP/HTTPS traffic at the proxy level
- Pattern matching on request/response bodies
- Security header validation
- Works alongside inject_mitm_proxy.py (Layer 1)
Note: Most users will use inject_mitm_proxy.py for MITM inspection. This module provides additional addon-based analysis.
Output Format
Audit Report (audit_report.json)
{
"timestamp": "2025-11-18T13:34:34.106644",
"target": "http://localhost:8000",
"stats": {
"git_secrets": 0,
"crawler_issues": 2,
"browser_issues": 0,
"mitm_proxied": 15,
"mitm_bypassed": 3,
"mitm_security_findings": 1
},
"severities": {
"CRITICAL": 0,
"HIGH": 1,
"MEDIUM": 0,
"LOW": 0,
"INFO": 15
},
"findings": [
{
"type": "api_key_in_header",
"severity": "HIGH",
"timestamp": 1763494461,
"timestamp_human": "2025-11-18 13:34:21",
"description": "GROQ_API_KEY in Authorization header over HTTPS (expected for server-side API calls, review if unexpected)",
"url": "https://api.groq.com/openai/v1/chat/completions",
"client": "requests",
"method": "post",
"pattern": "GROQ_API_KEY",
"header": "Authorization"
}
]
}
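A report in this shape is easy to post-process. As a rough sketch (the helper name and the severity ordering are assumptions drawn from the keys shown above), findings at or above a chosen level could be selected like this:

```python
import json

# Assumed ordering, lowest to highest, taken from the "severities" keys.
SEVERITY_ORDER = ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"]
RANK = {name: index for index, name in enumerate(SEVERITY_ORDER)}


def findings_at_or_above(report: dict, minimum: str) -> list[dict]:
    """Return findings whose severity is at least `minimum`."""
    floor = RANK[minimum]
    return [
        finding
        for finding in report.get("findings", [])
        if RANK.get(finding.get("severity"), 0) >= floor
    ]


# Trimmed-down report with the same field names as the example above.
report = json.loads("""
{
  "findings": [
    {"type": "api_key_in_header", "severity": "HIGH", "pattern": "GROQ_API_KEY"},
    {"type": "traffic_log", "severity": "INFO"}
  ]
}
""")

print(len(findings_at_or_above(report, "HIGH")))  # 1
```

For a real run, replace the inline JSON with json.load() on the audit_report.json file.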
Traffic Log (mitm_traffic.ndjson)
NDJSON (newline-delimited JSON) format for append-only logging:
{"ts": 1763494398, "timestamp": "2025-11-18 13:33:18", "stage": "mitm_outbound", "client": "requests", "method": "post", "url": "https://api.example.com/endpoint"}
{"ts": 1763494461, "timestamp": "2025-11-18 13:34:21", "stage": "security_finding", "severity": "HIGH", "type": "api_key_in_header", "pattern": "GROQ_API_KEY", "description": "...", "url": "...", "client": "requests", "method": "post", "header": "Authorization"}
Stages:
- mitm_outbound: Request sent through proxy
- mitm_bypass: Request bypassed proxy (OAuth, AWS, etc.)
- security_finding: Security issue detected
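Because every line is an independent JSON object, the log can be consumed line by line. A minimal sketch, using two shortened records shaped like the examples above (read_ndjson is a hypothetical helper, not part of the package):

```python
import json
from collections import Counter


def read_ndjson(lines):
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Two records shaped like the examples above (values shortened).
sample = [
    '{"ts": 1763494398, "stage": "mitm_outbound", "client": "requests", "method": "post", "url": "https://api.example.com/endpoint"}',
    '{"ts": 1763494461, "stage": "security_finding", "severity": "HIGH", "type": "api_key_in_header", "pattern": "GROQ_API_KEY"}',
]

events = list(read_ndjson(sample))
stage_counts = Counter(event["stage"] for event in events)
security_findings = [e for e in events if e["stage"] == "security_finding"]

print(dict(stage_counts))
```

To read a real log, pass an open file object instead of the sample list, for example open("mitm_traffic.ndjson", encoding="utf-8").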
Advanced Usage
Custom Pattern Detection
Create a custom pattern file:
# Create custom-patterns.env
cat > custom-patterns.env << EOF
CUSTOM_API_KEY=custom_[0-9a-f]{32}
INTERNAL_TOKEN=int_tok_[A-Za-z0-9]{24}
EOF
# Edit config.py to load from custom file
# (Modify PATTERNS_FILE path in config.py)
Integrating with CI/CD
# .github/workflows/security-scan.yml
name: Security Audit
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - run: pip install -r requirements.txt
      - run: cp patterns.env.example patterns.env
      - run: python local_check.py --target http://localhost:8000 --root .
      - run: |
          if jq -e '.severities.CRITICAL > 0' audit_report.json; then
            echo "CRITICAL issues found!"
            exit 1
          fi
Programmatic Usage
from scanners import scan_git_history, LocalCrawler, playwright_inspect
# Git scanning
git_findings = scan_git_history("/path/to/repo", max_commits=100)
# Web crawling
crawler = LocalCrawler("http://localhost:8000", max_pages=200)
crawler.probe_common_paths()
crawler.crawl()
web_findings = crawler.findings
# Browser inspection
browser_data = playwright_inspect("http://localhost:8000")
# Combine results
all_findings = git_findings + web_findings
Troubleshooting
"No module named 'requests'"
pip install requests
"patterns.env not found"
cp patterns.env.example patterns.env
"playwright-not-installed"
pip install playwright
python -m playwright install
"MITM proxy not loading patterns"
Issue: Backend shows WARNING: patterns.env not found
Solution:
# Verify patterns.env is in the same directory as inject_mitm_proxy.py
ls -la /path/to/backend/app/patterns.env
# If missing, copy it
cp patterns.env /path/to/backend/app/
"MITM proxy not intercepting traffic"
Issue: No traffic logged in mitm_traffic.ndjson
Solutions:
- Verify import is present and FIRST:
import inject_mitm_proxy # MUST BE FIRST
# ... other imports
python app.py
# Should see: "[MITM] Proxy active on http://127.0.0.1:8082"
- Check proxy port matches:
# Scanner
python local_check.py --enable-mitm --mitm-port 8082
# Backend
export MITM_PROXY_PORT=8082
"Permission denied during packet capture"
# Linux/Mac
sudo python local_check.py --enable-pcap
# Windows
# Run terminal as Administrator
"Git scan is very slow"
This is normal for large repositories (100k+ commits). The tool limits to 100 commits by default. To adjust:
# Modify scanners/git_scanner.py
scan_git_history(root, max_commits=50) # Reduce commit limit
"Too many false positives"
- Adjust entropy threshold in config.py:
ENTROPY_THRESHOLD = 4.0 # Higher = fewer false positives
- Add exclusions for known patterns:
# In config.py
EXCLUDE_PATTERNS = [
r'test_api_key_123', # Test keys
r'example\.com', # Example domains
]
- Filter by severity in audit report:
# Only show CRITICAL issues
jq '.findings[] | select(.severity == "CRITICAL")' audit_report.json
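The EXCLUDE_PATTERNS tip above can also be applied to an existing report after the fact. The sketch below assumes that approach: the helper names are hypothetical, and checking the url and description fields is a guess at where a known-safe value would appear.

```python
import re

# Same example entries as the EXCLUDE_PATTERNS list above.
EXCLUDE_PATTERNS = [
    r"test_api_key_123",  # Test keys
    r"example\.com",      # Example domains
]
_COMPILED = [re.compile(pattern) for pattern in EXCLUDE_PATTERNS]


def is_excluded(text: str) -> bool:
    """Return True if any exclusion regex matches the text."""
    return any(regex.search(text) for regex in _COMPILED)


def drop_excluded(findings, fields=("url", "description")):
    """Keep findings whose checked fields match no exclusion regex."""
    return [
        finding
        for finding in findings
        if not any(is_excluded(str(finding.get(name, ""))) for name in fields)
    ]


findings = [
    {"type": "api_key_in_header", "url": "https://api.example.com/endpoint"},
    {"type": "api_key_in_header", "url": "https://api.groq.com/openai/v1/chat/completions"},
]
print(len(drop_excluded(findings)))  # 1
```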
Security Considerations
Testing Your Own Applications Only
This tool is designed for security testing of applications you own or have explicit permission to test. Unauthorized scanning may violate laws and terms of service.
MITM Proxy Security
The MITM proxy disables SSL verification for testing purposes. This should only be used in development/testing environments, never in production.
Do NOT:
- Use MITM proxy in production environments
- Commit inject_mitm_proxy.py import to production code
- Share MITM proxy logs (may contain sensitive data)
Best Practices:
- Use environment variables to control MITM activation
- Keep mitm_traffic.ndjson and audit_report.json out of version control (add to .gitignore)
- Review and sanitize audit reports before sharing
Pattern File Security
The patterns.env file is excluded from version control by default (.gitignore) to avoid triggering GitHub security alerts on pattern signatures.
Do NOT:
- Commit patterns.env to public repositories
- Include actual secret values in pattern files
- Share pattern files with untrusted parties
Version History
| Version | Changes |
|---|---|
| 1.0.5 | Enhanced scan output with phase headers, fixed query output to be human-readable (extracts text from response) |
| 1.0.4 | Colored CLI output with colorama, --output flag for query command, simplified MITM traffic auto-sharing |
| 1.0.3 | Unified MITM traffic file location via temp directory, added colorama dependency |
| 1.0.2 | Fixed MITM traffic file path resolution bug |
| 1.0.1 | Initial PyPI release with full scanner suite |
License
MIT License - See LICENSE file for details.
Contributing
Contributions are welcome! Please follow these guidelines:
- Test your changes with multiple target applications
- Update documentation for new features
- Follow existing code style and structure
- Add tests for new scanner modules
- Ensure no secrets are committed in test files
Disclaimer
This tool is provided for lawful security testing only. Users are responsible for ensuring they have proper authorization before scanning any application. The authors assume no liability for misuse or unauthorized access.
Testing
Quick Test Commands
# Run all tests (auto-detects Ollama)
python run_tests.py
# Run all tests including LLM (requires Ollama)
python run_tests.py --all
# Fast tests only (no LLM)
python run_tests.py --fast
# With coverage report
python run_tests.py --coverage
# Specific test file
python run_tests.py --file retriever
Test Prerequisites
Core tests (no additional setup):
pip install pytest pytest-cov
pytest tests/ -v -k "not llm_client"
LLM tests (requires Ollama):
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh # Linux/Mac
# Or download from https://ollama.com for Windows
# Pull model
ollama pull gemma3:1b
# Run all tests
pytest tests/ -v
Test Coverage
| Component | Tests | Coverage |
|---|---|---|
| Knowledge Graph | PASS 1 test | 100% |
| CWE Enrichment | PASS 1 test | 100% |
| Database Normalizer | PASS 5 tests | 95% |
| Graph Retriever | PASS 8 tests | 100% |
| LLM Client | PASS 8 tests | 85% |
| End-to-End Pipeline | PASS 2 tests | Full flow |
| Total | 24 tests | ~90% |
See tests/README.md for detailed testing documentation.
Support
For issues, questions, or contributions:
- Open an issue on GitHub
- Review existing issues before creating new ones
- Provide detailed information (OS, Python version, error messages, steps to reproduce)
Made by the JBAC EdtEch Team (Jai Ansh Bindra and Anubhav Choudhery)