A comprehensive security scanner and RAG-based vulnerability analyzer

CyberSec Scanner

A comprehensive, modular security scanning toolkit for detecting secrets, vulnerabilities, and misconfigurations in Git repositories, web applications, and browser extensions. It features a multi-scanner architecture, RAG-powered analysis, and both SDK and CLI interfaces.

Use Responsibly: This tool is for authorized security testing only. Always obtain proper permission before scanning applications you don't own.

Features
Architecture
Installation
Quick Start
MITM Proxy Setup
Configuration
Usage
Scanner Modules
Output Format
Advanced Usage
Troubleshooting

Features

Multi-Scanner Architecture

Git Scanner: Detect secrets in commit history using efficient pickaxe search
Web Crawler: Discover exposed endpoints, analyze JavaScript files and source maps
Browser Scanner: Inspect localStorage, sessionStorage, cookies via Playwright
Network Scanner: Real-time HTTPS traffic inspection with MITM proxy

RAG-Powered Analysis

Knowledge Graph: NetworkX-based relationship mapping between findings, files, and vulnerabilities
Semantic Search: Vector-based retrieval for similar security patterns
LLM Integration: Natural language queries powered by Ollama (Gemma, Llama, etc.)
CWE Enrichment: Automatic mapping to Common Weakness Enumeration

Detection Coverage

58+ Built-in Patterns: AWS, OpenAI, Stripe, GitHub, Azure, Google Cloud, databases, and more
Entropy Analysis: High-entropy string detection for unknown secrets
Custom Patterns: Extensible regex-based pattern system via patterns.env
Contextual Severity: Smart severity assignment based on exposure context

Flexible Usage

CLI Application: Full-featured command-line interface with 10 commands (listed under CLI Command Reference)
Python SDK: Use scanners independently or together in your own code
YAML Configuration: Simple config files replace long CLI arguments
Modular Design: Import only what you need, lazy loading for optional dependencies

Installation

From PyPI (Recommended)

pip install cybersec-scanner

From Source

git clone https://github.com/AnubhavChoudhery/cybersec-scanner.git
cd cybersec-scanner
pip install -e .

Optional Dependencies

# Quote the extras so shells such as zsh do not treat the brackets as a glob

# For MITM proxy (HTTPS traffic inspection)
pip install "cybersec-scanner[mitm]"

# For browser runtime inspection (Playwright)
pip install "cybersec-scanner[browser]"

# For vector search (RAG features)
pip install "cybersec-scanner[vector]"

# Install everything
pip install "cybersec-scanner[all]"

# For development
pip install "cybersec-scanner[dev]"

Note: The base installation includes Git scanning, web crawling, and RAG analysis. MITM and browser features require additional dependencies.

System Requirements

Python 3.11 or higher
Git (for git history scanning)
mitmproxy 10.0+ (for HTTPS inspection - optional)
Playwright (for browser inspection - optional)

Quick Start

Prerequisites for RAG Queries

To use the query command with LLM-powered analysis, you need Ollama:

# Install Ollama (https://ollama.com)
# Linux/Mac:
curl -fsSL https://ollama.com/install.sh | sh
# Windows: Download from https://ollama.com

# Pull the default model
ollama pull gemma3:1b

# Start Ollama (keep running in background)
ollama serve

Complete Scan-to-Query Workflow

# 1. Install the scanner
pip install cybersec-scanner

# 2. Download patterns file (required for detection)
curl -o patterns.env https://raw.githubusercontent.com/AnubhavChoudhery/cybersec-scanner/main/patterns.env

# 3. Scan your project (Git history)
cybersec-scanner scan --git --root . --output audit_report.json --enable-rag

# 4. Query the findings
cybersec-scanner query "What secrets were found?" --audit audit_report.json

# 5. Save response to file
cybersec-scanner query "Summarize critical findings" --output summary.txt

CLI Usage

# Show all available commands
cybersec-scanner --help

# Initialize configuration file
cybersec-scanner init-config

# Scan a Git repository
cybersec-scanner scan-git /path/to/repo --max-commits 50

# Scan a web application
cybersec-scanner scan-web http://localhost:8000 --max-pages 100

# Scan MITM traffic logs
cybersec-scanner scan-mitm mitm_traffic.ndjson

# Full multi-scanner workflow
cybersec-scanner scan \
  --git \
  --web \
  --mitm \
  --runtime \
  --root . \
  --target http://localhost:8000 \
  --max-commits 50 \
  --mitm-traffic mitm_traffic.ndjson \
  --output audit_report.json \
  --enable-rag

# Query findings with RAG
cybersec-scanner query "What API keys were found?" --audit audit_report.json

# Build knowledge graph from existing report
cybersec-scanner build-graph audit_report.json

# Check version
cybersec-scanner version

MITM Proxy Workflow

The scanner provides interactive MITM inspection that captures real HTTP/HTTPS traffic. Traffic files are automatically shared between scanner and backend via a temp directory.

1. Add MITM injection to your backend (one-time setup):

# backend/app/main.py - MUST BE FIRST IMPORT
from cybersec_scanner.scanners.inject_mitm_proxy import inject_mitm_proxy_advanced

# No path needed - uses shared temp location automatically
inject_mitm_proxy_advanced()

# Now import your framework
from fastapi import FastAPI
# ... rest of your code

2. Run the scanner with MITM enabled:

# No --mitm-traffic flag needed - auto-discovers shared file
cybersec-scanner scan --mitm --output audit_report.json

3. Start your backend when prompted:

The scanner will start the MITM proxy and wait for you to start your backend and exercise the app.

# In another terminal
uvicorn backend.app.main:app --reload

4. Test your application - make requests, test endpoints

5. Press Ctrl+C in the scanner terminal when done

The scanner will parse all captured traffic and generate the audit report.

Traffic File Location:

Windows: C:\Users\<user>\AppData\Local\Temp\cybersec_scanner\mitm_traffic.ndjson
Linux/Mac: /tmp/cybersec_scanner/mitm_traffic.ndjson

Advanced: You can override with --mitm-traffic /custom/path.ndjson if needed.
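
The locations above match what Python's tempfile.gettempdir() typically returns on Windows and Linux, so a helper script can guess the shared file path as sketched below. This assumes the scanner derives the path in a similar way, which the README does not confirm. The function name default_traffic_file is hypothetical, and on macOS gettempdir() often returns a per-user folder rather than /tmp.

```python
import tempfile
from pathlib import Path


def default_traffic_file() -> Path:
    """Hypothetical helper: guess the shared MITM traffic log location."""
    # Directory and file names are taken from the paths listed above.
    return Path(tempfile.gettempdir()) / "cybersec_scanner" / "mitm_traffic.ndjson"


path = default_traffic_file()
print(path, "(exists)" if path.exists() else "(not created yet)")
```

If the guessed path does not exist on your system, passing --mitm-traffic explicitly avoids relying on the guess.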

Python SDK Usage

from cybersec_scanner import scan_git, scan_web, scan_all

# Scan a Git repository
findings = scan_git("/path/to/repo", max_commits=100)
print(f"Found {len(findings)} secrets in Git history")

# Scan a web application
web_findings = scan_web("http://localhost:8000", max_pages=300)

# Full scan with custom config
config = {
    "git": {
        "enabled": True,
        "repositories": ["/path/to/repo"],
        "max_commits": 100
    },
    "web": {
        "enabled": True,
        "target": "http://localhost:8000",
        "max_pages": 300
    },
    "output": {
        "file": "security_report.json"
    }
}

results = scan_all(config)

CLI Command Reference

Available Commands

Command	Description
`scan`	Run comprehensive scan with multiple scanners
`scan-git`	Scan Git repository for committed secrets
`scan-web`	Scan web application endpoints
`scan-mitm`	Parse MITM traffic logs
`query`	Query findings using RAG/LLM
`build-graph`	Build knowledge graph from audit report
`init-config`	Create default YAML configuration
`version`	Show version information
`install-cert`	Install mitmproxy CA certificate
`start-proxy`	Start MITM proxy daemon

Scan Command Options

cybersec-scanner scan [OPTIONS]

Scanner Flags:

--git - Enable Git history scanner
--web - Enable web application scanner
--mitm - Enable MITM traffic analysis
--runtime - Enable browser runtime inspector (Playwright)

Scanner Configuration:

--root PATH - Root directory for Git scan (default: .)
--target URL - Target URL for web scan
--max-commits N - Maximum Git commits to scan (default: 50)
--mitm-traffic PATH - Path to MITM traffic NDJSON file

Output Options:

--output PATH, -o PATH - Output audit report file (default: audit_report.json)
--config PATH, -c PATH - Load settings from YAML config file
--enable-rag - Build knowledge graph after scan for RAG queries

Example - Full Scan:

cybersec-scanner scan \
  --git \
  --web \
  --mitm \
  --root ~/myproject \
  --target http://localhost:8000 \
  --max-commits 100 \
  --mitm-traffic mitm_traffic.ndjson \
  --output security_audit.json \
  --enable-rag

Query Command

cybersec-scanner query "your question" [OPTIONS]

Options:

--audit PATH - Audit report to build graph from (if graph doesn't exist)
--graph PATH - Existing knowledge graph file (default: rag/graph.gpickle)
--model NAME - Ollama model to use (default: gemma3:1b)
--top-k N - Number of findings to retrieve (default: 5)
--output PATH, -o PATH - Save LLM response to file

Example:

# Query with existing graph
cybersec-scanner query "What AWS credentials were found?"

# Build graph and query
cybersec-scanner query "List all high severity findings" --audit audit_report.json

# Use different model
cybersec-scanner query "Explain the security risks" --model llama3:8b

# Save response to file
cybersec-scanner query "Summarize critical findings" --output security_summary.txt

Individual Scanner Commands

Git Scanner:

cybersec-scanner scan-git [REPO_PATH] [OPTIONS]

# Options:
#   --max-commits N     Max commits to scan (default: 50)
#   --output PATH       Output JSON file

# Example:
cybersec-scanner scan-git . --max-commits 100 --output git_findings.json

Web Scanner:

cybersec-scanner scan-web URL [OPTIONS]

# Options:
#   --max-pages N       Max pages to crawl (default: 50)
#   --output PATH       Output JSON file

# Example:
cybersec-scanner scan-web http://localhost:3000 --max-pages 200

MITM Scanner:

cybersec-scanner scan-mitm TRAFFIC_FILE [OPTIONS]

# Options:
#   --output PATH       Output JSON file

# Example:
cybersec-scanner scan-mitm mitm_traffic.ndjson --output mitm_findings.json

Configuration File

Create a YAML config to avoid long command lines:

cybersec-scanner init-config --output my-config.yaml

Example my-config.yaml:

scanner:
  git:
    enabled: true
    root: "."
    max_commits: 100
  
  web:
    enabled: true
    target: "http://localhost:8000"
    max_pages: 300
  
  mitm:
    enabled: true
    traffic_file: "mitm_traffic.ndjson"
  
  runtime:
    enabled: false

output:
  file: "audit_report.json"

rag:
  enabled: true
  model: "gemma3:1b"

Usage:

cybersec-scanner scan --config my-config.yaml

Utility Commands

Build Knowledge Graph:

cybersec-scanner build-graph AUDIT_FILE [OPTIONS]

# Options:
#   --output PATH, -o    Output graph file (default: rag/graph.gpickle)

# Example:
cybersec-scanner build-graph audit_report.json --output my_graph.gpickle

Initialize Config File:

cybersec-scanner init-config [OPTIONS]

# Options:
#   --output PATH, -o    Output config file path (default: cybersec-config.yaml)

# Example:
cybersec-scanner init-config --output my-config.yaml

Show Version:

cybersec-scanner version

Install MITM Certificate:

cybersec-scanner install-cert [OPTIONS]

# Options:
#   --port PORT          MITM proxy port (informational, default: 8082)
#   --no-download        Skip HTTP download, use local cert only

# Example:
cybersec-scanner install-cert --port 8082

Start MITM Proxy:

cybersec-scanner start-proxy [OPTIONS]

# Options:
#   --port PORT          Proxy listen port (default: 8082)
#   --traffic-file PATH  Traffic log file path (default: temp dir auto-shared)

# Example:
cybersec-scanner start-proxy --port 9000 --traffic-file ./my_traffic.ndjson

📋 Required Files

IMPORTANT: Before running scans, make sure the following file is in your working directory (the directory where you run the scanner):

1. patterns.env (REQUIRED)

This file contains regex patterns for detecting secrets. Copy it from the repository root:

# If you installed from source
cp patterns.env /your/project/directory/

# If you installed from PyPI, download from GitHub
curl -o patterns.env https://raw.githubusercontent.com/AnubhavChoudhery/cybersec-scanner/main/patterns.env

The file includes 58+ detection patterns for major providers:

AWS_ACCESS_KEY_ID=AKIA[0-9A-Z]{16}
OPENAI_API_KEY=sk-[a-zA-Z0-9]{20,}
STRIPE_SECRET_KEY=sk_live_[0-9a-zA-Z]{24,}
GITHUB_TOKEN=ghp_[0-9a-zA-Z]{36}
# ... and 54 more patterns

Security Note: This file is excluded from git by default to avoid triggering security scanners. Never commit actual secrets to this file!

Architecture

Chrome_Ext/
├── local_check.py              # Main orchestrator
├── config.py                   # Configuration and patterns
├── utils.py                    # Utility functions
├── patterns.env                # Secret detection patterns (user-configured)
├── inject_mitm_proxy.py        # MITM proxy injection module
├── install_mitm_cert.py        # Certificate installation helper
├── scanners/
│   ├── git_scanner.py         # Git history analysis
│   ├── web_crawler.py         # HTTP endpoint scanning
│   ├── browser_scanner.py     # Playwright runtime inspection
│   └── network_scanner.py     # MITM proxy traffic analysis
└── audit_report.json          # Output report (generated)

Installation

System Requirements

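Python 3.8 or higher for this script-based workflow (the PyPI package requires Python 3.11 or higher, as listed earlier)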
Git (for git history scanning)
mitmproxy 10.0+ (for HTTPS inspection)
Modern web browser (for Playwright scanner)

Required Dependencies

pip install -r requirements.txt

If requirements.txt is not available, install manually:

pip install requests colorama

Optional Dependencies

For HTTPS Traffic Inspection

# Install mitmproxy
pip install mitmproxy

# Verify installation
mitmdump --version

For Browser Runtime Inspection

pip install playwright
python -m playwright install

For Network Packet Capture (Advanced)

pip install scapy

# Windows: Install Npcap from https://npcap.com/
# Linux/Mac: May require libpcap

Quick Start

Initial Setup

Clone or download the repository
Set up pattern file (REQUIRED before first run)

# Copy the patterns file template
cp patterns.env.example patterns.env

# The file includes 58+ detection patterns for major providers
# Edit patterns.env to customize or add patterns (optional)

Verify setup

python -c "from config import KNOWN_PATTERNS; print(f'Loaded {len(KNOWN_PATTERNS)} patterns')"

Expected output: Loaded 58 patterns (or similar)

Basic Usage

# Scan with default settings
python local_check.py --target http://localhost:8000 --root /path/to/project

# View the generated audit report
cat audit_report.json

MITM Proxy Setup

The MITM (Man-in-the-Middle) proxy feature allows inspection of HTTPS traffic in real-time, including request/response headers and bodies.

Prerequisites

Install mitmproxy

pip install mitmproxy

# Verify installation
mitmdump --version

Copy required files to your backend

# From the Chrome_Ext directory
cp inject_mitm_proxy.py /path/to/your/backend/app/
cp patterns.env /path/to/your/backend/app/

Backend Integration

CRITICAL: Add the import statement as the VERY FIRST LINE of your main application file. This is not optional - the import MUST come before any other imports (Flask, FastAPI, Django, etc.) for the MITM proxy to properly intercept HTTP libraries.

The inject_mitm_proxy module automatically:

Starts a proxy server on port 8082 (configurable via MITM_PROXY_PORT)
Patches HTTP libraries (requests, httpx, urllib, urllib3, aiohttp) to route through proxy
Inspects all outbound HTTP/HTTPS traffic for security issues
Logs traffic to mitm_traffic.ndjson in the same directory
Bypasses specific domains (AWS, OAuth, AI providers) to prevent authentication issues

For FastAPI:

# backend/app/main.py
import inject_mitm_proxy  # MUST BE FIRST IMPORT (before FastAPI, before everything!)

from fastapi import FastAPI  # This comes AFTER inject_mitm_proxy
from fastapi.middleware.cors import CORSMiddleware
# ... rest of your imports

app = FastAPI()
# ... rest of your code

For Flask:

# backend/app.py
import inject_mitm_proxy  # MUST BE FIRST IMPORT (before Flask, before everything!)

from flask import Flask  # This comes AFTER inject_mitm_proxy
from flask_cors import CORS
# ... rest of your imports

app = Flask(__name__)
# ... rest of your code

For Django:

# backend/manage.py or wsgi.py
import inject_mitm_proxy  # MUST BE FIRST IMPORT (before Django, before everything!)

import os  # This comes AFTER inject_mitm_proxy
from django.core.wsgi import get_wsgi_application
# ... rest of Django setup

Why FIRST import matters: The module patches HTTP libraries at import time. If Flask/FastAPI/Django import first, their HTTP clients won't be patched, and traffic won't be intercepted.
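
The import-order effect can be reproduced with a toy stand-in. The sketch below does not use inject_mitm_proxy or any real HTTP library: fake_http and patch_for_proxy are hypothetical names, and whether the real module patches in exactly this way is an assumption. It only shows the general Python behavior that a reference taken before a patch keeps pointing at the original function.

```python
import types

# Stand-in for an HTTP library; not a real module.
fake_http = types.SimpleNamespace(get=lambda url: f"direct:{url}")

# A framework imported *before* patching keeps a reference to the original,
# much like `from fake_http import get` would.
early_get = fake_http.get


def patch_for_proxy(module):
    """Replace module.get with a wrapper that marks traffic as proxied."""
    original = module.get
    module.get = lambda url: original(url).replace("direct", "proxied")


patch_for_proxy(fake_http)

# Code that looks the function up *after* patching sees the wrapper.
late_get = fake_http.get

print(early_get("http://example.test"))  # direct:http://example.test
print(late_get("http://example.test"))   # proxied:http://example.test
```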

Running with MITM Proxy

Start your backend application

# No environment variables needed - proxy is always enabled
# Just start your backend normally
uvicorn app.main:app --reload  # FastAPI example

You should see:

[MITM] Proxy active on http://127.0.0.1:8082
[MITM] Bypass mode: AWS, OAuth, AI providers, payments, CDNs
[MITM] Patched libraries: requests, httpx, urllib, urllib3, aiohttp

Run the security scanner

# In a new terminal, run the scanner with MITM enabled
python local_check.py \
  --target http://localhost:8000 \
  --enable-mitm \
  --mitm-port 8082

Interact with your application (make HTTP requests, use API endpoints, etc.)
Stop the scanner (Ctrl+C) to generate the audit report
Review results

# View audit report
cat audit_report.json

# View traffic log (raw NDJSON)
cat mitm_traffic.ndjson

MITM Proxy Detection Capabilities

The MITM proxy inspects both requests and responses for security issues:

Request-Side Detection:

Credentials embedded in URLs (user:pass@domain)
API keys in query parameters (?api_key=xxx)
Basic Authentication headers (base64 credentials)
API keys in Authorization headers (with context awareness)
Plaintext passwords in request bodies (excludes bcrypt/argon2 hashes)
Secrets matching any of the 58+ patterns

Response-Side Detection:

Secrets leaked in response headers
API keys in response bodies (JSON, HTML, JavaScript)
Credentials in error messages
Database connection strings in stack traces
Debug information containing sensitive data

Severity Levels:

CRITICAL: API keys in URLs, credentials over HTTP, plaintext passwords
HIGH: API keys in headers over HTTPS (with expected auth disclaimer)
INFO: Normal traffic logging (not a security issue)
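
Two of the request-side checks listed above (credentials embedded in a URL and API keys in query parameters) can be sketched with the standard library. This is an illustration under assumptions, not the scanner's implementation: SENSITIVE_PARAMS, check_url, the finding type names, and the CRITICAL rating for URL credentials are all hypothetical, and the real tool matches against its pattern file.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical parameter names; the real scanner relies on its pattern file.
SENSITIVE_PARAMS = {"api_key", "apikey", "token", "access_token", "secret"}


def check_url(url: str) -> list[dict]:
    """Return findings for credentials or keys exposed in a URL."""
    parts = urlsplit(url)
    findings = []
    # user:pass@host style credentials
    if parts.username or parts.password:
        findings.append({"type": "credentials_in_url", "severity": "CRITICAL"})
    # ?api_key=xxx style query parameters
    for name in parse_qs(parts.query):
        if name.lower() in SENSITIVE_PARAMS:
            findings.append(
                {"type": "api_key_in_query", "severity": "CRITICAL", "param": name}
            )
    return findings


print(check_url("https://user:pass@example.com/data?api_key=xxx"))
```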

MITM Proxy Configuration

The inject_mitm_proxy.py module works automatically when imported. The only optional configuration is:

# Set custom MITM proxy port (default: 8082)
export MITM_PROXY_PORT=9000

No other environment variables needed - the proxy runs in full mode by default with intelligent domain bypass.

Domain Bypass Configuration

By default, the following domains bypass the MITM proxy to prevent authentication and SSL issues:

OAuth Providers:

accounts.google.com, oauth2.googleapis.com, login.microsoftonline.com

AI Providers:

api.openai.com, openai.com
api.anthropic.com, anthropic.com
api.groq.com, groq.com
api.mistral.ai, mistral.ai
api-inference.huggingface.co, huggingface.co
api.cohere.ai, replicate.com, together.xyz, anyscale.com, perplexity.ai

AWS Services:

All *.amazonaws.com domains
API Gateway, Lambda, S3, CloudFront

Payment Providers:

stripe.com, paypal.com

CDNs:

cloudflare.com, cloudfront.net

Localhost:

127.0.0.1, localhost

To modify bypass rules, edit the BYPASS_DOMAINS and AWS_SUFFIXES sets in inject_mitm_proxy.py.
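
A bypass decision based on those two sets might look like the sketch below. The set names come from the text above, but their contents here are a small illustrative subset, and the matching rules (exact host, subdomain of a listed domain, AWS suffix) are assumptions rather than the module's confirmed logic. The function name should_bypass is hypothetical.

```python
# Illustrative subsets of the domains listed above.
BYPASS_DOMAINS = {"accounts.google.com", "api.openai.com", "stripe.com", "localhost", "127.0.0.1"}
AWS_SUFFIXES = {".amazonaws.com"}


def should_bypass(host: str) -> bool:
    """Return True if traffic to `host` should skip the proxy."""
    host = host.lower().rstrip(".")
    if host in BYPASS_DOMAINS:
        return True
    # Treat subdomains of a listed domain as bypassed too (an assumption).
    if any(host.endswith("." + domain) for domain in BYPASS_DOMAINS):
        return True
    return any(host.endswith(suffix) for suffix in AWS_SUFFIXES)


for name in ("s3.us-east-1.amazonaws.com", "api.stripe.com", "example.com"):
    print(name, should_bypass(name))
```

Matching on a leading dot keeps lookalike hosts such as notstripe.com from being bypassed by accident.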

Uninstalling MITM Proxy

To remove MITM proxy from your backend:

Remove or comment out the import:

# import inject_mitm_proxy  # Disabled

Restart your backend application

The proxy is only active when the module is imported.

Configuration

Pattern File (patterns.env)

The patterns.env file contains regular expressions for detecting secrets. This file is excluded from version control to prevent triggering GitHub security alerts.

Format:

PATTERN_NAME=regex_pattern

Adding custom patterns:

# Edit patterns.env
nano patterns.env

# Add your pattern
MY_CUSTOM_KEY=mykey_[0-9a-f]{32}

# Re-run the scanner so it loads the new pattern
python local_check.py --target http://localhost:8000
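
To check a custom pattern before running a full scan, the PATTERN_NAME=regex format can be parsed with a few lines of Python. This is a minimal sketch that assumes blank lines and lines starting with # are ignored; parse_patterns is a hypothetical helper, not the loader in config.py.

```python
import re


def parse_patterns(text: str) -> dict[str, re.Pattern]:
    """Parse NAME=regex lines, skipping blanks and # comments."""
    patterns = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, since a regex may contain "=" itself.
        name, sep, regex = line.partition("=")
        if not sep:
            continue  # not a NAME=regex line
        patterns[name.strip()] = re.compile(regex.strip())
    return patterns


sample = """
# custom patterns
MY_CUSTOM_KEY=mykey_[0-9a-f]{32}
AWS_ACCESS_KEY_ID=AKIA[0-9A-Z]{16}
"""
patterns = parse_patterns(sample)
text = "key = mykey_" + "0123456789abcdef" * 2
hits = [name for name, rx in patterns.items() if rx.search(text)]
print(hits)  # ['MY_CUSTOM_KEY']
```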

Configuration File (config.py)

Entropy Threshold:

ENTROPY_THRESHOLD = 3.5  # Shannon entropy for randomness detection
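
For reference, Shannon entropy in bits per character can be computed as below. This is a generic sketch of the formula; how the scanner tokenizes candidate strings before comparing them with ENTROPY_THRESHOLD is not described here, so shannon_entropy is a hypothetical stand-in.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 3.5  # value shown above


def shannon_entropy(s: str) -> float:
    """Shannon entropy of `s` in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    # Sum p * log2(1/p) over the distinct characters, with p = count / n.
    return sum((c / n) * math.log2(n / c) for c in Counter(s).values())


print(shannon_entropy("aaaaaaaa"))  # 0.0
print(shannon_entropy("abcd"))      # 2.0
print(shannon_entropy("abcd") > ENTROPY_THRESHOLD)  # False
```

A string of four distinct characters reaches only 2.0 bits, so a threshold of 3.5 requires noticeably more varied input, which is typical of random keys.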

File Exclusions:

EXCLUDE_SUFFIXES = {
    '.png', '.jpg', '.jpeg', '.gif', '.bmp', '.ico',
    '.zip', '.tar', '.gz', '.pdf', '.exe', '.dll'
}

Probe Paths (for web crawler):

PROBE_PATHS = [
    '/.env', '/.env.local', '/.env.production',
    '/.git/config', '/.git/HEAD',
    '/config.php.bak', '/backup.sql'
]
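
Each probe path is presumably resolved against the target's origin before it is requested. The sketch below shows one way to do that with urljoin; probe_urls is a hypothetical helper and the list is a subset of the paths above.

```python
from urllib.parse import urljoin

PROBE_PATHS = ["/.env", "/.git/config", "/backup.sql"]  # subset of the list above


def probe_urls(base: str, paths=PROBE_PATHS) -> list[str]:
    """Resolve each probe path against the target's origin."""
    return [urljoin(base, path) for path in paths]


for url in probe_urls("http://localhost:8000"):
    print(url)
```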

Usage

Command-Line Options

python local_check.py [OPTIONS]

Core Options:

Option	Type	Default	Description
`--target`, `-t`	URL	`http://localhost:8000`	Target application URL
`--root`, `-r`	Path	`.`	Repository root for static analysis
`--out`, `-o`	Path	`audit_report.json`	Output report filename

Scanner Options:

Option	Type	Default	Description
`--depth`	Integer	`300`	Maximum pages to crawl
`--enable-playwright`	Flag	`False`	Enable browser runtime inspection
`--enable-pcap`	Flag	`False`	Enable packet capture (requires root)
`--pcap-timeout`	Integer	`12`	Packet capture duration (seconds)

MITM Proxy Options:

Option	Type	Default	Description
`--enable-mitm`	Flag	`False`	Enable MITM proxy for HTTPS inspection
`--mitm-port`	Integer	`8082`	MITM proxy port
`--mitm-duration`	Integer	`0`	Auto-stop after N seconds (0 = manual)
`--mitm-traffic`	Path	Auto-detect	Custom path to traffic NDJSON file

Usage Examples

Basic scan:

python local_check.py --target http://localhost:8000 --root /path/to/project

Full scan with all features:

python local_check.py \
  --target http://localhost:3000 \
  --root ~/myapp \
  --enable-playwright \
  --enable-mitm \
  --depth 500 \
  --out security_report.json

MITM-only scan (skip static/git):

python local_check.py \
  --target http://localhost:8000 \
  --enable-mitm \
  --mitm-duration 30

Custom traffic log location:

python local_check.py \
  --target http://localhost:8000 \
  --enable-mitm \
  --mitm-traffic /custom/path/to/traffic.ndjson

Scanner Modules

1. Git Scanner (`scanners/git_scanner.py`)

Analyzes git commit history for leaked secrets using efficient pickaxe search.

Features:

Searches git history for known secret patterns
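Uses git log -S<term> (pickaxe search) so that Git itself selects the commits that add or remove a term, instead of diffing every commit; this is described as roughly 100x faster than the naive approach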
Examines up to 100 commits by default (configurable)
Scans added lines in diffs for pattern matches

Configuration:

scan_git_history(root, max_commits=100)
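
The pickaxe search itself can be sketched with subprocess. This is an illustration of git log -S, not the code in scanners/git_scanner.py; pickaxe_command and commits_touching are hypothetical names. Note that --max-count caps the number of matching commits returned, which may differ from how the scanner applies max_commits.

```python
import subprocess


def pickaxe_command(term: str, max_commits: int = 100) -> list[str]:
    """Build a `git log -S` command that lists commits touching `term`."""
    return [
        "git", "log",
        f"-S{term}",                   # pickaxe: commits that change the count of `term`
        f"--max-count={max_commits}",  # cap on matching commits returned
        "--format=%H",                 # print only commit hashes
    ]


def commits_touching(root: str, term: str, max_commits: int = 100) -> list[str]:
    """Run the command in `root` and return matching commit hashes."""
    result = subprocess.run(
        pickaxe_command(term, max_commits),
        cwd=root, capture_output=True, text=True, check=True,
    )
    return result.stdout.split()


print(pickaxe_command("AKIA", max_commits=50))
```

Calling commits_touching(".", "AKIA") inside a repository would return the hashes whose diffs add or remove that string.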

2. Web Crawler (`scanners/web_crawler.py`)

Crawls web application endpoints to discover exposed sensitive paths and analyze client-side code.

Features:

Discovers exposed .env, .git/config, backup files
Analyzes JavaScript files for hardcoded secrets
Extracts and scans source maps
Checks HTTP headers and cookies for leaked secrets
Detects catch-all responses (false positives)
Multi-threaded crawling with process pool for regex scanning

Configuration:

crawler = LocalCrawler(
    base="http://localhost:8000",
    timeout=6,
    max_pages=300,
    workers=8,
    max_js_size=500_000  # Skip large JS bundles
)

3. Browser Scanner (`scanners/browser_scanner.py`)

Uses Playwright to inspect browser runtime state and client-side storage.

Features:

Extracts localStorage contents
Extracts sessionStorage contents
Retrieves all cookies
Checks global variables (window.__ENV, window.config, window.API_KEY)

Requirements:

pip install playwright
python -m playwright install

Usage:

playwright_inspect("http://localhost:8000")

4. Network Scanner (`scanners/network_scanner.py`)

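Runs a mitmproxy addon for proxy-level traffic inspection. This is the second inspection layer ("Layer 2"), complementing the in-process library patching done by inject_mitm_proxy.py ("Layer 1").

Features:

Intercepts HTTP/HTTPS traffic at the proxy level
Pattern matching on request/response bodies
Security header validation
Works alongside inject_mitm_proxy.py (Layer 1)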

Note: Most users will use inject_mitm_proxy.py for MITM inspection. This module provides additional addon-based analysis.

Output Format

Audit Report (audit_report.json)

{
  "timestamp": "2025-11-18T13:34:34.106644",
  "target": "http://localhost:8000",
  "stats": {
    "git_secrets": 0,
    "crawler_issues": 2,
    "browser_issues": 0,
    "mitm_proxied": 15,
    "mitm_bypassed": 3,
    "mitm_security_findings": 1
  },
  "severities": {
    "CRITICAL": 0,
    "HIGH": 1,
    "MEDIUM": 0,
    "LOW": 0,
    "INFO": 15
  },
  "findings": [
    {
      "type": "api_key_in_header",
      "severity": "HIGH",
      "timestamp": 1763494461,
      "timestamp_human": "2025-11-18 13:34:21",
      "description": "GROQ_API_KEY in Authorization header over HTTPS (expected for server-side API calls, review if unexpected)",
      "url": "https://api.groq.com/openai/v1/chat/completions",
      "client": "requests",
      "method": "post",
      "pattern": "GROQ_API_KEY",
      "header": "Authorization"
    }
  ]
}
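
A report in this shape is easy to post-process. The sketch below counts findings at or above a given severity, using only the findings and severity fields shown in the example; count_at_or_above and the severity ordering are assumptions made for illustration.

```python
import json


def count_at_or_above(report: dict, level: str) -> int:
    """Count findings whose severity is `level` or more severe."""
    order = ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"]
    threshold = order.index(level)
    return sum(
        1 for f in report.get("findings", [])
        if f.get("severity") in order and order.index(f["severity"]) >= threshold
    )


# Trimmed copy of the example report above.
report = json.loads("""
{"severities": {"CRITICAL": 0, "HIGH": 1},
 "findings": [{"type": "api_key_in_header", "severity": "HIGH"}]}
""")
print(count_at_or_above(report, "HIGH"))      # 1
print(count_at_or_above(report, "CRITICAL"))  # 0
```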

Traffic Log (mitm_traffic.ndjson)

NDJSON (newline-delimited JSON) format for append-only logging:

{"ts": 1763494398, "timestamp": "2025-11-18 13:33:18", "stage": "mitm_outbound", "client": "requests", "method": "post", "url": "https://api.example.com/endpoint"}
{"ts": 1763494461, "timestamp": "2025-11-18 13:34:21", "stage": "security_finding", "severity": "HIGH", "type": "api_key_in_header", "pattern": "GROQ_API_KEY", "description": "...", "url": "...", "client": "requests", "method": "post", "header": "Authorization"}

Stages:

mitm_outbound: Request sent through proxy
mitm_bypass: Request bypassed proxy (OAuth, AWS, etc.)
security_finding: Security issue detected
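
Because each line is an independent JSON object, the log can be summarized line by line. The following sketch tallies records by stage and skips lines that fail to parse, such as a partially written final line; count_stages is a hypothetical helper and the sample records are shortened versions of the ones above.

```python
import json
from collections import Counter


def count_stages(lines) -> Counter:
    """Tally NDJSON records by their "stage" field, skipping bad lines."""
    stages = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a partially written last line
        stages[record.get("stage", "unknown")] += 1
    return stages


sample = [
    '{"ts": 1763494398, "stage": "mitm_outbound", "url": "https://api.example.com/endpoint"}',
    '{"ts": 1763494461, "stage": "security_finding", "severity": "HIGH"}',
    '{"ts": 17634',
]
print(count_stages(sample))
```

The same function accepts an open file object, so a real log can be passed as count_stages(open_file) without loading it into memory first.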

Advanced Usage

Custom Pattern Detection

Create a custom pattern file:

# Create custom-patterns.env
cat > custom-patterns.env << 'EOF'
CUSTOM_API_KEY=custom_[0-9a-f]{32}
INTERNAL_TOKEN=int_tok_[A-Za-z0-9]{24}
EOF

# Edit config.py to load from custom file
# (Modify PATTERNS_FILE path in config.py)

Integrating with CI/CD

# .github/workflows/security-scan.yml
name: Security Audit
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - run: pip install -r requirements.txt
      - run: cp patterns.env.example patterns.env
      - run: python local_check.py --target http://localhost:8000 --root .
      - run: |
          if jq -e '.severities.CRITICAL > 0' audit_report.json; then
            echo "CRITICAL issues found!"
            exit 1
          fi

Programmatic Usage

from scanners import scan_git_history, LocalCrawler, playwright_inspect

# Git scanning
git_findings = scan_git_history("/path/to/repo", max_commits=100)

# Web crawling
crawler = LocalCrawler("http://localhost:8000", max_pages=200)
crawler.probe_common_paths()
crawler.crawl()
web_findings = crawler.findings

# Browser inspection
browser_data = playwright_inspect("http://localhost:8000")

# Combine results
all_findings = git_findings + web_findings

Troubleshooting

"No module named 'requests'"

pip install requests

"patterns.env not found"

cp patterns.env.example patterns.env

"playwright-not-installed"

pip install playwright
python -m playwright install

"MITM proxy not loading patterns"

Issue: Backend shows WARNING: patterns.env not found

Solution:

# Verify patterns.env is in the same directory as inject_mitm_proxy.py
ls -la /path/to/backend/app/patterns.env

# If missing, copy it
cp patterns.env /path/to/backend/app/

"MITM proxy not intercepting traffic"

Issue: No traffic logged in mitm_traffic.ndjson

Solutions:

Verify import is present and FIRST:

# In your application's entry point (Python):
import inject_mitm_proxy  # MUST BE FIRST
# ... other imports

# Then start the application from a terminal:
python app.py
# Should see: "[MITM] Proxy active on http://127.0.0.1:8082"

Check proxy port matches:

# Scanner
python local_check.py --enable-mitm --mitm-port 8082

# Backend
export MITM_PROXY_PORT=8082

"Permission denied during packet capture"

# Linux/Mac
sudo python local_check.py --enable-pcap

# Windows
# Run terminal as Administrator

"Git scan is very slow"

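This is normal for large repositories (100k+ commits). The tool limits the scan to 100 commits by default. With the cybersec-scanner CLI, pass --max-commits to change the limit; in the script-based workflow, adjust the call instead: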

# Modify scanners/git_scanner.py
scan_git_history(root, max_commits=50)  # Reduce commit limit

"Too many false positives"

Adjust entropy threshold in config.py:

ENTROPY_THRESHOLD = 4.0  # Higher = fewer false positives

Add exclusions for known patterns:

# In config.py
EXCLUDE_PATTERNS = [
    r'test_api_key_123',  # Test keys
    r'example\.com',      # Example domains
]
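
How such exclusions could be applied to candidate matches is sketched below. It assumes each entry is a regular expression searched within the matched text; is_excluded is a hypothetical helper, and the README does not show how config.py consumes EXCLUDE_PATTERNS.

```python
import re

EXCLUDE_PATTERNS = [
    r'test_api_key_123',  # Test keys
    r'example\.com',      # Example domains
]
_EXCLUDES = [re.compile(p) for p in EXCLUDE_PATTERNS]


def is_excluded(match_text: str) -> bool:
    """True if a candidate secret matches any exclusion regex."""
    return any(rx.search(match_text) for rx in _EXCLUDES)


candidates = ["test_api_key_123", "https://example.com/cb", "sk_live_not_a_real_key"]
print([c for c in candidates if not is_excluded(c)])  # ['sk_live_not_a_real_key']
```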

Filter by severity in audit report:

# Only show CRITICAL issues
jq '.findings[] | select(.severity == "CRITICAL")' audit_report.json

Security Considerations

Testing Your Own Applications Only

This tool is designed for security testing of applications you own or have explicit permission to test. Unauthorized scanning may violate laws and terms of service.

MITM Proxy Security

The MITM proxy disables SSL verification for testing purposes. This should only be used in development/testing environments, never in production.

Do NOT:

Use MITM proxy in production environments
Commit the inject_mitm_proxy import to production code
Share MITM proxy logs (may contain sensitive data)

Best Practices:

Use environment variables to control MITM activation
Keep mitm_traffic.ndjson and audit_report.json out of version control (add to .gitignore)
Review and sanitize audit reports before sharing
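
One way to follow the first practice is to guard the import in your own entry point. The sketch below is an assumption about how you might wire this up, since the module itself is described as always active once imported. The variable name ENABLE_MITM_PROXY and the helper mitm_enabled are hypothetical, and the sketch assumes that importing os and importlib first does not interfere with the patching.

```python
import importlib
import os

TRUTHY = {"1", "true", "yes", "on"}


def mitm_enabled(environ=None) -> bool:
    """Return True only when the opt-in variable is set to a truthy value."""
    environ = os.environ if environ is None else environ
    return environ.get("ENABLE_MITM_PROXY", "").strip().lower() in TRUTHY


# Keep this guard at the very top of the entry point, before framework imports.
if mitm_enabled():
    # Same effect as `import inject_mitm_proxy`, but only when opted in.
    importlib.import_module("inject_mitm_proxy")
```

With this guard, production deployments that never set the variable do not load the proxy even if the line is committed by mistake.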

Pattern File Security

The patterns.env file is excluded from version control by default (.gitignore) to avoid triggering GitHub security alerts on pattern signatures.

Do NOT:

Commit patterns.env to public repositories
Include actual secret values in pattern files
Share pattern files with untrusted parties

Version History

Version	Changes
1.0.5	Enhanced scan output with phase headers, fixed query output to be human-readable (extracts text from response)
1.0.4	Colored CLI output with colorama, `--output` flag for query command, simplified MITM traffic auto-sharing
1.0.3	Unified MITM traffic file location via temp directory, added colorama dependency
1.0.2	Fixed MITM traffic file path resolution bug
1.0.1	Initial PyPI release with full scanner suite

License

MIT License - See LICENSE file for details.

Contributing

Contributions are welcome! Please follow these guidelines:

Test your changes with multiple target applications
Update documentation for new features
Follow existing code style and structure
Add tests for new scanner modules
Ensure no secrets are committed in test files

Disclaimer

This tool is provided for lawful security testing only. Users are responsible for ensuring they have proper authorization before scanning any application. The authors assume no liability for misuse or unauthorized access.

Testing

Quick Test Commands

# Run all tests (auto-detects Ollama)
python run_tests.py

# Run all tests including LLM (requires Ollama)
python run_tests.py --all

# Fast tests only (no LLM)
python run_tests.py --fast

# With coverage report
python run_tests.py --coverage

# Specific test file
python run_tests.py --file retriever

Test Prerequisites

Core tests (no additional setup):

pip install pytest pytest-cov
pytest tests/ -v -k "not llm_client"

LLM tests (requires Ollama):

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh  # Linux/Mac
# Or download from https://ollama.com for Windows

# Pull model
ollama pull gemma3:1b

# Run all tests
pytest tests/ -v

Test Coverage

Component	Tests	Coverage
Knowledge Graph	PASS 1 test	100%
CWE Enrichment	PASS 1 test	100%
Database Normalizer	PASS 5 tests	95%
Graph Retriever	PASS 8 tests	100%
LLM Client	PASS 8 tests	85%
End-to-End Pipeline	PASS 2 tests	Full flow
Total	24 tests	~90%

See tests/README.md for detailed testing documentation.

Support

For issues, questions, or contributions:

Open an issue on GitHub
Review existing issues before creating new ones
Provide detailed information (OS, Python version, error messages, steps to reproduce)

Made by the JBAC EdTech Team (Jai Ansh Bindra and Anubhav Choudhery)

Release history

This version

1.0.5

Nov 27, 2025

0.1.0

Nov 27, 2025
