Local AI-safe redaction engine for security data

AI Safe Redaction Engine

A local-first, modular redaction system for security data before AI processing.

Single-Word CLI Name

This project is packaged as:

  • Command: redsafe
  • Package name: redsafe

Supported Inputs

  • Burp Suite XML exports
  • Burp project/session files (.burp) via best-effort binary HTTP extraction
  • Raw HTTP requests/responses
  • Network logs
  • Basic PCAP parsing
  • Screenshots (OCR + masking)

Features

  • Analysis-based sensitive data detection (not regex-only)
  • Named Entity Recognition (spaCy)
  • Secret detection with heuristics and entropy scoring
  • Context-aware detection from header/parameter names
  • Consistent placeholder mapping (<EMAIL_1>, <JWT_TOKEN_1>, etc.)
  • Local-only processing, no external API calls
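The consistent placeholder mapping listed above can be illustrated with a minimal sketch. It assumes detection has already produced (label, value) pairs; PlaceholderMapper and redact are hypothetical names chosen for this example and are not taken from the redsafe source.

```python
from collections import defaultdict


class PlaceholderMapper:
    """Hypothetical helper: maps each distinct sensitive value to a stable placeholder."""

    def __init__(self) -> None:
        self._counters = defaultdict(int)  # running count per label
        self._assigned = {}                # (label, value) -> placeholder

    def placeholder_for(self, label: str, value: str) -> str:
        key = (label, value)
        if key not in self._assigned:
            self._counters[label] += 1
            self._assigned[key] = f"<{label}_{self._counters[label]}>"
        return self._assigned[key]


def redact(text: str, findings: list, mapper: PlaceholderMapper) -> str:
    """Replace every (label, value) finding in text with its placeholder."""
    # Assign placeholders in discovery order so numbering follows first appearance.
    pairs = [(value, mapper.placeholder_for(label, value)) for label, value in findings]
    # Substitute longer values first so a value nested in a longer one is not clobbered.
    for value, placeholder in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
        text = text.replace(value, placeholder)
    return text
```

Reusing one mapper across several files keeps the same value on the same placeholder everywhere, so relationships between requests survive redaction.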

Project Structure

  • parsers/: input ingestion modules
  • detection/: entity, secret, entropy, and context detectors
  • redaction/: placeholder and redaction logic
  • core/: data models and orchestration pipeline
  • utils/: file/encoding helpers
  • tests/: sample files + pytest coverage
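To give a feel for what a context detector in detection/ might do, here is a rough sketch of flagging header values by header name. The hint list and both function names are assumptions made for illustration; the project's actual rules are not documented here.

```python
# Hypothetical name fragments that suggest a sensitive value; not the project's list.
SENSITIVE_NAME_HINTS = (
    "authorization", "cookie", "token", "secret",
    "password", "api-key", "apikey", "session",
)


def is_sensitive_name(name: str) -> bool:
    """Return True when a header or parameter name hints at sensitive content."""
    normalized = name.lower().replace("_", "-")
    return any(hint in normalized for hint in SENSITIVE_NAME_HINTS)


def flag_headers(raw_headers: str) -> list:
    """Return (name, value) pairs for header lines whose name looks sensitive."""
    flagged = []
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        # Lines without a colon (for example the request line) are skipped.
        if sep and is_sensitive_name(name.strip()):
            flagged.append((name.strip(), value.strip()))
    return flagged
```

Name-based context catches values such as short session IDs that would not stand out on entropy alone.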

Install With pipx (PyPI)

pipx install redsafe

Run:

redsafe --input tests/sample_burp.xml --type burp
redsafe --input tests/sample_http.txt --type http
redsafe --input tests/sample_log.txt --type log
redsafe --input tests/sample_image.png --type image

Upgrade:

pipx upgrade redsafe

Install From GitHub (Latest Source)

pipx install git+https://github.com/sam1101-sys/ai-redaction-engine.git

Install With venv (Alternative)

git clone https://github.com/sam1101-sys/ai-redaction-engine.git
cd ai-redaction-engine
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python -m spacy download en_core_web_sm

Run examples:

python main.py --input tests/sample_burp.xml --type burp
python main.py --input tests/sample_http.txt --type http
python main.py --input tests/sample_log.txt --type log
python main.py --input tests/sample_image.png --type image

Outputs are written to sanitized_output/.
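When several captures need sanitizing in one go, the CLI can be driven from a short script. The sketch below uses only the --input and --type flags shown in the examples, with the four type values that appear there; build_command and run_batch are hypothetical helper names, and the type values for .burp and PCAP inputs are not assumed.

```python
import subprocess

# Type values taken from the run examples; any others are deliberately not guessed.
KNOWN_TYPES = {"burp", "http", "log", "image"}


def build_command(path, input_type):
    """Build the argument list for one redsafe invocation."""
    if input_type not in KNOWN_TYPES:
        raise ValueError(f"unsupported type in this sketch: {input_type!r}")
    return ["redsafe", "--input", str(path), "--type", input_type]


def run_batch(jobs):
    """Run each (path, type) job in turn and return the exit codes."""
    # check=False lets the batch continue after a failing file.
    return [
        subprocess.run(build_command(path, input_type), check=False).returncode
        for path, input_type in jobs
    ]
```

For example, run_batch([("tests/sample_http.txt", "http"), ("tests/sample_log.txt", "log")]) would process both sample files and return their exit codes, assuming redsafe is on the PATH.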

Redaction Tuning (False Positive Control)

Secret entropy detection can be tuned via environment variables:

export REDACTION_ENTROPY_THRESHOLD=4.2
export REDACTION_MIN_SECRET_LEN=24
export REDACTION_MIN_BASE64_LEN=28
export REDACTION_IGNORE_VALUES="application/x-www-form-urlencoded,text/plain"

These values are consumed by SecretDetectionConfig in detection/secret_detection.py.
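As a rough illustration of how such settings could drive an entropy check, the sketch below reads the four variables and applies a Shannon entropy threshold. EntropyTuning is a hypothetical stand-in, not the project's SecretDetectionConfig, and the fallback values are simply copied from the export example above; they may not match the real defaults.

```python
import math
import os
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class EntropyTuning:
    """Hypothetical stand-in for the project's SecretDetectionConfig."""
    entropy_threshold: float
    min_secret_len: int
    min_base64_len: int  # parsed for completeness; unused in this sketch
    ignore_values: frozenset

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        raw_ignore = env.get("REDACTION_IGNORE_VALUES", "")
        return cls(
            entropy_threshold=float(env.get("REDACTION_ENTROPY_THRESHOLD", "4.2")),
            min_secret_len=int(env.get("REDACTION_MIN_SECRET_LEN", "24")),
            min_base64_len=int(env.get("REDACTION_MIN_BASE64_LEN", "28")),
            ignore_values=frozenset(v.strip() for v in raw_ignore.split(",") if v.strip()),
        )


def shannon_entropy(value: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in Counter(value).values())


def looks_like_secret(value: str, cfg: EntropyTuning) -> bool:
    """Flag values that are long enough, not ignored, and high in entropy."""
    if value in cfg.ignore_values or len(value) < cfg.min_secret_len:
        return False
    return shannon_entropy(value) >= cfg.entropy_threshold
```

Raising the threshold or the minimum length reduces false positives on ordinary text, at the cost of missing shorter or more regular secrets.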

Run Tests

pytest -q

Notes

  • Designed for integration into a future AI pentesting engine.
  • All processing is local.
  • If en_core_web_sm is unavailable, regex/heuristic detection still works.
  • Image redaction requires locally installed opencv-python and pytesseract, plus the system tesseract binary.
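A quick pre-flight check for the optional pieces mentioned in these notes might look like the sketch below. optional_dependency_report is a hypothetical helper, not part of redsafe; it only checks whether the modules can be located and whether a tesseract executable is on the PATH.

```python
import importlib.util
import shutil


def optional_dependency_report() -> dict:
    """Report which optional NER and image-redaction dependencies are present."""

    def has_module(name: str) -> bool:
        # find_spec locates a top-level module without importing it.
        return importlib.util.find_spec(name) is not None

    return {
        "spacy": has_module("spacy"),
        "en_core_web_sm": has_module("en_core_web_sm"),  # spaCy model package
        "cv2": has_module("cv2"),                        # provided by opencv-python
        "pytesseract": has_module("pytesseract"),
        "tesseract": shutil.which("tesseract") is not None,  # system binary
    }
```

Any False entry points to the component to install before relying on NER or image redaction.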
