Local AI-safe redaction engine for security data

AI Safe Redaction Engine

A local-first, modular redaction system that sanitizes security data before it is passed to AI processing.

Single-Word CLI Name

This project is packaged for pipx as:

  • Command: redsafe
  • Package name: redsafe

Other good single-word alternatives if you want to rename later:

  • safescrub
  • cloaknet
  • privashield

Supported Inputs

  • Burp Suite XML exports
  • Raw HTTP requests/responses
  • Network logs
  • PCAP files (basic parsing)
  • Screenshots (OCR + masking)

Features

  • Analysis-based sensitive data detection (not regex-only)
  • Named Entity Recognition (spaCy)
  • Secret detection with heuristics and entropy scoring
  • Context-aware detection from header/parameter names
  • Consistent placeholder mapping (<EMAIL_1>, <JWT_TOKEN_1>, etc.)
  • Local-only processing, no external API calls
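
The consistent placeholder mapping listed above can be illustrated with a minimal sketch. The class name `PlaceholderMap` and its method are hypothetical and are not taken from the project's redaction/ module; the sketch only shows the idea that the same value always receives the same placeholder, while new values of the same type get the next number.

```python
from collections import defaultdict


class PlaceholderMap:
    """Hypothetical helper: assign stable placeholders such as <EMAIL_1>."""

    def __init__(self) -> None:
        self._counters = defaultdict(int)  # running counter per label
        self._assigned = {}                # (label, value) -> placeholder

    def placeholder_for(self, label: str, value: str) -> str:
        key = (label, value)
        if key not in self._assigned:
            # First time this value is seen: allocate the next number.
            self._counters[label] += 1
            self._assigned[key] = f"<{label}_{self._counters[label]}>"
        return self._assigned[key]


mapping = PlaceholderMap()
first = mapping.placeholder_for("EMAIL", "john@example.com")
second = mapping.placeholder_for("EMAIL", "jane@example.com")
repeat = mapping.placeholder_for("EMAIL", "john@example.com")
print(first, second, repeat)
```

Keeping the mapping per run means a downstream AI tool can still tell that two occurrences refer to the same account or token without seeing the real value.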

Project Structure

  • parsers/: input ingestion modules
  • detection/: entity, secret, entropy, and context detectors
  • redaction/: placeholder and redaction logic
  • core/: data models and orchestration pipeline
  • utils/: file/encoding helpers
  • tests/: sample files + pytest coverage

Install With pipx (Recommended)

cd /home/gss/Desktop/Codex/ai-redaction-engine  # replace with the path to your local checkout
pipx install .

Run:

redsafe --input tests/sample_burp.xml --type burp
redsafe --input tests/sample_http.txt --type http
redsafe --input tests/sample_log.txt --type log
redsafe --input tests/sample_image.png --type image

To reinstall after local code changes:

pipx reinstall redsafe

Install With venv (Alternative)

cd /home/gss/Desktop/Codex/ai-redaction-engine  # replace with the path to your local checkout
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python -m spacy download en_core_web_sm

Run examples:

python main.py --input tests/sample_burp.xml --type burp
python main.py --input tests/sample_http.txt --type http
python main.py --input tests/sample_log.txt --type log
python main.py --input tests/sample_image.png --type image

Outputs are written to sanitized_output/.

Redaction Tuning (False Positive Control)

Entropy-based secret detection can be tuned via environment variables to reduce false positives:

export REDACTION_ENTROPY_THRESHOLD=4.2
export REDACTION_MIN_SECRET_LEN=24
export REDACTION_MIN_BASE64_LEN=28
export REDACTION_IGNORE_VALUES="application/x-www-form-urlencoded,text/plain"

These values are consumed by SecretDetectionConfig in detection/secret_detection.py.
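
As a rough illustration of how such tuning values might be read and applied, here is a minimal sketch. `EntropyTuning` is a hypothetical stand-in and does not reproduce the real SecretDetectionConfig; the fallback values are simply the example values from the export lines above, not confirmed defaults, and the exact decision rule in the project may differ.

```python
import math
import os
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class EntropyTuning:
    """Hypothetical stand-in; not the project's SecretDetectionConfig."""
    entropy_threshold: float
    min_secret_len: int
    min_base64_len: int  # parsed for completeness; unused in this sketch
    ignore_values: frozenset

    @classmethod
    def from_env(cls, env=None):
        env = os.environ if env is None else env
        ignore = env.get("REDACTION_IGNORE_VALUES", "")
        return cls(
            # Fallbacks mirror the README's example exports (assumed).
            entropy_threshold=float(env.get("REDACTION_ENTROPY_THRESHOLD", "4.2")),
            min_secret_len=int(env.get("REDACTION_MIN_SECRET_LEN", "24")),
            min_base64_len=int(env.get("REDACTION_MIN_BASE64_LEN", "28")),
            ignore_values=frozenset(
                v.strip() for v in ignore.split(",") if v.strip()
            ),
        )


def shannon_entropy(value: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    counts = Counter(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(value: str, cfg: EntropyTuning) -> bool:
    """Flag values that are long enough, not ignored, and high-entropy."""
    if value in cfg.ignore_values or len(value) < cfg.min_secret_len:
        return False
    return shannon_entropy(value) >= cfg.entropy_threshold
```

Raising the threshold or the minimum length makes the check stricter, which trades missed secrets for fewer false positives on ordinary values such as content types.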

Run Tests

pytest -q

Example: Burp XML Sanitization

Input Burp request headers:

Authorization: Bearer eyJhbGciOiJIUzI1Ni.eyJzdWIiOiIxMjM0NTYifQ.signature123456789
Cookie: sessionid=abcDEF1234567890

Sanitized output:

Authorization: Bearer <JWT_TOKEN_1>
Cookie: sessionid=<SESSION_COOKIE_1>
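
The header example above can be approximated with a short context-aware sketch that looks at the header name before deciding how to label the value. The function `redact_headers`, the `SESSION_NAMES` set, and the `BEARER_TOKEN` label are assumptions made for illustration; the project's own detectors in detection/ may work differently.

```python
import re

# Three dot-separated base64url-like segments, the usual shape of a JWT.
JWT_SHAPE = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")
# Hypothetical list of cookie names treated as session identifiers.
SESSION_NAMES = {"sessionid", "session", "sid", "phpsessid", "jsessionid"}


def redact_headers(lines):
    """Return header lines with bearer tokens and session cookies replaced."""
    counters, seen = {}, {}

    def placeholder(label, value):
        if (label, value) not in seen:
            counters[label] = counters.get(label, 0) + 1
            seen[(label, value)] = f"<{label}_{counters[label]}>"
        return seen[(label, value)]

    out = []
    for line in lines:
        name, sep, value = line.partition(":")
        key, value = name.strip().lower(), value.strip()
        if not sep:
            out.append(line)  # not a "Name: value" header line
        elif key == "authorization" and value.lower().startswith("bearer "):
            scheme, token = value.split(None, 1)
            label = "JWT_TOKEN" if JWT_SHAPE.fullmatch(token) else "BEARER_TOKEN"
            out.append(f"{name}: {scheme} {placeholder(label, token)}")
        elif key == "cookie":
            pairs = []
            for pair in value.split(";"):
                cname, eq, cvalue = pair.strip().partition("=")
                if eq and cname.lower() in SESSION_NAMES:
                    cvalue = placeholder("SESSION_COOKIE", cvalue)
                pairs.append(f"{cname}{eq}{cvalue}")
            out.append(f"{name}: " + "; ".join(pairs))
        else:
            out.append(line)
    return out


headers = [
    "Authorization: Bearer eyJhbGciOiJIUzI1Ni.eyJzdWIiOiIxMjM0NTYifQ.signature123456789",
    "Cookie: sessionid=abcDEF1234567890",
]
redacted = redact_headers(headers)
print("\n".join(redacted))
```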

Example: HTTP Sanitization

Input:

POST /login HTTP/1.1
Host: app.local
Content-Type: application/x-www-form-urlencoded

email=john@example.com&password=SuperSecret123

Sanitized output:

POST /login HTTP/1.1
Host: app.local
Content-Type: application/x-www-form-urlencoded

email=<EMAIL_1>&password=<PASSWORD_1>
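
A form-encoded body like the one above could be handled along these lines. This is a minimal sketch under stated assumptions: `redact_form_body`, the `PASSWORD_PARAMS` set, and the simple email pattern are hypothetical and only combine a parameter-name hint with a value-shape check.

```python
import re
from urllib.parse import unquote_plus

# Deliberately loose email shape; an assumption, not the project's pattern.
EMAIL_SHAPE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
# Hypothetical parameter names whose values are always treated as passwords.
PASSWORD_PARAMS = {"password", "passwd", "pwd", "pass"}


def redact_form_body(body: str) -> str:
    """Redact an application/x-www-form-urlencoded body field by field."""
    counters, seen = {}, {}

    def placeholder(label, value):
        if (label, value) not in seen:
            counters[label] = counters.get(label, 0) + 1
            seen[(label, value)] = f"<{label}_{counters[label]}>"
        return seen[(label, value)]

    fields = []
    for field in body.split("&"):
        name, eq, value = field.partition("=")
        if eq and value:
            if name.lower() in PASSWORD_PARAMS:
                # Context from the parameter name alone is enough here.
                value = placeholder("PASSWORD", value)
            elif EMAIL_SHAPE.fullmatch(unquote_plus(value)):
                # Decode first so john%40example.com is recognized too.
                value = placeholder("EMAIL", value)
        fields.append(f"{name}{eq}{value}")
    return "&".join(fields)


print(redact_form_body("email=john@example.com&password=SuperSecret123"))
```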

Example: Image Sanitization

  • OCR extracts text from the screenshot.
  • The detection engine flags sensitive values in the extracted text.
  • OCR boxes that match flagged values are masked in the output image.
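
The box-matching step above can be sketched without touching any image library. The input dictionary below follows the shape returned by `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)` (parallel lists under `text`, `left`, `top`, `width`, `height`); the function name `boxes_to_mask` and the `padding` parameter are hypothetical, and matching is exact per OCR word, so values that span several words would need extra handling.

```python
def boxes_to_mask(ocr, sensitive_values, padding=2):
    """Return (left, top, right, bottom) rectangles covering flagged words."""
    targets = {v for v in sensitive_values if v}
    rects = []
    for i, word in enumerate(ocr["text"]):
        if word.strip() not in targets:
            continue  # empty OCR cells and non-sensitive words are skipped
        left, top = ocr["left"][i], ocr["top"][i]
        right = left + ocr["width"][i]
        bottom = top + ocr["height"][i]
        # Pad slightly so glyph edges do not leak past the mask.
        rects.append((
            max(left - padding, 0),
            max(top - padding, 0),
            right + padding,
            bottom + padding,
        ))
    return rects


# Hand-written OCR result standing in for real pytesseract output.
ocr_result = {
    "text": ["User:", "john@example.com", ""],
    "left": [10, 60, 0],
    "top": [5, 5, 0],
    "width": [40, 120, 0],
    "height": [12, 12, 0],
}
mask_rects = boxes_to_mask(ocr_result, ["john@example.com"])
print(mask_rects)
```

With OpenCV available, each rectangle could then be filled in, for example with `cv2.rectangle(image, (l, t), (r, b), (0, 0, 0), thickness=-1)`.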

Notes

  • Designed for integration into a future AI pentesting engine.
  • All processing is local.
  • If en_core_web_sm is unavailable, regex/heuristic detection still works.
  • Image redaction requires opencv-python and pytesseract installed locally, plus the system tesseract binary.
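
The fallback noted above for a missing en_core_web_sm model might look roughly like this. `load_ner` and `detect` are hypothetical names and the email pattern is an illustrative stand-in for the project's regex/heuristic detectors; the sketch relies only on `spacy.load` raising `OSError` when a model is not installed.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def load_ner(model_name="en_core_web_sm"):
    """Return a spaCy pipeline, or None if spaCy or the model is missing."""
    try:
        import spacy
        return spacy.load(model_name)
    except (ImportError, OSError):
        return None


def detect(text, nlp=None):
    """Regex findings always run; NER findings are added when available."""
    findings = [("EMAIL", m.group()) for m in EMAIL_RE.finditer(text)]
    if nlp is not None:
        findings += [(ent.label_, ent.text) for ent in nlp(text).ents]
    return findings


regex_only = detect("Contact john@example.com for access.")
print(regex_only)
```

Calling `detect(text, nlp=load_ner())` would add named entities when the model is installed and quietly fall back to the regex results otherwise.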
