Local AI-safe redaction engine for security data

AI Safe Redaction Engine

A local-first, modular redaction system for security data before AI processing.

Single-Word CLI Name

This project is packaged as:

  • Command: redsafe
  • Package name: redsafe

PyPI:

Supported Inputs

  • Burp Suite XML exports
  • Burp project/session files (.burp) via best-effort binary HTTP extraction
  • Raw HTTP requests/responses
  • Network logs
  • Basic PCAP parsing
  • Screenshots (OCR + masking)

Features

  • Analysis-based sensitive data detection (not regex-only)
  • Named Entity Recognition (spaCy)
  • Secret detection with heuristics and entropy scoring
  • Context-aware detection from header/parameter names
  • Consistent placeholder mapping (<EMAIL_1>, <JWT_TOKEN_1>, etc.)
  • Local-only processing, no external API calls
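The consistent placeholder mapping listed above can be illustrated with a minimal sketch. It assumes findings arrive as (label, value) pairs; the `PlaceholderMap` class and `redact` function are hypothetical names for illustration and are not taken from the redsafe source.

```python
from collections import defaultdict


class PlaceholderMap:
    """Hypothetical helper: gives each distinct sensitive value a stable placeholder."""

    def __init__(self) -> None:
        self._counters = defaultdict(int)  # next index per label, e.g. EMAIL -> 2
        self._assigned = {}                # (label, value) -> placeholder string

    def placeholder_for(self, label: str, value: str) -> str:
        key = (label, value)
        if key not in self._assigned:
            self._counters[label] += 1
            self._assigned[key] = f"<{label}_{self._counters[label]}>"
        return self._assigned[key]


def redact(text: str, findings: list, mapping: PlaceholderMap) -> str:
    # Assign placeholders in detection order so numbering follows the findings list.
    pairs = [(value, mapping.placeholder_for(label, value)) for label, value in findings]
    # Replace longer values first so a short value nested inside a longer one
    # does not break the longer match.
    for value, placeholder in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
        text = text.replace(value, placeholder)
    return text


mapping = PlaceholderMap()
print(redact(
    "From: a@x.io To: b@x.io Cc: a@x.io",
    [("EMAIL", "a@x.io"), ("EMAIL", "b@x.io")],
    mapping,
))
```

Because the same value always maps to the same placeholder, repeated occurrences stay correlated in the sanitized output without exposing the original value.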

Project Structure

  • parsers/: input ingestion modules
  • detection/: entity, secret, entropy, and context detectors
  • redaction/: placeholder and redaction logic
  • core/: data models and orchestration pipeline
  • utils/: file/encoding helpers
  • tests/: sample files + pytest coverage

Install With pipx (PyPI)

pipx install redsafe

Run:

redsafe --input tests/sample_burp.xml --type burp
redsafe --input tests/sample_http.txt --type http
redsafe --input tests/sample_log.txt --type log
redsafe --input tests/sample_image.png --type image

Upgrade:

pipx upgrade redsafe

Install From GitHub (Latest Source)

pipx install git+https://github.com/sam1101-sys/ai-redaction-engine.git

Install With venv (Alternative)

git clone https://github.com/sam1101-sys/ai-redaction-engine.git
cd ai-redaction-engine
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python -m spacy download en_core_web_sm

Run examples:

python main.py --input tests/sample_burp.xml --type burp
python main.py --input tests/sample_http.txt --type http
python main.py --input tests/sample_log.txt --type log
python main.py --input tests/sample_image.png --type image

Outputs are written to sanitized_output/.

Redaction Tuning (False Positive Control)

Secret entropy detection can be tuned via environment variables:

export REDACTION_ENTROPY_THRESHOLD=4.2
export REDACTION_MIN_SECRET_LEN=24
export REDACTION_MIN_BASE64_LEN=28
export REDACTION_IGNORE_VALUES="application/x-www-form-urlencoded,text/plain"

These values are consumed by SecretDetectionConfig in detection/secret_detection.py.
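To show how such variables might drive an entropy check, here is a minimal sketch. It is not the actual SecretDetectionConfig: the class `EntropyConfigSketch`, the function names, and the default values (copied from the example exports) are assumptions, and REDACTION_MIN_BASE64_LEN is left out because its exact semantics are not described here.

```python
import math
import os
from collections import Counter
from dataclasses import dataclass


def shannon_entropy(value: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in Counter(value).values())


@dataclass
class EntropyConfigSketch:
    # Assumed defaults: they mirror the example exports, not confirmed project defaults.
    entropy_threshold: float = 4.2
    min_secret_len: int = 24
    ignore_values: frozenset = frozenset()

    @classmethod
    def from_env(cls, env=os.environ):
        raw_ignore = env.get("REDACTION_IGNORE_VALUES", "")
        return cls(
            entropy_threshold=float(env.get("REDACTION_ENTROPY_THRESHOLD", "4.2")),
            min_secret_len=int(env.get("REDACTION_MIN_SECRET_LEN", "24")),
            ignore_values=frozenset(v.strip() for v in raw_ignore.split(",") if v.strip()),
        )


def looks_like_secret(value: str, cfg: EntropyConfigSketch) -> bool:
    # Skip allow-listed values and anything too short to be a plausible secret.
    if value in cfg.ignore_values or len(value) < cfg.min_secret_len:
        return False
    return shannon_entropy(value) >= cfg.entropy_threshold


cfg = EntropyConfigSketch.from_env()
print(looks_like_secret("application/x-www-form-urlencoded", cfg))
```

In a scheme like this, raising the threshold or the minimum length reduces false positives at the cost of missing shorter or less random secrets.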

Run Tests

pytest -q

Notes

  • Designed for integration into a future AI pentesting engine.
  • All processing is local.
  • If en_core_web_sm is unavailable, regex/heuristic detection still works.
  • Image redaction requires opencv-python and pytesseract installed locally, plus the system tesseract binary.
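The fallback behaviour noted above (regex/heuristic detection when en_core_web_sm is missing) could be structured roughly as follows. This is a sketch under stated assumptions: `load_ner_model`, `detect`, and the email pattern are hypothetical and do not reflect the project's actual code.

```python
import re

# Deliberately simple email pattern, used only for illustration.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def load_ner_model(name: str = "en_core_web_sm"):
    """Return a loaded spaCy pipeline, or None when spaCy or the model is missing."""
    try:
        import spacy
        return spacy.load(name)
    except (ImportError, OSError):
        # ImportError: spaCy is not installed. OSError: the model package was not found.
        return None


def detect(text: str, nlp=None) -> list:
    # The regex pass always runs, so detection still works without a model.
    findings = [("EMAIL", m.group()) for m in EMAIL_RE.finditer(text)]
    if nlp is not None:
        # Add named entities only when a spaCy pipeline is available.
        findings += [(ent.label_, ent.text) for ent in nlp(text).ents]
    return findings


print(detect("contact admin@example.com now"))
```

Calling `detect(text, load_ner_model())` would then use NER when the model is installed and silently degrade to the regex pass otherwise.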

Download files

Source Distribution

redsafe-0.1.2.tar.gz (17.6 kB)

Uploaded Source

Built Distribution

redsafe-0.1.2-py3-none-any.whl (19.1 kB)

Uploaded Python 3

File details

Details for the file redsafe-0.1.2.tar.gz.

File metadata

  • Download URL: redsafe-0.1.2.tar.gz
  • Upload date:
  • Size: 17.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11

File hashes

Hashes for redsafe-0.1.2.tar.gz:

  • SHA256: df6584426166374d2706fb7cc5e55ecb1c9e5c25fb1e70dbd9da10d7a6ba4c49
  • MD5: 4824f2866ac152a420d9e3568450d3b4
  • BLAKE2b-256: 9a904c0d1d0fd7f51409d7ca1d9ed8871ad756938fa3636f29fc229c94652939

File details

Details for the file redsafe-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: redsafe-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 19.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11

File hashes

Hashes for redsafe-0.1.2-py3-none-any.whl:

  • SHA256: 83a7ed61300c5b764b5c83fd5a578d16fc0666de33b2656b0b0d8d2edfae4417
  • MD5: 5083796dca60e683cf047eea9d4bb903
  • BLAKE2b-256: e3a228c4116c10623c26828091037bb6bbf320ed588a28ec41cdac20906a2eaf
