Data anonymization CLI tool

Lethe

Data anonymization CLI for structured files and SQL dumps. Scan for PII with compliance framework mapping, then anonymize using Presidio and spaCy NER with Faker-generated replacements that stay consistent across your dataset.

Lethe performs pseudonymization (sometimes called pseudo-anonymization): PII is replaced with realistic fake values, preserving data structure and relationships. This differs from true anonymization, which irreversibly removes personal data; under the GDPR, pseudonymized data still counts as personal data. See the Architecture docs for the full distinction and GDPR implications.
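
To make the idea of consistent replacement concrete, here is a minimal sketch using only the standard library. It is not Lethe's implementation: the `pseudonymize` function, the `FAKE_NAMES` pool, and the `secret` parameter are hypothetical stand-ins for what a generator such as Faker would provide.

```python
import hashlib

# Hypothetical replacement pool; a real tool would draw from a far larger
# generator such as Faker. With a pool this small, collisions are expected.
FAKE_NAMES = ["Alex Morgan", "Sam Rivera", "Jordan Lee", "Casey Kim", "Riley Novak"]


def pseudonymize(value: str, secret: str = "demo-secret") -> str:
    """Map a real value to a fake one; the same input always gives the same output."""
    digest = hashlib.sha256((secret + value).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(FAKE_NAMES)
    return FAKE_NAMES[index]


rows = [
    {"customer": "Ada Lovelace", "order": 1},
    {"customer": "Alan Turing", "order": 2},
    {"customer": "Ada Lovelace", "order": 3},
]

# Only the PII column changes; the row structure and other fields are kept.
masked = [{**row, "customer": pseudonymize(row["customer"])} for row in rows]
```

Rows 1 and 3 receive the same fake name, so joins and group-bys on that column still work. Anyone who holds the secret can recompute the mapping for a candidate value, which is why this kind of replacement is pseudonymization rather than anonymization.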

Install

pip install lethe-cli
python -m spacy download en_core_web_trf

For a faster, lighter model instead of the transformer:

python -m spacy download en_core_web_sm

Usage

Scan

Analyze a file for PII and map findings to compliance frameworks (GDPR, HIPAA, CCPA, PCI-DSS, SOX, FERPA). No data is modified.

lethe scan data.csv                    # quick scan with Rich report
lethe scan data.csv --json             # machine-readable JSON output
lethe scan data.csv --model trf        # more accurate, slower transformer model
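
As a rough illustration of what mapping findings to compliance frameworks can look like, the sketch below groups detected entity types by framework. The mapping table and the `group_by_framework` helper are assumptions made for this example; they are not Lethe's rule set, and the scan's real output format is not shown here.

```python
from collections import defaultdict

# Illustrative mapping only: not Lethe's actual rules and not legal advice.
FRAMEWORKS_BY_ENTITY = {
    "PERSON": ["GDPR", "CCPA"],
    "EMAIL_ADDRESS": ["GDPR", "CCPA"],
    "CREDIT_CARD": ["PCI-DSS", "GDPR"],
    "US_SSN": ["CCPA", "HIPAA"],
}


def group_by_framework(findings):
    """Group (column, entity_type) findings under each framework they may touch."""
    report = defaultdict(list)
    for column, entity_type in findings:
        for framework in FRAMEWORKS_BY_ENTITY.get(entity_type, []):
            report[framework].append(column)
    return dict(report)


# Hypothetical findings: which column contained which entity type.
findings = [
    ("email", "EMAIL_ADDRESS"),
    ("card_no", "CREDIT_CARD"),
    ("notes", "PERSON"),
]
report = group_by_framework(findings)
```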

Anonymize

Replace detected PII with consistent fake values:

lethe anonymize data.csv -o anonymized.csv
lethe anonymize data.csv --model sm --threshold 0.7
lethe anonymize notes.txt -o clean.txt --locale nl_NL
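
The sketch below shows one plausible reading of a confidence threshold: detections scoring below it are left alone, and the rest are replaced. It assumes an at-or-above comparison, and the `Detection` class and `apply_replacements` function are hypothetical names for this example, not Lethe's API.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    start: int    # inclusive character offset
    end: int      # exclusive character offset
    label: str    # entity type, e.g. "PERSON"
    score: float  # detector confidence between 0 and 1


def apply_replacements(text, detections, replacements, threshold=0.7):
    """Replace detections scoring at or above the threshold; leave the rest untouched."""
    kept = [d for d in detections if d.score >= threshold]
    # Work right to left so earlier offsets stay valid after each substitution.
    for d in sorted(kept, key=lambda d: d.start, reverse=True):
        original = text[d.start:d.end]
        fake = replacements.get(original, f"<{d.label}>")
        text = text[:d.start] + fake + text[d.end:]
    return text


text = "Contact Ada Lovelace at ada@example.com"
detections = [
    Detection(8, 20, "PERSON", 0.85),
    Detection(24, 39, "EMAIL_ADDRESS", 0.60),
]
result = apply_replacements(text, detections, {"Ada Lovelace": "Sam Rivera"})
```

With the threshold at 0.7 only the name is replaced; lowering it to 0.5 would also replace the email address. A higher threshold trades missed PII for fewer false positives.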

Multiply

Generate synthetic rows from an existing dataset:

lethe multiply data.csv --factor 5 -o expanded.csv
lethe multiply data.csv --factor 10 --sanitize --seed 42
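
For intuition about how a factor and a seed might interact, here is a deliberately simple sketch that resamples each column independently. It assumes the factor means "output rows = input rows × factor"; the `multiply_rows` function is hypothetical, and Lethe's actual generation method may differ.

```python
import random


def multiply_rows(rows, factor, seed=None):
    """Return len(rows) * factor synthetic rows by resampling each column independently."""
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    columns = list(rows[0].keys())
    pools = {col: [row[col] for row in rows] for col in columns}
    return [
        {col: rng.choice(pools[col]) for col in columns}
        for _ in range(len(rows) * factor)
    ]


rows = [
    {"city": "Utrecht", "plan": "basic"},
    {"city": "Leiden", "plan": "pro"},
]
expanded = multiply_rows(rows, factor=5, seed=42)
```

Independent resampling keeps each column's value distribution but discards correlations between columns, so it is only a starting point for realistic synthetic data.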

Options

Run lethe scan --help, lethe anonymize --help, or lethe multiply --help for the full list of options.

License

MIT
