Data anonymization CLI tool

Lethe

Pseudo-anonymization CLI for structured files and SQL dumps. Detect and replace PII in CSV, TSV, plain text, and SQL dump files using Presidio and spaCy NER, with Faker-generated replacements that stay consistent across your dataset.

Lethe performs pseudo-anonymization (what the GDPR calls pseudonymisation): PII is replaced with realistic fake values, preserving data structure and relationships. This differs from true anonymization, which irreversibly removes personal data. See the Architecture docs for the full distinction and GDPR implications.
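
The consistency property can be illustrated with a minimal sketch. This is not Lethe's implementation: Lethe draws its replacements from Faker, while the hypothetical `ConsistentReplacer` class below only hands out numbered placeholders using the standard library. The point is that the same input always maps to the same fake value, so relationships between rows survive.

```python
from itertools import count


class ConsistentReplacer:
    """Hypothetical helper: maps each distinct real value to one fake value."""

    def __init__(self, prefix="Person"):
        self._prefix = prefix
        self._counter = count(1)
        self._mapping = {}  # real value -> fake value

    def replace(self, value):
        # Reuse the earlier replacement if this value was seen before.
        if value not in self._mapping:
            self._mapping[value] = f"{self._prefix} {next(self._counter):03d}"
        return self._mapping[value]


rows = [
    {"customer": "Alice Jansen", "order": "A-1"},
    {"customer": "Bob Smit", "order": "B-7"},
    {"customer": "Alice Jansen", "order": "A-2"},
]

replacer = ConsistentReplacer()
pseudonymized = [
    {**row, "customer": replacer.replace(row["customer"])} for row in rows
]

for row in pseudonymized:
    print(row)
```

Both of Alice's orders end up under the same placeholder, so the data can still be grouped by customer. In this sketch the mapping from real to fake values exists for as long as the replacer does, and anyone holding it could reverse the substitution. That reversibility is what separates pseudo-anonymization from true anonymization.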

Install

pip install lethe-cli
python -m spacy download en_core_web_trf

For a faster, lighter model, download the small English pipeline instead of the transformer one:

python -m spacy download en_core_web_sm

Usage

Anonymize

Replace detected PII with consistent fake values:

lethe anonymize data.csv -o anonymized.csv
lethe anonymize data.csv --model sm --threshold 0.7
lethe anonymize notes.txt -o clean.txt --locale nl_NL
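
As a rough illustration of what a detection threshold does, the sketch below assumes `--threshold` is a minimum confidence score for detected entities; the README does not spell this out, so treat it as an assumption. The `Detection` class and `redact` function are hypothetical and use only the standard library. They substitute entity-type tags rather than Faker values to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detection result: a character span, a label, and a score."""
    start: int
    end: int
    entity_type: str
    score: float


def redact(text, detections, threshold):
    # Keep only detections at or above the confidence threshold.
    kept = [d for d in detections if d.score >= threshold]
    # Replace from the end of the text so earlier offsets stay valid.
    for d in sorted(kept, key=lambda d: d.start, reverse=True):
        text = text[:d.start] + f"<{d.entity_type}>" + text[d.end:]
    return text


text = "Contact Jan de Vries in Utrecht."
detections = [
    Detection(8, 20, "PERSON", 0.85),
    Detection(24, 31, "LOCATION", 0.60),
]

print(redact(text, detections, threshold=0.7))
print(redact(text, detections, threshold=0.5))
```

Under this assumption, a higher threshold replaces fewer spans (fewer false positives, more missed PII), and a lower one replaces more.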

Multiply

Generate synthetic rows from an existing dataset:

lethe multiply data.csv --factor 5 -o expanded.csv
lethe multiply data.csv --factor 10 --sanitize --seed 42
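
The README does not describe how Lethe generates synthetic rows, so the following is only a sketch of the general idea of seeded, reproducible expansion. The hypothetical `multiply_rows` function assumes that a factor of N yields N times the original row count and that each column is resampled independently from the observed values. Both are assumptions for illustration.

```python
import random


def multiply_rows(rows, factor, seed=None):
    """Hypothetical sketch: build len(rows) * factor synthetic rows."""
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    columns = list(rows[0])
    synthetic = []
    for _ in range(len(rows) * factor):
        # Sample each column independently from the values seen in the input.
        synthetic.append({column: rng.choice(rows)[column] for column in columns})
    return synthetic


original = [
    {"city": "Utrecht", "plan": "basic"},
    {"city": "Leiden", "plan": "pro"},
    {"city": "Delft", "plan": "basic"},
]

expanded = multiply_rows(original, factor=5, seed=42)
print(len(expanded))
```

Running the function twice with the same seed gives identical output, which is the usual reason to expose a seed option. Sampling columns independently discards correlations between columns, so a real tool may well use a different strategy.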

Options

Run lethe anonymize --help or lethe multiply --help for the full list of options.

License

MIT

Download files

Source Distribution

lethe_cli-0.2.1.tar.gz (70.8 kB)

Uploaded Source

Built Distribution

lethe_cli-0.2.1-py3-none-any.whl (26.9 kB)

Uploaded Python 3

File details

Details for the file lethe_cli-0.2.1.tar.gz.

File metadata

  • Download URL: lethe_cli-0.2.1.tar.gz
  • Size: 70.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.2

File hashes

Hashes for lethe_cli-0.2.1.tar.gz
  • SHA256: 59692205099615fb8fc1a0c496e0644a03c67c77bcd7f7d211f5f32db046744d
  • MD5: 5e066eabe3c31423fff8d36fd7d3f4ff
  • BLAKE2b-256: 8e7db805c162e7ca906223641d22d0260566a759631ee6be670580f1b2174ae3

File details

Details for the file lethe_cli-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: lethe_cli-0.2.1-py3-none-any.whl
  • Size: 26.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.2

File hashes

Hashes for lethe_cli-0.2.1-py3-none-any.whl
  • SHA256: 3f250d92e03d238cacd8daa232d570c607ee1813da67fa8a517bc393397718d8
  • MD5: 23436fbe4216cbc7afd7d4f6d2893344
  • BLAKE2b-256: 0543a45bcc5b3068ef74b2d6c069fd849c17ff70819e17d2da139f3593bd3097
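
A downloaded file can be checked against the SHA256 digests listed above. The sketch below uses only the Python standard library. The names `PUBLISHED_SHA256`, `sha256_of`, and `verify_download` are hypothetical, and the digests are copied from this page.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digests as published for the 0.2.1 release files.
PUBLISHED_SHA256 = {
    "lethe_cli-0.2.1.tar.gz":
        "59692205099615fb8fc1a0c496e0644a03c67c77bcd7f7d211f5f32db046744d",
    "lethe_cli-0.2.1-py3-none-any.whl":
        "3f250d92e03d238cacd8daa232d570c607ee1813da67fa8a517bc393397718d8",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Compare a file's digest with the published one for its file name."""
    expected = PUBLISHED_SHA256[Path(path).name]
    return hmac.compare_digest(sha256_of(path), expected)
```

For example, `verify_download("lethe_cli-0.2.1.tar.gz")` returns `True` only if the file in the current directory matches the published digest. It raises `KeyError` for file names that are not in the table.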
