Data anonymization CLI tool

Lethe

Pseudo-anonymization CLI for data files and SQL dumps. Lethe detects PII in CSV, TSV, plain-text, and SQL dump files using Presidio and spaCy NER, and replaces it with Faker-generated values that stay consistent across your dataset.

Lethe performs pseudo-anonymization (what the GDPR calls pseudonymization): PII is replaced with realistic fake values, preserving data structure and relationships. This differs from true anonymization, which irreversibly removes personal data. See the Architecture docs for the full distinction and GDPR implications.
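To make "consistent" replacement concrete, here is a minimal sketch in plain Python. It is not Lethe's implementation: the `Pseudonymizer` class and the `Person-0001` style placeholders are hypothetical stand-ins for the Faker-generated values, used only to show why the same input always receives the same replacement.

```python
class Pseudonymizer:
    """Map each distinct original value to one stable replacement (illustrative only)."""

    def __init__(self, prefix="Person"):
        self.prefix = prefix
        self.mapping = {}  # original value -> replacement

    def replace(self, value):
        # First sighting: mint a new placeholder. Later sightings reuse it,
        # so rows that shared a value before still share one afterwards.
        if value not in self.mapping:
            self.mapping[value] = f"{self.prefix}-{len(self.mapping) + 1:04d}"
        return self.mapping[value]


p = Pseudonymizer()
rows = [
    ("Ada Lovelace", "order 1"),
    ("Alan Turing", "order 2"),
    ("Ada Lovelace", "order 3"),
]
anonymized = [(p.replace(name), note) for name, note in rows]
for row in anonymized:
    print(row)
```

Rows 1 and 3 end up with the same placeholder, so the relationship between them survives. Anyone holding `p.mapping` can reverse the substitution, which is why this counts as pseudonymization rather than anonymization.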

Install

pip install lethe-cli
python -m spacy download en_core_web_trf

For a faster, lighter model instead of the transformer:

python -m spacy download en_core_web_sm
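spaCy pipelines are installed as ordinary Python packages, so you can check which of the two models above is present before running Lethe. The helper below is a hypothetical convenience, not part of Lethe, and uses only the standard library:

```python
import importlib.util


def installed_models(candidates=("en_core_web_trf", "en_core_web_sm")):
    """Return the candidate spaCy model packages that can be imported."""
    return [
        name
        for name in candidates
        if importlib.util.find_spec(name) is not None
    ]


print(installed_models())
```

An empty list means neither model has been downloaded into the current environment yet.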

Usage

Anonymize

Replace detected PII with consistent fake values:

lethe anonymize data.csv -o anonymized.csv
lethe anonymize data.csv --model sm --threshold 0.7
lethe anonymize notes.txt -o clean.txt --locale nl_NL
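The sketch below illustrates what a confidence threshold typically does in a detect-and-replace pipeline. It assumes that `--threshold` is a minimum detector score, which this README does not confirm. The `Detection` class, the `redact` function, and the `<LABEL>` placeholders are all hypothetical; Lethe substitutes fake values rather than labels.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    start: int    # index of the first character of the span
    end: int      # index one past the last character
    label: str    # entity type, e.g. "PERSON"
    score: float  # detector confidence in [0, 1]


def redact(text, detections, threshold):
    """Replace every span scoring at or above threshold with a <LABEL> marker."""
    kept = [d for d in detections if d.score >= threshold]
    # Work right to left so earlier offsets stay valid after each replacement.
    for d in sorted(kept, key=lambda d: d.start, reverse=True):
        text = text[:d.start] + f"<{d.label}>" + text[d.end:]
    return text


text = "Call Ada at 555-0100"
found = [
    Detection(5, 8, "PERSON", 0.85),
    Detection(12, 20, "PHONE_NUMBER", 0.40),
]
print(redact(text, found, threshold=0.7))
print(redact(text, found, threshold=0.3))
```

Under this assumption, raising the threshold trades recall for precision: fewer false positives are replaced, but low-confidence PII may slip through.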

Multiply

Generate synthetic rows from an existing dataset:

lethe multiply data.csv --factor 5 -o expanded.csv
lethe multiply data.csv --factor 10 --sanitize --seed 42
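As a rough mental model only, seeded row multiplication can be sketched as below. This assumes that `--factor N` yields N times as many rows and that `--seed` makes the output reproducible; how Lethe actually synthesizes rows is not described here. The `multiply` function is hypothetical and simply resamples each column independently from the existing rows.

```python
import random


def multiply(rows, factor, seed=None):
    """Return len(rows) * factor synthetic rows by resampling each column."""
    rng = random.Random(seed)  # a private generator keeps seeded runs reproducible
    columns = list(rows[0].keys())
    synthetic = []
    for _ in range(len(rows) * factor):
        # Each cell is drawn from a randomly chosen source row, so values
        # are recombined across rows rather than copied row by row.
        synthetic.append({c: rng.choice(rows)[c] for c in columns})
    return synthetic


data = [
    {"city": "Utrecht", "plan": "basic"},
    {"city": "Leiden", "plan": "pro"},
]
expanded = multiply(data, factor=5, seed=42)
print(len(expanded))
```

Calling `multiply` twice with the same seed returns identical rows, which is the property a `--seed` option is usually meant to provide.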

Options

Run lethe anonymize --help or lethe multiply --help for the full list of options.

License

MIT

Download files

Source Distribution

lethe_cli-0.2.0.tar.gz (69.9 kB)

Uploaded Source

Built Distribution

lethe_cli-0.2.0-py3-none-any.whl (26.8 kB)

Uploaded Python 3

File details

Details for the file lethe_cli-0.2.0.tar.gz.

File metadata

  • Download URL: lethe_cli-0.2.0.tar.gz
  • Upload date:
  • Size: 69.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.2

File hashes

Hashes for lethe_cli-0.2.0.tar.gz
Algorithm Hash digest
SHA256 0900030744946efebc5ff143bf0d962ab48799a071ccf712a81678c6b76ede41
MD5 e586e8e8a119b440cc78c9676b6f209c
BLAKE2b-256 219ee5ece689dfbcec522ea4bc21e6134270e76d0d18bbbd4d888319fa5f51a5

File details

Details for the file lethe_cli-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: lethe_cli-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 26.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.2

File hashes

Hashes for lethe_cli-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0453dec17cd3e4cd6095bdca6faa6778492952e9a787cfbe75c9f91f014439a9
MD5 c84ad0a72ad7f8076878e2fa077404bb
BLAKE2b-256 d47f157928be2bcfc5eefea5525f4710bf1ab44539a514ec2607f8f4be13b2c3
