
Detect PII in EDC projects


uv pip install edc-detect-pii

Or just run the tool directly:

uv run edc_detect_pii.py <OPTIONS>

So far the tool only looks for names.

The default regex matches any word written in capitals that is more than two letters long. A match may span several such words separated by spaces.
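As a rough illustration of the rule described above, here is a minimal sketch of such a pattern. The regex the tool actually uses is not shown in this description, so the pattern and the names CAPS_PATTERN and find_caps are assumptions made for this example only.

```python
import re

# Hypothetical pattern, NOT the tool's actual regex: one or more
# all-caps words of at least three letters, separated by single spaces.
CAPS_PATTERN = re.compile(r"\b[A-Z]{3,}(?: [A-Z]{3,})*\b")


def find_caps(text: str) -> list[str]:
    """Return every run of all-caps words found in text."""
    return CAPS_PATTERN.findall(text)


# "JOHN SMITH" is reported; "OK" is too short to match.
print(find_caps("seen by JOHN SMITH on day 3, result OK"))
```

A pattern like this will also flag ordinary capitalised terms such as codes or labels, which is presumably why the tool offers an exclude list.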

Two areas that are at risk of exposing PII are data migrations and Jupyter notebooks.

To run on migration files, clone the repo and pass a local path. For example:

uv run edc_detect_pii.py \
    --repo=/migrations \
    --exclude OTHER ABNORMAL NORMAL \
    --ext=py

To run on Jupyter notebooks, pass a local path to a folder that contains them:

uv run edc_detect_pii.py \
    --path=/my_notebooks \
    --exclude OTHER ABNORMAL NORMAL
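To make the notebook case concrete, the sketch below shows one way a scan with an exclude list could work. It is not the tool's implementation: the function scan_notebook, the stand-in pattern, and the choice to drop only matches that equal an excluded word exactly are all assumptions. It reads only each cell's source, so cell outputs would need separate handling.

```python
import json
import re
from pathlib import Path

# Hypothetical pattern standing in for the tool's default regex.
CAPS_PATTERN = re.compile(r"\b[A-Z]{3,}(?: [A-Z]{3,})*\b")


def scan_notebook(path: Path, exclude: set[str]) -> list[tuple[int, str]]:
    """Return (cell index, match) pairs for all-caps runs in cell sources."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        source = cell.get("source", "")
        # A notebook stores source as either a string or a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        for match in CAPS_PATTERN.findall(text):
            # Assumption: a match is skipped only if it equals an excluded word.
            if match not in exclude:
                hits.append((index, match))
    return hits
```

With this sketch, calling scan_notebook(Path("example.ipynb"), {"OTHER", "ABNORMAL", "NORMAL"}) on a notebook of your own would list the remaining capitalised runs together with the index of the cell they appear in.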

todo

  • allow a custom regex and additional regexes to be passed as arguments

  • consider a pre-commit hook that uses a config file of custom words to exclude
