PteRedactyl is a tool for NER-based redaction of personally identifiable information (PII) in text.

PteRedactyl

PteRedactyl utilizes advanced natural language processing techniques to identify and anonymise personal information in clinical free text.

Developed by the Data & AI Research (DAIR) Unit at University Hospital Southampton NHSFT for use in clinical research, PteRedactyl wraps around swappable NER models to redact or hide PII in strings or DataFrames.

Features

  • Anonymisation of various entities such as names, locations, and phone numbers.
  • Support for processing both strings and pandas DataFrames.
  • Text highlighting for easy identification of anonymised sections.
  • Hide in plain sight (HIPS) replacement.
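
To illustrate the difference between plain redaction and HIPS replacement, here is a minimal standard-library sketch. It does not use PteRedactyl's API: the `replace_spans` function, the `SURROGATES` table, and the hand-written entity spans are all hypothetical, standing in for whatever an NER model would return. It assumes the spans do not overlap.

```python
from typing import Dict, List, Tuple

# (start, end, label) with an exclusive end offset. Hypothetical span format,
# standing in for the output of an NER model.
Span = Tuple[int, int, str]

# Hypothetical surrogate values used for hide-in-plain-sight replacement.
SURROGATES: Dict[str, str] = {
    "PERSON": "Alex Morgan",
    "LOCATION": "Springfield",
    "PHONE_NUMBER": "01632 960000",
}


def replace_spans(text: str, spans: List[Span], hips: bool = False) -> str:
    """Replace each span with a <LABEL> tag, or with a surrogate if hips=True."""
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, label in sorted(spans, reverse=True):
        tag = f"<{label}>"
        replacement = SURROGATES.get(label, tag) if hips else tag
        out = out[:start] + replacement + out[end:]
    return out


text = "Jane Doe from Leeds called on 07700 900123."
spans = [(0, 8, "PERSON"), (14, 19, "LOCATION"), (30, 42, "PHONE_NUMBER")]

print(replace_spans(text, spans))             # redacted with entity tags
print(replace_spans(text, spans, hips=True))  # replaced with surrogates
```

Redaction makes the removed entities obvious to a reader, whereas HIPS output still reads like natural text, so any entity the model missed is harder to tell apart from the surrogates.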

⚙️ Installation

Via PyPI

Execute:

pip install pteredactyl
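
After installing, one way to confirm that the package is visible to the current interpreter is to query its metadata with the standard library. This is a small sketch; the helper name `installed_version` is illustrative and not part of PteRedactyl.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None.
print(installed_version("pteredactyl"))
```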

Via GitHub (uv)

To install in development mode, we recommend using uv.

  1. Install uv from the Astral website, or install via PyPI with pip install uv

  2. Clone the PteRedactyl repo:

git clone https://github.com/SETT-Centre-Data-and-AI/pteredactyl.git

  3. Navigate to the repository (cd ...\pteredactyl\) and execute:

uv sync --group dev

📚 Guides

🤝 Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

⚖️ License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).

🧑‍🔬 Authors

PteRedactyl was developed by Cai Davis and Michael George at University Hospital Southampton NHSFT's Data & AI Research (DAIR) Unit, part of the Southampton Emerging Therapies and Technology (SETT) Centre.
