
Pseudonymise patient identifiers and PII in text (and restore them) — a local, dependency-free pattern engine.


redacta

Pseudonymise patient identifiers and PII in text — and restore them. A local, dependency-free Python pattern engine.

pip install redacta

from redacta import redact, reinstate

redacted, report, token_map = redact(
    "Dear patient, NHS Number: 943 476 5919, tel 0113 278 4532."
)
# redacted -> "Dear patient, NHS Number: [NHS_NUMBER_1], tel [PHONE_1]."

original = reinstate(redacted, token_map)
# original -> "Dear patient, NHS Number: 943 476 5919, tel 0113 278 4532."
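The round trip above relies on a reversible mapping between tokens and original values. The following is a minimal stand-alone sketch of that idea, not redacta's actual implementation: the helper names `pseudonymise` and `restore`, the phone regex, and the token-to-value layout of the map are all assumptions made for illustration.

```python
import re


def pseudonymise(text, pattern, label):
    """Replace each regex match with a numbered token (hypothetical helper).

    Returns the rewritten text and a token -> original value map.
    """
    value_to_token = {}

    def substitute(match):
        value = match.group(0)
        # Reuse the existing token when the same value appears again.
        if value not in value_to_token:
            value_to_token[value] = f"[{label}_{len(value_to_token) + 1}]"
        return value_to_token[value]

    redacted = re.sub(pattern, substitute, text)
    token_map = {token: value for value, token in value_to_token.items()}
    return redacted, token_map


def restore(redacted, token_map):
    """Put the original values back using the token map (hypothetical helper)."""
    for token, value in token_map.items():
        redacted = redacted.replace(token, value)
    return redacted


sample = "Call 0113 278 4532 today, or 0113 278 4532 tomorrow."
# Illustrative pattern only; real phone detection needs far more cases.
redacted_sample, sample_map = pseudonymise(sample, r"\b0\d{3} \d{3} \d{4}\b", "PHONE")
print(redacted_sample)
print(restore(redacted_sample, sample_map) == sample)
```

Because the token map is the only link back to the original values, it should be stored as carefully as the unredacted text itself.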

What it detects

Deterministic patterns, checksum-validated where the identifier carries a check digit: NHS numbers (Modulus-11), UK National Insurance numbers, dates of birth (keyword-anchored, so appointment dates are left intact), UK postcodes, US SSNs and ZIP codes, hospital numbers/MRNs, emails, and phone numbers. The same value maps to the same token, and the token_map lets you reverse the substitution.
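For readers unfamiliar with the Modulus-11 check mentioned for NHS numbers, here is a short sketch of the standard algorithm: weight the first nine digits from 10 down to 2, sum them, and compare 11 minus the remainder (mod 11) against the tenth digit. The function name `is_valid_nhs_number` is hypothetical, and this shows the general check rather than redacta's own code.

```python
def is_valid_nhs_number(candidate: str) -> bool:
    """Return True if the string passes the NHS number Modulus-11 check."""
    compact = candidate.replace(" ", "").replace("-", "")
    if len(compact) != 10 or not (compact.isascii() and compact.isdigit()):
        return False

    digits = [int(ch) for ch in compact]
    # Weights 10, 9, ..., 2 apply to the first nine digits.
    total = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    if check == 10:
        # A check value of 10 means the number can never be valid.
        return False
    return check == digits[9]


print(is_valid_nhs_number("943 476 5919"))
print(is_valid_nhs_number("943 476 5918"))
```

A checksum like this lets a pattern engine reject most ten-digit strings that merely look like NHS numbers, which keeps false positives down.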

Scope: this library is the deterministic layer only. Names, postal addresses and identifying ages need contextual judgement and are not covered here; the Redacta agent skill and the MCP server add those via reasoning. The library uses only the Python standard library and makes no network calls. Review the output before sharing it.

CLI

redacta letter.txt                 # prints JSON: redacted_text, report, token_map
redacta letter.txt --text-only     # just the redacted text
redacta-reinstate redacted.txt --map token_map.json
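Since the default CLI output is a single JSON document while redacta-reinstate takes the redacted text and the token map as separate files, a small helper can split one into the other. The sketch below assumes the JSON is an object whose top-level keys are exactly redacted_text, report and token_map; the function name `split_cli_output` is hypothetical, and the report and token map are treated as opaque values because their inner structure is not described here.

```python
import json
from pathlib import Path


def split_cli_output(raw_json: str, out_dir: Path):
    """Write redacted.txt and token_map.json from the CLI's JSON output.

    Assumes top-level keys "redacted_text", "report" and "token_map".
    Returns the report unchanged.
    """
    payload = json.loads(raw_json)
    missing = {"redacted_text", "report", "token_map"} - payload.keys()
    if missing:
        raise KeyError(f"missing keys in CLI output: {sorted(missing)}")

    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "redacted.txt").write_text(
        payload["redacted_text"], encoding="utf-8"
    )
    (out_dir / "token_map.json").write_text(
        json.dumps(payload["token_map"], indent=2), encoding="utf-8"
    )
    return payload["report"]
```

The two files written here use the same names as the redacta-reinstate example above, so they can be passed to it directly.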

License

MIT-0 (MIT No Attribution). Built by PharmaTools.AI.

Download files


Source Distribution

redacta-1.1.0.tar.gz (8.8 kB view details)

Uploaded Source

Built Distribution


redacta-1.1.0-py3-none-any.whl (9.0 kB view details)

Uploaded Python 3

File details

Details for the file redacta-1.1.0.tar.gz.

File metadata

  • Download URL: redacta-1.1.0.tar.gz
  • Size: 8.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for redacta-1.1.0.tar.gz
Algorithm    Hash digest
SHA256       11d270c4b5d53a4e9ce3b615745c5e7598bac049a499559ba4872cc4a7f024ed
MD5          c49a7bc17e936e9292c8a7ab6fe28b8d
BLAKE2b-256  244f4df2432971d80d767838d7924a33ad33fb481f1cdb68b46e17324bca5875


File details

Details for the file redacta-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: redacta-1.1.0-py3-none-any.whl
  • Size: 9.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for redacta-1.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       60d169416e8666771e51ec220426d7d7dc39964f3ebffcf5cddefce810af03a2
MD5          46ab57651bc20f4d75ca43feea0f204d
BLAKE2b-256  2d09693b84db83a1648667e3375a243ab76eaad8425a9532d8971dd5942c7e70

