Use spaCy NER models to clean personally identifiable information from dirty dirty text.

Project description

scrubadub removes personally identifiable information from text. scrubadub_spacy is an extension that uses spaCy NLP models to remove personal information from text.

This package contains two extra detectors:

  • scrubadub_spacy.detectors.SpacyEntityDetector - A detector that uses the spaCy NER model to find locations, names, dates and other entities.

  • scrubadub_spacy.detectors.SpacyNameDetector - A detector that uses the spaCy NER model and context words to find names in text.
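
As a rough illustration of how one of these detectors might be attached to a scrubber, here is a minimal sketch. It assumes that scrubadub and scrubadub_spacy are installed, that a spaCy model is available, and that the entity detector is exposed as scrubadub_spacy.detectors.SpacyEntityDetector. The helper name scrub_with_spacy is hypothetical and exists only for this example.

```python
def scrub_with_spacy(text: str) -> str:
    """Return `text` with detected personal information replaced by placeholders.

    Hypothetical helper for illustration; it is not part of scrubadub.
    """
    # Third-party imports live inside the function so that this module can
    # still be imported on machines where the packages are not installed.
    import scrubadub
    import scrubadub_spacy

    # Start from scrubadub's default scrubber, then register the spaCy-based
    # entity detector (assumed class name, see the note above).
    scrubber = scrubadub.Scrubber()
    scrubber.add_detector(scrubadub_spacy.detectors.SpacyEntityDetector)
    return scrubber.clean(text)


# Example call (requires the packages and a spaCy model):
#   scrub_with_spacy("My name is Alex and I work in London.")
```

The same pattern should apply to SpacyNameDetector by passing that class to add_detector instead. Which entities are replaced depends on the spaCy model in use, so no expected output is shown here.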

For more information on how to use this package see the scrubadub spacy documentation and the scrubadub repository.

New maintainers

LeapBeyond are excited to be supporting scrubadub with ongoing maintenance and development. Thanks to all of the contributors who made this package a success, but especially @deanmalmgren, IDEO and Datascope.

Download files

Source Distribution

scrubadub_spacy-2.0.0.tar.gz (15.6 kB)

Uploaded Source

Built Distribution

scrubadub_spacy-2.0.0-py3-none-any.whl (15.8 kB)

Uploaded Python 3

File details

Details for the file scrubadub_spacy-2.0.0.tar.gz.

File metadata

  • Download URL: scrubadub_spacy-2.0.0.tar.gz
  • Size: 15.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for scrubadub_spacy-2.0.0.tar.gz
Algorithm    Hash digest
SHA256       586071a2889446c2a7331f93eeaeb1e504468deee1313f4f9ec6029441ef5928
MD5          959a0d8ed976dfe3e13204378a79943b
BLAKE2b-256  fffd431c9cad14075f01a1f2ca007fdc71d76bc37e796d842a9d41200215eb35
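
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library. The function names sha256_of and matches_expected are hypothetical, and the local file path in the usage comment is an assumption about where the archive was saved.

```python
import hashlib
import hmac

# SHA256 digest listed above for scrubadub_spacy-2.0.0.tar.gz.
EXPECTED_SHA256 = "586071a2889446c2a7331f93eeaeb1e504468deee1313f4f9ec6029441ef5928"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals `expected_hex`."""
    # Hex digests are case-insensitive, so normalise before comparing.
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Example call (path is an assumption about where the file was saved):
#   matches_expected("scrubadub_spacy-2.0.0.tar.gz", EXPECTED_SHA256)
```

The same check works for the wheel by substituting its file name and its listed SHA256 digest.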

File details

Details for the file scrubadub_spacy-2.0.0-py3-none-any.whl.

File metadata

  • Download URL: scrubadub_spacy-2.0.0-py3-none-any.whl
  • Size: 15.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for scrubadub_spacy-2.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       eb50e2a5f6334b0c4c2d241e25001169a9f696e77cd3c4b8b5c5ea8bc6eb31d2
MD5          d066d05b3e4b5998984ea6ae3f0602a2
BLAKE2b-256  2939e2ce91d8236f770ea556bff9c98c3f25d4e49e5457d214d14e05a226a797

