Use spaCy NER models to clean personally identifiable information from dirty dirty text.
Project description
scrubadub removes personally identifiable information from text. scrubadub_spacy is an extension that uses spaCy NLP models to remove personal information from text.
This package contains two extra detectors:
- scrubadub_spacy.detectors.SpacyEntityDetector - A detector that uses the spaCy NER model to find locations, names, dates and other entities.
- scrubadub_spacy.detectors.SpacyNameDetector - A detector that uses the spaCy NER model and context words to find names in text.
For more information on how to use this package see the scrubadub spacy documentation and the scrubadub repository.
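As a rough illustration of how one of these detectors might be plugged in, the sketch below adds the entity detector to a scrubadub Scrubber. It assumes that scrubadub, scrubadub_spacy and a spaCy model are installed, and that the entity detector class is spelled SpacyEntityDetector. The helper name build_scrubber is hypothetical and exists only for this example.

```python
def build_scrubber():
    """Return a scrubadub Scrubber with the spaCy entity detector added.

    Assumption: the detector is exposed as
    scrubadub_spacy.detectors.SpacyEntityDetector.
    """
    # Imported lazily so that defining this helper does not require
    # the packages (or a spaCy model) to be installed.
    import scrubadub
    import scrubadub_spacy

    scrubber = scrubadub.Scrubber()
    # add_detector accepts a detector class; the scrubber instantiates it.
    scrubber.add_detector(scrubadub_spacy.detectors.SpacyEntityDetector)
    return scrubber
```

Calling build_scrubber().clean(text) should then return the text with detected entities replaced by placeholders. Which entities are found depends on the spaCy model in use.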
New maintainers
LeapBeyond are excited to be supporting scrubadub with ongoing maintenance and development. Thanks to all of the contributors who made this package a success, but especially @deanmalmgren, IDEO and Datascope.
Download files
File details
Details for the file scrubadub_spacy-2.0.0.tar.gz.
File metadata
- Download URL: scrubadub_spacy-2.0.0.tar.gz
- Upload date:
- Size: 15.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 586071a2889446c2a7331f93eeaeb1e504468deee1313f4f9ec6029441ef5928 |
| MD5 | 959a0d8ed976dfe3e13204378a79943b |
| BLAKE2b-256 | fffd431c9cad14075f01a1f2ca007fdc71d76bc37e796d842a9d41200215eb35 |
File details
Details for the file scrubadub_spacy-2.0.0-py3-none-any.whl.
File metadata
- Download URL: scrubadub_spacy-2.0.0-py3-none-any.whl
- Upload date:
- Size: 15.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | eb50e2a5f6334b0c4c2d241e25001169a9f696e77cd3c4b8b5c5ea8bc6eb31d2 |
| MD5 | d066d05b3e4b5998984ea6ae3f0602a2 |
| BLAKE2b-256 | 2939e2ce91d8236f770ea556bff9c98c3f25d4e49e5457d214d14e05a226a797 |
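The SHA256 digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using only the Python standard library. The helper names sha256_of_file and verify_download are hypothetical, and the digests are copied from the file details on this page.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details for each distribution.
PUBLISHED_SHA256 = {
    "scrubadub_spacy-2.0.0.tar.gz":
        "586071a2889446c2a7331f93eeaeb1e504468deee1313f4f9ec6029441ef5928",
    "scrubadub_spacy-2.0.0-py3-none-any.whl":
        "eb50e2a5f6334b0c4c2d241e25001169a9f696e77cd3c4b8b5c5ea8bc6eb31d2",
}


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's SHA256 matches the digest recorded for its name."""
    path = Path(path)
    expected = PUBLISHED_SHA256.get(path.name)
    if expected is None:
        raise KeyError(f"No published digest recorded for {path.name}")
    return sha256_of_file(path) == expected
```

For example, verify_download("scrubadub_spacy-2.0.0.tar.gz") returns True only when the local file hashes to the published value. A False result suggests a corrupted or altered download.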