Transform recognized PII instances in a document
Project description
pii-transform
This package takes a source document, a collection of detected PII instances, and transforms the document by replacing the PII instances in the document with a different representation.
The type of substitution done is defined by transformation policies.
Note: pii-transform
does not implement or use Transformer models for PII
purposes (for the extraction of PII Instances using Transformer models, see
pii-extract-plg-transformers or pii-extract-plg-presidio).
Command-line scripts
The package provides three console scripts:
pii-transform
loads a source document & a collection of already-detected PII, and produces a transformed document following the required policies.pii-process
is a full end-to-end script:- loads a document, from among the formats supported by
pii-preprocess
- detects PII instances, according to
pii-extract
and its installed plugins - transforms the detected PII instances (according to the indicated policy) and writes out the transformed documennt
- loads a document, from among the formats supported by
pii-process-jsonl
is also a full end-to-end script; this one reads `JSONL files and processes each line as a separate text buffer (possibly in different languages), producing a transformed JSONL document
end-to-end installation
Note that pii-process
& pii-process-jsonl
will need additional packages
to be installed:
pii-preprocess
(only when usingpii-process
)pii-extract-base
, together with any desired detection plugins, e.g.pii-extract-plg-regex
,pii-extract-plg-transformers
, and/orpii-extract-plg-presidio
pii-decide
This installation can be performed explicitly, choosing the packages & plugins
to install. There is also an automatic dependency installation, which
installs a default set of packages, by adding the e2e
qualifier upon
installation of this package, i.e.:
pip install pii-transforme2e
... and this will install pii_preprocess
, pii-extract-base
,
pii-extract-plg-regex
, pii-extract-plg-transformers
and pii-decide
Note that you will also need to install Pytorch, so that the models used by
the pii-extract-plg-transformers
package can run. See the transformers
plugin documentation for more information,
API
The same functionality provided by the command-line scripts can also be accessed via a Python API
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.