
Presidio image redactor package


Presidio Image Redactor

Please note: this package is still in alpha and is not production-ready.

Description

The Presidio Image Redactor is a Python-based module for detecting and redacting PII text entities in images.

Deploy Presidio image redactor to Azure

Use the following button to deploy the Presidio image redactor to your Azure subscription.

Deploy to Azure

Image Redactor Design

Installation

Pre-requisites:

  • Install Tesseract OCR by following the instructions on how to install it for your operating system.

    For now, the image redactor supports only Tesseract version 4.0.0.
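Because only one Tesseract version is supported, it can help to check what is installed before going further. The following is a minimal sketch, not part of the package: the helper names `parse_tesseract_version` and `installed_tesseract_version` are hypothetical, and it assumes the `tesseract` executable is on your PATH and responds to `tesseract --version`.

```python
import re
import shutil
import subprocess

# Version named in the prerequisite note above.
SUPPORTED_VERSION = "4.0.0"


def parse_tesseract_version(output):
    """Return the 'X.Y.Z' version found in `tesseract --version` output, or None."""
    match = re.search(r"tesseract\s+v?(\d+\.\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_tesseract_version():
    """Return the installed Tesseract version string, or None if it is not found."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        ["tesseract", "--version"], capture_output=True, text=True
    )
    # Some builds print the version banner to stderr, so check both streams.
    return parse_tesseract_version(result.stdout + result.stderr)
```

Comparing the result of `installed_tesseract_version()` with `SUPPORTED_VERSION` gives a quick indication of whether the local installation matches the prerequisite.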

As package:

To get started with Presidio-image-redactor, run the following:

pip install presidio-image-redactor

Once installed, run the following command to download the default spaCy model needed for Presidio Analyzer:

python -m spacy download en_core_web_lg

Getting started

The engine receives two parameters:

  1. The image to redact.
  2. The color fill to redact with. The default is black. It can be either an int or a tuple, e.g. (0, 0, 0).

from PIL import Image
from presidio_image_redactor import ImageRedactorEngine

# Get the image to redact using PIL lib (pillow)
image = Image.open("ocr_text.png")

# Initialize the engine
engine = ImageRedactorEngine()

# Redact the image with pink color
redacted_image = engine.redact(image, (255, 192, 203))

# Save the redacted image
redacted_image.save("new_image.png")
# Open the image for viewing
redacted_image.show()

As docker service:

In the folder presidio/presidio-image-redactor, run:

docker-compose up -d

HTTP API

redact

Receives an image and an optional color fill (default: black). Redacts the PII text in the image and returns a new redacted image.

POST /redact

Payload:

Sent as multipart/form-data. Contains the image file and a data field holding the required color fill.

{
  "data": "{'color_fill':'0,0,0'}"
}
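The data field is a single-quoted, dictionary-like string rather than strict JSON, and the color fill inside it is either one number or three comma-separated numbers. The sketch below shows one way such a value could be interpreted. It is an illustration only, not the service's actual parsing code, and the function name `parse_color_fill` is hypothetical.

```python
import ast


def parse_color_fill(data_field):
    """Turn a data field such as "{'color_fill':'0,0,0'}" into an int or a tuple."""
    # literal_eval safely reads the single-quoted dict without executing code.
    params = ast.literal_eval(data_field)
    # Assumption: a missing color_fill falls back to 0 (black).
    raw = str(params.get("color_fill", "0"))
    parts = [int(part) for part in raw.split(",")]
    return parts[0] if len(parts) == 1 else tuple(parts)
```

Under these assumptions, a value with one number yields an int, and a value with three numbers yields a tuple of three ints.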

Result:

200 OK

curl example:

# use ocr_test.png as the image to redact, and 255 as the color fill. 
# out.png is the new redacted image received from the server.
curl -XPOST "http://localhost:3000/redact" -H "content-type: multipart/form-data" -F "image=@ocr_test.png" -F "data=\"{'color_fill':'255'}\"" > out.png

A Python script example can be found under: /presidio/e2e-tests/tests/test_image_redactor.py
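For a quick client without extra dependencies, the same request as the curl example can be sketched with the Python standard library. This is a hedged illustration, not the script referenced above: the helper names are hypothetical, the URL and the form field names `image` and `data` are taken from the curl example, and the `application/octet-stream` part type is an assumption.

```python
import os
import urllib.request
import uuid


def color_fill_field(color_fill):
    """Format an int or an (R, G, B) tuple as the 'data' form field value."""
    if isinstance(color_fill, int):
        value = str(color_fill)
    else:
        value = ",".join(str(channel) for channel in color_fill)
    return "{'color_fill':'%s'}" % value


def build_multipart(image_name, image_bytes, color_fill):
    """Build a multipart/form-data body with a 'data' part and an 'image' part."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="data"\r\n\r\n'
        f"{color_fill_field(color_fill)}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{image_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + image_bytes + tail, f"multipart/form-data; boundary={boundary}"


def redact_via_http(image_path, out_path, color_fill=0,
                    url="http://localhost:3000/redact"):
    """POST the image to the service and write the returned image to out_path."""
    with open(image_path, "rb") as image_file:
        body, content_type = build_multipart(
            os.path.basename(image_path), image_file.read(), color_fill
        )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

With the docker service running, `redact_via_http("ocr_test.png", "out.png", 255)` would mirror the curl command above.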


Source Distributions

No source distribution files are available for this release.

Built Distribution

presidio_image_redactor-0.0.2-py3-none-any.whl (8.8 kB view details)

Uploaded Python 3

File details

Details for the file presidio_image_redactor-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: presidio_image_redactor-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 8.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.8

File hashes

Hashes for presidio_image_redactor-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 14a27cefab0c943d9309a85a2b5848b81a65414e0b7c840f37250c21b7aec46b
MD5 28cdcd88f664199b0491c9275bb93332
BLAKE2b-256 e79152bc18b8f0a58c7ed1a0b74df1f820f380a7caa648858323a9e0250ab410
