This package censors sensitive data in images and extracts their contents. Named entity recognition (NER) is planned for a future release.
AGL Anonymizer
AGL Anonymizer is a Django-based API that wraps the AGL Anonymizer Pipeline for image processing. It anonymizes sensitive information in images using common German names, blurring, and OCR (Optical Character Recognition). The API is designed to make anonymization easy to integrate into other applications and to support privacy and compliance with data protection regulations.
Features of the pipeline
- Text Detection and Anonymization: Leverages advanced OCR techniques to detect and anonymize text within images, safeguarding sensitive information.
- Blurring Functionality: Includes customizable blurring options to obscure specific areas of an image, enhancing privacy.
- Image Saving: Efficiently saves processed images in the desired format while maintaining high-quality output.
- Extensive Format Support: Capable of handling various image and document formats for diverse applications.
- PDF
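To make the blurring feature concrete, here is a minimal pure-Python sketch of a box blur applied to one rectangular region of a grayscale image. It is an illustration only: the function name `blur_region`, the list-of-lists image representation, and the `box` layout are assumptions made for this example, and the pipeline itself most likely relies on an image library rather than code like this.

```python
def blur_region(pixels, box, radius=1):
    """Return a copy of a grayscale image with a box blur applied inside `box`.

    pixels: list of rows, each a list of ints (0-255).
    box: (top, left, bottom, right), with bottom and right exclusive.
    """
    height, width = len(pixels), len(pixels[0])
    top, left, bottom, right = box
    result = [row[:] for row in pixels]  # leave the input untouched

    for y in range(max(top, 0), min(bottom, height)):
        for x in range(max(left, 0), min(right, width)):
            # Average the neighbourhood, clipped at the image border.
            neighbours = [
                pixels[ny][nx]
                for ny in range(max(y - radius, 0), min(y + radius + 1, height))
                for nx in range(max(x - radius, 0), min(x + radius + 1, width))
            ]
            result[y][x] = sum(neighbours) // len(neighbours)
    return result
```

A larger `radius` averages over a wider neighbourhood and therefore hides more detail, which is the usual trade-off when obscuring text regions.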
Installation
To get started with AGL Anonymizer, follow these steps:
- Clone the Repository:

      git clone https://github.com/wg-lux/agl_anonymizer.git
      cd agl_anonymizer

- Set Up the Development Environment:

      nix develop

- Install Dependencies: Ensure you have all required dependencies installed. Refer to pypoetry.toml for a list of dependencies.
- Download the Text Detection Model: Download a text detection model, such as frozen_east_text_detection.pb, and place it in the appropriate directory.
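Because the model file is downloaded separately, it can help to check that it is in place before starting the server. The following is a small sketch using only the standard library; the helper name `find_model` is hypothetical, and the directory you pass in is an assumption, since the expected location is not specified here.

```python
from pathlib import Path

# File name taken from the installation step above.
MODEL_FILENAME = "frozen_east_text_detection.pb"


def find_model(model_dir):
    """Return the path to the text detection model, or raise if it is missing.

    `model_dir` is whatever directory you chose for the model; this sketch
    makes no assumption about where the package itself looks for it.
    """
    path = Path(model_dir) / MODEL_FILENAME
    if not path.is_file():
        raise FileNotFoundError(f"Text detection model not found: {path}")
    return path
```

Calling `find_model("path/to/models")` early in a setup script turns a missing download into a clear error message instead of a failure later in processing.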
Usage
To use the AGL Anonymizer API, follow these steps:
- Prepare Your Images: Place the images you want to process in a designated folder.
- Configure Settings: Adjust settings in the configuration file (if applicable) to suit your anonymizing and blurring needs.
- Run the Django Server:

      python manage.py runserver

- Make API Requests: First decide whether the anonymization needs to be validated. Validation is only available with a running instance of agl-validator; if you need it, set the validation flag to true in the request.

  Use an API client like Postman, cURL, or the requests library in Python to interact with the AGL Anonymizer API. Example request using the requests library:

      import os

      import requests

      # Get the directory of the current script
      base_dir = os.path.dirname(os.path.abspath(__file__))

      # Define the path to the image file located in the 'requests_agl_anonymizer' folder
      image_path = os.path.join(base_dir, 'frame_0.jpg')

      # Ensure the file exists
      if not os.path.exists(image_path):
          raise FileNotFoundError(f"No such file or directory: '{image_path}'")

      # Define the URL of the Django API endpoint
      url = 'http://127.0.0.1:8000/process/'

      # Open the file in binary mode and send it as part of the multipart form-data payload
      with open(image_path, 'rb') as image_file:
          files = {
              'file': image_file,
          }
          data = {
              'title': 'Example Image',
          }
          response = requests.post(url, files=files, data=data)

      # Print the response from the server
      print(response.status_code)
      print(response.json())  # Assuming the server returns a JSON response
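The validation flag mentioned in the request step could be added to the form data along the following lines. This is a sketch under a loud assumption: the field name `validation` and the string value `"true"` are hypothetical, because the exact name and encoding of the flag are not documented here. Check the API source before relying on them.

```python
def build_form_data(title, validate=False):
    """Build the form fields sent alongside the uploaded image."""
    data = {"title": title}
    if validate:
        # HYPOTHETICAL field name and value; the real flag may differ.
        data["validation"] = "true"
    return data
```

The returned dictionary can be passed as the `data=` argument of `requests.post`, next to the `files=` argument that carries the image.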
API Endpoints
- /process/: Endpoint to upload images and receive anonymized results.
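If the requests library is not available, the same multipart upload to /process/ can be assembled with the standard library. The sketch below is not part of the package: the helper names `encode_multipart` and `build_upload_request` are made up for this example, and the field names `file` and `title` simply mirror the requests example in the Usage section.

```python
import urllib.request
import uuid


def encode_multipart(fields, file_field, filename, file_bytes,
                     file_content_type="application/octet-stream"):
    """Encode text fields and one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode()
        )
        parts.append(f"{value}\r\n".encode())
    # The file part carries a filename and its own content type.
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    parts.append(f"Content-Type: {file_content_type}\r\n\r\n".encode())
    parts.append(file_bytes)
    parts.append(b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_upload_request(url, body, content_type):
    """Create (but do not send) a POST request carrying the multipart body."""
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
```

To send the request, pass the object returned by `build_upload_request` to `urllib.request.urlopen` while the Django server is running, then read and decode the response body.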
Modules
AGL Anonymizer API comprises several key modules:
- OCR Module: Detects and extracts text from images.
- Anonymizer Module: Applies anonymization techniques to identified sensitive text regions.
- Blur Module: Provides functions to blur specific areas in the image.
- Save Module: Handles the saving of processed images in a chosen format.
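One way to picture how the OCR, Anonymizer, and Blur modules hand work to each other is sketched below. Everything in it is hypothetical: the region dictionaries, the `sensitive` key, the function names, and the short replacement list are invented for illustration and are not the package's data structures or its list of German names. The sketch also assumes that anonymization means substituting a replacement name for the detected text.

```python
import hashlib

# Illustrative placeholder names only; NOT the list shipped with the package.
REPLACEMENT_NAMES = ["Anna Schmidt", "Max Müller", "Lena Fischer", "Paul Weber"]


def pseudonym_for(text):
    """Pick a replacement name deterministically from the detected text."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return REPLACEMENT_NAMES[digest[0] % len(REPLACEMENT_NAMES)]


def anonymize_regions(regions):
    """Process text regions as an OCR step might report them.

    Each region is a dict with 'box', 'text', and 'sensitive' keys
    (a hypothetical structure). Sensitive regions get a replacement
    name and are marked for blurring; others pass through unchanged.
    """
    processed = []
    for region in regions:
        if region["sensitive"]:
            processed.append(
                {**region, "text": pseudonym_for(region["text"]), "blur": True}
            )
        else:
            processed.append({**region, "blur": False})
    return processed
```

Deriving the replacement from a hash keeps the mapping stable, so the same detected name always receives the same substitute across frames. Note that a plain hash of the original name is not a privacy guarantee on its own.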
Models
UploadedFile
Represents an uploaded file with fields for the original file, upload date, and an optional description.
    from django.db import models
    from django.utils import timezone


    class UploadedFile(models.Model):
        original_file = models.FileField(upload_to='uploads/original/')
        upload_date = models.DateTimeField(default=timezone.now)
        description = models.TextField(blank=True, null=True)

        def __str__(self):
            return self.original_file.name