
This package censors sensitive data in images and extracts their contents. Named entity recognition (NER) is planned for the future.

Project description

AGL Anonymizer

AGL Anonymizer is a Django-based API that interacts with the AGL Anonymizer Pipeline to provide comprehensive image processing capabilities, specifically for anonymizing sensitive information using common German names, blurring, and OCR (Optical Character Recognition). This API is designed to facilitate the seamless integration of anonymization functionalities into various applications, ensuring privacy and compliance with data protection regulations.

Features of the pipeline

  • Text Detection and Anonymization: Leverages advanced OCR techniques to detect and anonymize text within images, safeguarding sensitive information.
  • Blurring Functionality: Includes customizable blurring options to obscure specific areas of an image, enhancing privacy.
  • Image Saving: Efficiently saves processed images in the desired format while maintaining high-quality output.
  • Extensive Format Support: Capable of handling various image and document formats, including PDF, for diverse applications.
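To illustrate what blurring a specific area of an image means, here is a minimal pure-Python sketch of a box blur applied to a rectangular region of a grayscale image stored as a list of rows. It is not the package's implementation: the function name `blur_region` and the `(x, y, w, h)` box layout are assumptions made for illustration, and a real pipeline would more likely rely on an image library.

```python
def blur_region(image, box, radius=1):
    """Return a copy of `image` with a box blur applied inside `box`.

    image  : list of rows, each a list of integer grayscale values (0-255)
    box    : (x, y, w, h) rectangle to blur, in pixel coordinates (assumed layout)
    radius : half-width of the averaging window
    """
    height, width = len(image), len(image[0])
    x0, y0, w, h = box
    result = [row[:] for row in image]  # leave the input untouched

    for y in range(max(0, y0), min(height, y0 + h)):
        for x in range(max(0, x0), min(width, x0 + w)):
            # Average the pixels of the ORIGINAL image inside the window,
            # clipping the window at the image borders.
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            result[y][x] = sum(window) // len(window)
    return result
```

A larger `radius` gives a stronger blur; pixels outside the box are copied unchanged.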

Installation

To get started with AGL Anonymizer, follow these steps:

  1. Clone the Repository:

    git clone https://github.com/wg-lux/agl_anonymizer.git
    cd agl_anonymizer
    
  2. Set Up the Development Environment:

    nix develop
    
  3. Install Dependencies: Ensure you have all required dependencies installed. Refer to pyproject.toml for a list of dependencies.

  4. Download the Text Detection Model: Download a text detection model, such as frozen_east_text_detection.pb, and place it in the appropriate directory.
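Because the model file has to be downloaded and placed by hand, it can help to check for it before starting the server. The following is a small sketch of such a check. The helper name `locate_model` is hypothetical, and the directory is passed in by the caller because this README does not name the expected location.

```python
from pathlib import Path


def locate_model(model_dir, filename="frozen_east_text_detection.pb"):
    """Return the path to the model file, or raise if it is missing.

    `model_dir` is whatever directory your setup expects the model in;
    this README does not specify it, so it is left to the caller.
    """
    path = Path(model_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"Text detection model not found at '{path}'. "
            "Download it and place it there before starting the server."
        )
    return path
```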

Usage

To use the AGL Anonymizer API, follow these steps:

  1. Prepare Your Images: Place the images you want to process in a designated folder.

  2. Configure Settings: Adjust settings in the configuration file (if applicable) to suit your anonymizing and blurring needs.

  3. Run the Django Server:

    python manage.py runserver
    
  4. Make API Requests: First decide whether the anonymization needs to be validated. Validation is only available with a running instance of agl-validator. If it is needed, set the validation flag to true in the request.

    Use an API client like Postman, cURL, or the requests library in Python to interact with the AGL Anonymizer API. Example request using the requests library:

    import requests
    import os
    
    # Get the directory of the current script
    base_dir = os.path.dirname(os.path.abspath(__file__))
    
    # Define the path to the image file located in the 'requests_agl_anonymizer' folder
    image_path = os.path.join(base_dir, 'frame_0.jpg')
    
    # Ensure the file exists
    if not os.path.exists(image_path):
        raise FileNotFoundError(f"No such file or directory: '{image_path}'")
    
    # Define the URL of the Django API endpoint
    url = 'http://127.0.0.1:8000/process/'
    
    # Open the file in binary mode and send it as part of the multipart form-data payload
    with open(image_path, 'rb') as image_file:
        files = {
            'file': image_file,
        }
        data = {
            'title': 'Example Image',
        }
        response = requests.post(url, files=files, data=data)
    
    # Print the response from the server
    print(response.status_code)
    print(response.json())  # Assuming the server returns a JSON response
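The request above does not set the validation flag. As a hedged sketch, the form fields could be built by a small helper like the one below. The field name `validation` is an assumption, since this README does not state the exact name the API expects; check the API's request handling before relying on it.

```python
def build_form_data(title, validate=False):
    """Return the non-file form fields for a request to /process/.

    The 'validation' key is a hypothetical field name; the README only
    says that a validation flag must be set to true in the request.
    """
    data = {"title": title}
    if validate:
        # Multipart form fields are transmitted as text, so send a string.
        data["validation"] = "true"
    return data
```

The returned dictionary can be passed as the `data` argument of `requests.post` in the example above.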
    

API Endpoints

  • /process/: Endpoint to upload images and receive anonymized results.
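If the requests library is not available, an upload to this endpoint can also be prepared with the Python standard library. The sketch below builds a multipart/form-data POST request by hand without sending it. The helper name `build_upload_request` is hypothetical; the `file` field name is taken from the usage example in this README.

```python
import urllib.request
import uuid


def build_upload_request(url, file_name, file_bytes, fields=None):
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    parts = []

    # Plain text form fields, e.g. {'title': 'Example Image'}.
    for name, value in (fields or {}).items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(header.encode("utf-8"))

    # The image itself, sent under the 'file' field as raw bytes.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8"))
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        url,
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With the Django server running, the request can be sent with `urllib.request.urlopen(request)`.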

Modules

AGL Anonymizer API comprises several key modules:

  • OCR Module: Detects and extracts text from images.
  • Anonymizer Module: Applies anonymization techniques to identified sensitive text regions.
  • Blur Module: Provides functions to blur specific areas in the image.
  • Save Module: Handles the saving of processed images in a chosen format.
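How these modules might fit together can be sketched in a few lines. The example below is an illustration under stated assumptions, not the package's code: `TextRegion` and `anonymize_regions` are hypothetical names, the OCR step is represented by a ready-made list of recognized regions, and the names used in the usage note are made up. It shows the idea of replacing a detected name with a substitute and collecting the boxes that a blur step would then obscure.

```python
import random
from dataclasses import dataclass


@dataclass
class TextRegion:
    text: str    # text recognized by the OCR step
    box: tuple   # (x, y, w, h) location of the text in the image


def anonymize_regions(regions, sensitive_names, replacement_names, rng=None):
    """Replace sensitive names and collect the boxes that should be blurred.

    `replacement_names` must be a non-empty sequence of substitute names.
    """
    rng = rng or random.Random()
    anonymized, boxes_to_blur = [], []
    for region in regions:
        if region.text in sensitive_names:
            # Prefer a substitute that differs from the original name.
            candidates = [n for n in replacement_names if n != region.text]
            substitute = rng.choice(candidates or list(replacement_names))
            anonymized.append(TextRegion(substitute, region.box))
            boxes_to_blur.append(region.box)
        else:
            anonymized.append(region)
    return anonymized, boxes_to_blur
```

For example, calling `anonymize_regions([TextRegion("Anna Schmidt", (10, 5, 80, 12))], {"Anna Schmidt"}, ["Max Müller"])` returns a region whose text is the substitute name, together with the box to blur.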

Models

UploadedFile

Represents an uploaded file with fields for the original file, upload date, and an optional description.

from django.db import models
from django.utils import timezone

class UploadedFile(models.Model):
    original_file = models.FileField(upload_to='uploads/original/')
    upload_date = models.DateTimeField(default=timezone.now)
    description = models.TextField(blank=True, null=True)
    
    def __str__(self):
        return self.original_file.name

Download files


Source Distribution

agl_anonymizer-0.1.1.tar.gz (89.8 MB)

Uploaded Source

Built Distribution

agl_anonymizer-0.1.1-py3-none-any.whl (89.8 MB)

Uploaded Python 3

File details

Details for the file agl_anonymizer-0.1.1.tar.gz.

File metadata

  • Download URL: agl_anonymizer-0.1.1.tar.gz
  • Upload date:
  • Size: 89.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.11.9 Linux/6.9.12

File hashes

Hashes for agl_anonymizer-0.1.1.tar.gz
Algorithm     Hash digest
SHA256        8433c6c7c39a758584f293b778bd1d203cfb1d6a9e6989298f84abab854fa334
MD5           984fa81e76155f5aaaebf0e5454dd967
BLAKE2b-256   96c874378c6607a4a2956e0b25cd59054bcaa1e30f8120dd1f64e20bfc2f452f
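
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch; the helper names `sha256_of_file` and `matches_expected` are illustrative, and the local path of the downloaded file is up to you.

```python
import hashlib

# SHA256 digest listed above for agl_anonymizer-0.1.1.tar.gz
EXPECTED_SHA256 = "8433c6c7c39a758584f293b778bd1d203cfb1d6a9e6989298f84abab854fa334"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a large archive is never fully loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected one."""
    return sha256_of_file(path) == expected_hex.lower()
```

If `matches_expected` returns False for the downloaded file, the download is incomplete or altered and should not be installed.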


File details

Details for the file agl_anonymizer-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: agl_anonymizer-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 89.8 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.11.9 Linux/6.9.12

File hashes

Hashes for agl_anonymizer-0.1.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        105b684912fcdc76c96586267f0fff00ef88e35f5f8d02f17fbaadc9995ddca3
MD5           602ba7c00e1a0f3cd43468550fef5d64
BLAKE2b-256   cd1a69aa5498db3b0833ccfff46c1af15b692afe6a672500e62f12e478e37637

