Zink lets you safeguard privacy by detecting sensitive information and replacing it with secure, customizable placeholders.

ZINK (Zero-shot Ink)

ZINK is a Python package designed for zero-shot anonymization of entities within unstructured text data. It allows you to redact or replace sensitive information based on specified entity labels.

Description

ZINK anonymizes text data by identifying and masking entities such as names, ages, phone numbers, and medical conditions, among others. This lets you protect data privacy while keeping the text useful for analysis and processing.

ZINK uses zero-shot techniques, so it does not require prior training on specific datasets. You provide the text and the entity labels you want to anonymize, and ZINK handles the rest.

Features

  • Zero-shot anonymization: No task-specific training data or fine-tuning required.
  • Flexible entity labeling: Anonymize any type of entity by specifying custom labels.
  • Redaction and replacement: Choose between redacting entities (replacing them with [LABEL]_REDACTED) or replacing them with a random entity of the same type.
  • Easy integration: Simple and intuitive API for seamless integration into your Python projects.

Installation

pip install zink
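
After installing, one way to confirm that the package is visible to the current interpreter is to query its installed version through the standard library. This is a minimal sketch; the helper name `installed_version` is illustrative and not part of ZINK.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Hypothetical check: report whether the zink distribution is installed.
found = installed_version("zink")
print(found if found is not None else "zink is not installed in this environment")
```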

Usage

Redacting Entities

The redact function replaces identified entities with [LABEL]_REDACTED.

Python

import zink as pss

text = "John works as a doctor and plays football after work and drives a toyota."
labels = ("person", "profession", "sport", "car")
result = pss.redact(text, labels)
print(result.anonymized_text)

Example output:

person_REDACTED works as a profession_REDACTED and plays sport_REDACTED after work and drives a car_REDACTED.
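
To make the output format concrete, the sketch below applies the same `label_REDACTED` substitution to spans that are already known. It is not ZINK's implementation: the entity spans are hard-coded where ZINK would detect them, and the helpers `span_of` and `apply_redactions` are hypothetical names used only for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end, label), with end exclusive


def span_of(text: str, phrase: str, label: str) -> Span:
    """Locate the first occurrence of `phrase` and tag it with `label`."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


def apply_redactions(text: str, spans: List[Span]) -> str:
    """Replace each span with '<label>_REDACTED'.

    Spans are processed from right to left so that earlier offsets
    remain valid while the string changes length.
    """
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"{label}_REDACTED" + text[end:]
    return text


text = "John works as a doctor and plays football after work and drives a toyota."

# Assumed detector output; a real run would obtain these from the model.
spans = [
    span_of(text, "John", "person"),
    span_of(text, "doctor", "profession"),
    span_of(text, "football", "sport"),
    span_of(text, "toyota", "car"),
]

redacted = apply_redactions(text, spans)
print(redacted)
```

Working from the last span to the first avoids recomputing offsets after each substitution.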

Replacing Entities

The replace function replaces identified entities with a random entity of the same type.

Python

import zink as pss

text = "John Doe dialled his mother at 992-234-3456 and then went out for a walk."
labels = ("person", "phone number", "relationship")
result = pss.replace(text, labels)
print(result.anonymized_text)

Example output:

Warren Buffet dialled his Uncle at 2347789287 and then went out for a walk.

Another example:

Python

import zink as pss

text = "Patient, 33 years old, was admitted with a chest pain."
labels = ("age", "medical condition")
result = pss.replace(text, labels)
print(result.anonymized_text)

Example output:

Patient, 78 years old, was admitted with a Diabetes Mellitus.
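
The idea of swapping an entity for a random one of the same type can be sketched without the library. Everything below is an assumption made for illustration: the replacement pools, the hard-coded spans, and the helper names are hypothetical and do not reflect how ZINK chooses its substitutes.

```python
import random
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, label), with end exclusive

# Hypothetical replacement pools keyed by entity label.
POOLS: Dict[str, List[str]] = {
    "person": ["Alex Smith", "Maria Garcia", "Sam Lee"],
    "phone number": ["555-0100", "555-0142", "555-0199"],
}


def span_of(text: str, phrase: str, label: str) -> Span:
    """Locate the first occurrence of `phrase` and tag it with `label`."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


def apply_replacements(
    text: str, spans: List[Span], pools: Dict[str, List[str]], rng: random.Random
) -> str:
    """Swap each span for a random pool entry of the same label."""
    # Right-to-left order keeps the remaining offsets valid.
    for start, end, label in sorted(spans, reverse=True):
        original = text[start:end]
        # Never substitute an entity with itself.
        candidates = [c for c in pools[label] if c != original]
        text = text[:start] + rng.choice(candidates) + text[end:]
    return text


text = "John Doe dialled his mother at 992-234-3456 and then went out for a walk."

# Assumed detector output; a real run would obtain these from the model.
spans = [
    span_of(text, "John Doe", "person"),
    span_of(text, "992-234-3456", "phone number"),
]

rng = random.Random(0)  # seeded so repeated runs give the same substitutes
replaced = apply_replacements(text, spans, POOLS, rng)
print(replaced)
```

Passing in a seeded `random.Random` instance keeps the sketch reproducible; an unseeded generator would give different substitutes on each run.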

Testing

To run the tests, navigate to the project directory and execute:

pytest

Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues to suggest improvements or report bugs.  

  • Fork the repository.
  • Create a new branch: git checkout -b feature/your-feature
  • Make your changes.
  • Commit your changes: git commit -m 'Add your feature'
  • Push to the branch: git push origin feature/your-feature
  • Submit a pull request.

License

This project is licensed under the Apache 2.0 License.

