A package to mask PII in text using transformers

These details have not been verified by PyPI

Project links

Homepage

Project description

ai4privacy Python module 🛡️

A Python package for state-of-the-art PII detection and masking using advanced transformer models.

Features

Protect Mode: Anonymize text by replacing detected PII with placeholders.
Observe Mode: Get statistics and a detailed "privacy mask" of found PII without altering the original text.
Multiple Models: Built-in support for:
- English-specific detection.
- Multilingual detection.
- Categorical detection (e.g., GIVENNAME, EMAIL, CITY).
Tunable Sensitivity: An adjustable score_threshold to trade off detection sensitivity against false positives.
Verbose & Developer Modes: Rich outputs for detailed analysis and debugging.

Installation

pip install ai4privacy

Quick Start

The simplest way to use the library is to call the protect function, which masks PII with placeholders.

from ai4privacy import protect

text = "Email me at developers@ai4privacy.com or call me at +41763223001."
masked_text = protect(text)

print(masked_text)
# Expected Output: Email me at [PII_1] or call me at [PII_2]

Advanced Usage

Using Different Models

You can easily switch between models using the multilingual and classify_pii flags.

from ai4privacy import protect

text = "Je m'appelle Pierre et j'habite à Paris."

# Use the multilingual model for non-English text
masked_multilingual = protect(text, multilingual=True)
print(f"Multilingual: {masked_multilingual}")
# Expected Output: Multilingual: Je m'appelle [PII_1] et j'habite à [PII_2]

# Use the categorical model to see the PII types
details = protect(text, classify_pii=True, verbose=True)
print(f"Categorical Labels: {[r['label'] for r in details['replacements']]}")
# Expected Output: Categorical Labels: ['GIVENNAME', 'CITY']

Observe Mode

To analyze text without changing it, use observe(). It returns a dictionary containing summary statistics and the privacy_mask, a detailed list of every PII entity found.

from ai4privacy import observe
import json

text = "My name is Alice and I live in Berlin."
report = observe(text, classify_pii=True)

print(json.dumps(report, indent=2))
# Expected Output:

{
  "num_texts_processed": 1,
  "num_texts_with_pii": 1,
  "pii_entity_counts": {
    "GIVENNAME": 1,
    "CITY": 1
  },
  "total_pii_entities_found": 2,
  "privacy_mask": [
    {
      "label": "GIVENNAME",
      "start": 11,
      "end": 16,
      "activation": 0.98,
      "value": "Alice"
    },
    {
      "label": "CITY",
      "start": 31,
      "end": 37,
      "activation": 0.99,
      "value": "Berlin"
    }
  ]
}
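The start and end fields in privacy_mask can be read as character offsets into the original text. The following is a minimal sketch of how such a mask could be applied by hand, assuming end is exclusive (the GIVENNAME entry fits this reading, since text[11:16] is "Alice"). The helpers apply_privacy_mask and span are hypothetical and not part of ai4privacy; the sample mask is built locally rather than returned by the model, and the [PII_n] placeholder format simply mirrors the one shown in Quick Start.

```python
def apply_privacy_mask(text, privacy_mask):
    """Replace every span listed in privacy_mask with a numbered placeholder."""
    # Number the entities in reading order.
    ordered = sorted(privacy_mask, key=lambda entity: entity["start"])
    pieces = []
    cursor = 0
    for index, entity in enumerate(ordered, start=1):
        pieces.append(text[cursor:entity["start"]])  # untouched text before the span
        pieces.append(f"[PII_{index}]")              # placeholder for the span
        cursor = entity["end"]                       # assumption: end is exclusive
    pieces.append(text[cursor:])                     # untouched tail
    return "".join(pieces)


text = "My name is Alice and I live in Berlin."


def span(label, value):
    """Build a mask entry whose offsets are computed from the text itself."""
    start = text.index(value)
    return {"label": label, "start": start, "end": start + len(value), "value": value}


# Hand-built sample mask (not model output).
sample_mask = [span("GIVENNAME", "Alice"), span("CITY", "Berlin")]

masked = apply_privacy_mask(text, sample_mask)
print(masked)
```

Building the result from left to right with a cursor keeps the offsets valid, because no replacement shifts the positions of spans that have not been processed yet.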

Verbose and Developer Modes

Set verbose=True to get a dictionary containing the original text, masked text, and replacement details. For deep debugging, developer_verbose=True adds a token-by-token breakdown of the model's predictions.

from ai4privacy import protect

text = "Senden Sie es an Eva Schmidt."
details = protect(text, classify_pii=True, verbose=True)

print(details['replacements'])
# Expected Output: [{'label': 'GIVENNAME', 'start': 17, 'end': 20, ...}, {'label': 'SURNAME', 'start': 21, 'end': 28, ...}]
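Once the replacement details are available, the original values can be recovered by slicing the input text. The sketch below assumes each entry carries label, start, and end keys, and that start and end are character offsets with an exclusive end. The function values_by_label and the helper make_entry are hypothetical names used for illustration, and the sample list is constructed locally instead of coming from protect().

```python
def values_by_label(text, replacements):
    """Group the original substrings by their PII label."""
    grouped = {}
    for item in replacements:
        value = text[item["start"]:item["end"]]  # assumption: end is exclusive
        grouped.setdefault(item["label"], []).append(value)
    return grouped


text = "Senden Sie es an Eva Schmidt."


def make_entry(label, value):
    """Build a sample entry with offsets computed from the text itself."""
    start = text.index(value)
    return {"label": label, "start": start, "end": start + len(value)}


# Hand-built sample entries (not model output).
sample_replacements = [
    make_entry("GIVENNAME", "Eva"),
    make_entry("SURNAME", "Schmidt"),
]

grouped = values_by_label(text, sample_replacements)
print(grouped)
```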

Adjusting Sensitivity

The score_threshold (default: 0.01) controls how confident the model must be to flag a token as PII.

A lower value increases sensitivity (finds more PII, but may have more false positives).
A higher value increases precision (detections are more likely correct, but may miss some PII).

from ai4privacy import protect

text = "Maybe this is a name, maybe not. Contact John."

# High precision (less likely to flag "Maybe")
masked_high_prec = protect(text, score_threshold=0.5) 
print(f"High Precision: {masked_high_prec}")
# Expected Output: High Precision: Maybe this is a name, maybe not. Contact [PII_1]

# High sensitivity (more likely to flag "Maybe" if the model is unsure)
masked_high_sens = protect(text, score_threshold=0.001)
print(f"High Sensitivity: {masked_high_sens}")
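The effect of the threshold can be pictured as a simple filter over scored candidates. This is only a conceptual sketch: the function keep_confident, the candidate list, and the activation values are made up for illustration, and whether the library compares with >= or > internally is an assumption.

```python
def keep_confident(candidates, score_threshold=0.01):
    """Keep candidates whose activation reaches the threshold."""
    # Assumption: a candidate is flagged when activation >= score_threshold.
    return [c for c in candidates if c["activation"] >= score_threshold]


# Hypothetical scores for two tokens from the example sentence.
candidates = [
    {"value": "Maybe", "activation": 0.04},
    {"value": "John", "activation": 0.97},
]

high_precision = [c["value"] for c in keep_confident(candidates, 0.5)]
high_sensitivity = [c["value"] for c in keep_confident(candidates, 0.001)]

print(high_precision)
print(high_sensitivity)
```

With these invented scores, the stricter threshold keeps only the confident candidate, while the looser one also lets the uncertain candidate through.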

Disclaimer 📢

Ai4Privacy models are trained on the world's largest open-source privacy dataset. For production use, evaluate results carefully on your own datasets. For assistance, visit https://ai4privacy.com or email support@ai4privacy.com.

Release history

0.5.0: Mar 14, 2026
0.4.0: Mar 2, 2026
0.3.3: Jul 20, 2025 (this version)
0.3.2: Jul 20, 2025
0.1.3: Mar 25, 2025
0.1.2: Mar 23, 2025
0.1.1: Mar 23, 2025
0.1.0: Mar 23, 2025

Download files

Source Distribution

ai4privacy-0.3.3.tar.gz (9.3 kB)

Uploaded Jul 20, 2025 Source

Built Distribution

ai4privacy-0.3.3-py3-none-any.whl (9.4 kB)

Uploaded Jul 20, 2025 Python 3

File details

Details for the file ai4privacy-0.3.3.tar.gz.

File metadata

Download URL: ai4privacy-0.3.3.tar.gz
Upload date: Jul 20, 2025
Size: 9.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for ai4privacy-0.3.3.tar.gz
Algorithm	Hash digest
SHA256	`78207a6c877c6d3a2772daea47afafd93e34bd01b21a930c65b6f296124d4d17`
MD5	`a0184781d95f880ea3fa7a21e853a75e`
BLAKE2b-256	`5846dfdffc041d061ada267f939e42f71abc39b107c391cafa7982fedb234df8`

File details

Details for the file ai4privacy-0.3.3-py3-none-any.whl.

File metadata

Download URL: ai4privacy-0.3.3-py3-none-any.whl
Upload date: Jul 20, 2025
Size: 9.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for ai4privacy-0.3.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ae7144e8adf580747f33b16753c8505363869cdfe6ea7ea0b43c7ceb8c788063`
MD5	`8ceb5172a39166923ab60b0a3a4843b8`
BLAKE2b-256	`afd2836f46d6295401336f1fba080766e2391cf359c449e2ba41a074f837c2c2`
