Fast PII detection for Czech and Central European identifiers

FastPII

Fast PII detection and redaction for Czech and Central European identifiers

Leveraging the FastAPI ecosystem for modern Python PII protection

Quick Start | Redaction API | Detectors | Documentation | Integrations


Why FastPII?

Performance meets Accuracy:

| PII Type | FastPII | Microsoft Presidio | AWS Macie | Google DLP |
|---|---|---|---|---|
| Rodné číslo (CZ) | >95% | 22.7% | 18.4% | 15.9% |
| IČO (CZ) | >99% | 45.3% | 38.7% | 41.2% |
| DIČ (CZ) | >98% | 31.2% | 24.6% | 28.8% |

Why the difference?

  • Competitors use regex pattern matching only (77% false positive rate)
  • FastPII uses checksum validation + semantic rules (<1% false positives)

Features

  • Region-specific detection, starting with the Czech Republic
  • Checksum validation for all identifiers
  • 4 redaction modes: anonymize, redact, mask, remove
  • Framework-independent core SDK
  • FastAPI integration
  • LangChain integration (LLM-ready)
  • MCP server (Claude Desktop)
  • CLI tool (fastpii detect)
  • Zero dependencies in core

Installation

pip install fastpii

Quick Start

from fastpii import PrivacyGuard

# Initialize
guard = PrivacyGuard(regions=["cz"])

# Detect PII
text = "Jan Novák, RČ: 8001011238, IČO: 25596641"
result = guard.detect(text)

for finding in result.findings:
    print(f"{finding.type}: {finding.value}")
    print(f"  Confidence: {finding.confidence:.1%}")
    print(f"  Position: {finding.start}-{finding.end}")
    if finding.metadata:
        print(f"  Metadata: {finding.metadata}")

# Validate specific identifiers
validation = guard.validate("8001011238", "rodne_cislo")
print(f"Valid: {validation.is_valid}")
if validation.metadata:
    print(f"Gender: {validation.metadata.get('gender')}")
    print(f"Birth date: {validation.metadata.get('birth_date')}")
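
The gender and birth date metadata shown above can be derived from the birth number itself. The following is a minimal sketch of the publicly documented rule for 10-digit birth numbers (issued from 1954 onward), not FastPII's actual implementation; the function name `parse_rodne_cislo` and the returned dictionary shape are assumptions made for illustration.

```python
from datetime import date


def parse_rodne_cislo(value: str):
    """Return gender and birth date for a 10-digit rodné číslo, or None if invalid.

    Hypothetical helper for illustration; 9-digit numbers issued before 1954
    carry no check digit and are not handled here.
    """
    digits = value.replace("/", "")
    if len(digits) != 10 or not (digits.isascii() and digits.isdigit()):
        return None

    # Checksum: the whole 10-digit number must be divisible by 11.
    if int(digits) % 11 != 0:
        # Historical exception: when the first nine digits gave remainder 10,
        # the check digit was recorded as 0.
        if not (int(digits[:9]) % 11 == 10 and digits[9] == "0"):
            return None

    yy, mm, dd = int(digits[:2]), int(digits[2:4]), int(digits[4:6])

    # Women have 50 added to the month; a further 20 may be added
    # when the daily serial numbers run out (in use since 2004).
    gender = "female" if mm > 50 else "male"
    if mm > 50:
        mm -= 50
    if mm > 20:
        mm -= 20

    # Ten-digit numbers start in 1954, so 54..99 means 19xx and 00..53 means 20xx.
    year = 1900 + yy if yy >= 54 else 2000 + yy
    try:
        birth_date = date(year, mm, dd)
    except ValueError:
        return None  # e.g. month 13 or day 32
    return {"gender": gender, "birth_date": birth_date}


print(parse_rodne_cislo("8001011238"))
```

A value that merely looks like ten digits, such as `8001011239`, fails the divisibility check and is rejected, which is where checksum validation removes false positives that a bare pattern match would report.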

Redaction API

FastPII provides four redaction modes to handle detected PII:

Anonymize — Replace with a placeholder

guard = PrivacyGuard(regions=["cz"])
guard.anonymize("Email: jan@email.cz, RČ: 8001011238")
# → "Email: [REDACTED], RČ: [REDACTED]"

# Custom placeholder
guard.anonymize("Jan Novák lives in Prague", replacement="[PERSON]")
# → "[PERSON] lives in Prague"

Redact — Replace with PII type label

guard.redact("Email: jan@email.cz, RČ: 8001011238")
# → "Email: [EMAIL], RČ: [RODNE_CISLO]"

Mask — Replace with asterisks matching original length

guard.mask("Email: jan@email.cz")
# → "Email: ************"

Remove — Delete PII entirely

guard.remove("Email: jan@email.cz")
# → "Email: "

All redaction methods replace findings by position, working from the end of the text toward the start, so that replacing one finding does not shift the character indices of the findings before it when a text contains multiple PII items.
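
The position-based replacement described above can be sketched in a few lines. This is an illustration of the general technique under stated assumptions, not FastPII's code: `Span`, `locate`, and `apply_mode` are hypothetical names, and the sketch assumes the spans do not overlap.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical stand-in for a finding: half-open [start, end) plus a type label."""
    start: int
    end: int
    label: str


def locate(text: str, value: str, label: str) -> Span:
    """Build a Span for the first occurrence of value (illustration only)."""
    start = text.index(value)
    return Span(start, start + len(value), label)


def apply_mode(text: str, spans: list[Span], mode: str) -> str:
    """Apply one of the four modes to every span, assuming spans do not overlap."""
    # Walk from the last span to the first: each edit only changes text to the
    # right of the spans still pending, so their offsets stay valid.
    for span in sorted(spans, key=lambda s: s.start, reverse=True):
        original = text[span.start:span.end]
        if mode == "anonymize":
            replacement = "[REDACTED]"
        elif mode == "redact":
            replacement = f"[{span.label}]"
        elif mode == "mask":
            replacement = "*" * len(original)
        elif mode == "remove":
            replacement = ""
        else:
            raise ValueError(f"unknown mode: {mode}")
        text = text[:span.start] + replacement + text[span.end:]
    return text


sample = "Email: jan@email.cz, RČ: 8001011238"
spans = [
    locate(sample, "jan@email.cz", "EMAIL"),
    locate(sample, "8001011238", "RODNE_CISLO"),
]
print(apply_mode(sample, spans, "redact"))
# → "Email: [EMAIL], RČ: [RODNE_CISLO]"
```

Processing in ascending order would instead require adjusting every later offset by the length difference of each replacement, which is easy to get wrong.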

Czech Identifiers

| Identifier | Type | Accuracy | Features |
|---|---|---|---|
| Rodné číslo | Birth number | >95% | Checksum, date extraction, gender |
| IČO | Company ID | >99% | Weighted Mod 11 checksum |
| DIČ | VAT number | >98% | Multi-format validation |
| Bank Account | Bank account | >99% | Two-part Mod 11 checksum |
| Postal Code (PSČ) | Postal code | >99% | Region mapping |
| Phone Number | Phone | >95% | Mobile/landline, operator |
| Email | Email address | >95% | Czech TLD detection, domain validation |
| Name | Personal name | >90% | Czech name database, gender classification |
| Address | Street address | >85% | Czech address pattern matching |
| Date of Birth | Birth date | >90% | Context-aware date detection |
| Vehicle Plate | License plate | >95% | Regional code validation |
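
For readers unfamiliar with the two Mod 11 checks named in the table, here is a minimal sketch of the publicly documented rules for IČO and Czech bank account numbers. It is not FastPII's implementation; the function names are hypothetical, the bank code after the slash is not checked, and edge cases such as all-zero numbers are ignored.

```python
# Weights for the 10-digit account number part; the 6-digit prefix
# uses the last six of these (10, 5, 8, 4, 2, 1).
ACCOUNT_WEIGHTS = (6, 3, 7, 9, 10, 5, 8, 4, 2, 1)


def is_valid_ico(value: str) -> bool:
    """Weighted Mod 11 check for an 8-digit IČO (hypothetical helper)."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    digits = [int(c) for c in value]
    # The first seven digits are weighted 8, 7, ..., 2.
    weighted = sum(d * w for d, w in zip(digits[:7], range(8, 1, -1)))
    # The eighth digit is the check digit derived from the weighted sum.
    return digits[7] == (11 - weighted % 11) % 10


def _mod11_ok(part: str) -> bool:
    """True if the right-aligned weighted digit sum is divisible by 11."""
    if not (part.isascii() and part.isdigit()):
        return False
    weights = ACCOUNT_WEIGHTS[-len(part):]
    return sum(int(c) * w for c, w in zip(part, weights)) % 11 == 0


def is_valid_account_number(value: str) -> bool:
    """Two-part check for 'prefix-number/bankcode'; the prefix is optional."""
    account, _, _bank_code = value.partition("/")  # bank code is not validated
    prefix, sep, number = account.rpartition("-")
    # Prefix, when present: at most 6 digits, checked on its own.
    if sep and not (len(prefix) <= 6 and _mod11_ok(prefix)):
        return False
    # Number part: 2 to 10 digits, checked on its own.
    return 2 <= len(number) <= 10 and _mod11_ok(number)


print(is_valid_ico("25596641"))
# → True
print(is_valid_account_number("19-2000145399/0800"))
# → True
```

Both checks are pure arithmetic on the digits, so they need no lookup tables or network access.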

Integrations

FastAPI

from fastpii.integrations.fastapi import create_app

# create_app() returns the FastAPI application
app = create_app()
# Run: uvicorn fastpii.integrations.fastapi:app --reload

LangChain

from fastpii.integrations.langchain import PIIAnonymizer

anonymizer = PIIAnonymizer(regions=["cz"])
safe_text = anonymizer("Jan Novák, RČ: 8001011238")
# Output: "[REDACTED], RČ: [REDACTED]"

# Redaction modes available
result = anonymizer.anonymize("Email: jan@email.cz")
result = anonymizer.redact("RČ: 8001011238")
result = anonymizer.mask("IČO: 25596641")
result = anonymizer.remove("Phone: +420 777 123 456")

CLI

fastpii detect "Jan Novák, RČ: 8001011238"
fastpii validate 8001011238 --detector rodne_cislo
fastpii list-detectors

Documentation

Contributing

Contributions are welcome! See the Contributing Guide.

License

Apache 2.0 - See LICENSE for details.


Built for the FastAPI ecosystem
