FastPII
Fast PII detection and redaction for Czech and Central European identifiers
Leveraging the FastAPI ecosystem for modern Python PII protection
Quick Start • Redaction API • Detectors • Documentation • Integrations
Why FastPII?
Performance meets accuracy. Detection accuracy by PII type:
| PII Type | FastPII | Microsoft Presidio | AWS Macie | Google DLP |
|---|---|---|---|---|
| Rodné číslo (CZ) | >95% | 22.7% | 18.4% | 15.9% |
| IČO (CZ) | >99% | 45.3% | 38.7% | 41.2% |
| DIČ (CZ) | >98% | 31.2% | 24.6% | 28.8% |
Why the difference?
- Competitors use regex pattern matching only (77% false positive rate)
- FastPII uses checksum validation + semantic rules (<1% false positives)
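To illustrate the difference between shape-only matching and checksum validation, here is a minimal sketch for the rodné číslo. It is not FastPII's internal code: `RC_PATTERN` and `rc_checksum_ok` are hypothetical names, and the check covers only the modulo-11 rule for 10-digit numbers (issued from 1954), not the date part.

```python
import re

# Shape-only pattern: six date digits, an optional slash, three or four serial digits.
RC_PATTERN = re.compile(r"\b\d{6}/?\d{3,4}\b")


def rc_checksum_ok(candidate: str) -> bool:
    """Apply the modulo-11 rule for a 10-digit rodné číslo (hypothetical helper)."""
    digits = candidate.replace("/", "")
    if len(digits) != 10 or not digits.isdecimal():
        return False  # 9-digit numbers issued before 1954 carry no check digit
    remainder = int(digits[:9]) % 11
    # Historical exception: a remainder of 10 was recorded as check digit 0.
    expected = 0 if remainder == 10 else remainder
    return expected == int(digits[9])


text = "RČ: 8001011238, order no. 8001011239"
regex_hits = RC_PATTERN.findall(text)  # both numbers have the right shape
validated = [hit for hit in regex_hits if rc_checksum_ok(hit)]
print(regex_hits)
print(validated)
```

The pattern alone accepts both numbers, while the checksum keeps only the first one. This is the kind of filtering that can reduce false positives on arbitrary digit strings.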
Features
- Region-specific detection (Czech Republic foundation)
- Checksum validation for all identifiers
- 4 redaction modes: anonymize, redact, mask, remove
- Framework-independent core SDK
- FastAPI integration
- LangChain integration (LLM-ready)
- MCP server (Claude Desktop)
- CLI tool (fastpii detect)
- Zero dependencies in core
Installation
pip install fastpii
Quick Start
from fastpii import PrivacyGuard
# Initialize
guard = PrivacyGuard(regions=["cz"])
# Detect PII
text = "Jan Novák, RČ: 8001011238, IČO: 25596641"
result = guard.detect(text)
for finding in result.findings:
    print(f"{finding.type}: {finding.value}")
    print(f"  Confidence: {finding.confidence:.1%}")
    print(f"  Position: {finding.start}-{finding.end}")
    if finding.metadata:
        print(f"  Metadata: {finding.metadata}")
# Validate specific identifiers
validation = guard.validate("8001011238", "rodne_cislo")
print(f"Valid: {validation.is_valid}")
if validation.metadata:
    print(f"Gender: {validation.metadata.get('gender')}")
    print(f"Birth date: {validation.metadata.get('birth_date')}")
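The gender and birth date shown above are encoded in the first six digits of a rodné číslo. The following is a rough sketch of how such metadata could be derived, independent of FastPII. The function name `rc_metadata` and the value types are assumptions; only the keys `gender` and `birth_date` come from the example above. It handles 10-digit numbers only and does not verify the checksum.

```python
from datetime import date


def rc_metadata(rc: str) -> dict:
    """Derive gender and birth date from a 10-digit rodné číslo (hypothetical helper)."""
    digits = rc.replace("/", "")
    if len(digits) != 10 or not digits.isdecimal():
        raise ValueError("expected a 10-digit rodné číslo")
    yy, mm, dd = int(digits[0:2]), int(digits[2:4]), int(digits[4:6])
    # Women have 50 added to the month.
    gender = "female" if mm > 50 else "male"
    if mm > 50:
        mm -= 50
    # Since 2004 the month may carry an extra +20 when daily serials run out.
    if mm > 20:
        mm -= 20
    # 10-digit numbers were introduced in 1954, so 54..99 means 19xx, 00..53 means 20xx.
    year = 1900 + yy if yy >= 54 else 2000 + yy
    # date() raises ValueError for impossible dates such as month 13.
    return {"gender": gender, "birth_date": date(year, mm, dd)}


print(rc_metadata("8001011238"))
```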
Redaction API
FastPII provides four redaction modes to handle detected PII:
Anonymize — Replace with a placeholder
guard = PrivacyGuard(regions=["cz"])
guard.anonymize("Email: jan@email.cz, RČ: 8001011238")
# → "Email: [REDACTED], RČ: [REDACTED]"
# Custom placeholder
guard.anonymize("Jan Novák lives in Prague", replacement="[PERSON]")
# → "[PERSON] lives in Prague"
Redact — Replace with PII type label
guard.redact("Email: jan@email.cz, RČ: 8001011238")
# → "Email: [EMAIL], RČ: [RODNE_CISLO]"
Mask — Replace with asterisks matching original length
guard.mask("Email: jan@email.cz")
# → "Email: ************"
Remove — Delete PII entirely
guard.remove("Email: jan@email.cz")
# → "Email: "
All redaction methods use position-based replacement, processing findings in descending order of position, so that replacing one item does not shift the character indices of the items that come before it in the text.
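A minimal sketch of descending-position replacement, assuming findings that expose `type`, `start`, and `end` as in the Quick Start. The `Finding` tuple, `locate`, and `apply_mode` are hypothetical stand-ins for illustration, not FastPII's implementation, and overlapping findings are not handled here.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    type: str
    start: int
    end: int


def locate(text: str, value: str, type_: str) -> Finding:
    """Build a Finding for the first occurrence of value (illustration only)."""
    start = text.index(value)
    return Finding(type_, start, start + len(value))


def apply_mode(text: str, findings: list[Finding], mode: str) -> str:
    # Replace from the end of the text toward the start, so the offsets of
    # findings that have not been processed yet stay valid.
    for f in sorted(findings, key=lambda f: f.start, reverse=True):
        original = text[f.start:f.end]
        if mode == "anonymize":
            replacement = "[REDACTED]"
        elif mode == "redact":
            replacement = f"[{f.type.upper()}]"
        elif mode == "mask":
            replacement = "*" * len(original)
        elif mode == "remove":
            replacement = ""
        else:
            raise ValueError(f"unknown mode: {mode}")
        text = text[:f.start] + replacement + text[f.end:]
    return text


text = "Email: jan@email.cz, RČ: 8001011238"
findings = [
    locate(text, "jan@email.cz", "email"),
    locate(text, "8001011238", "rodne_cislo"),
]
for mode in ("anonymize", "redact", "mask", "remove"):
    print(mode, "->", apply_mode(text, findings, mode))
```

If the replacements were applied left to right instead, the first substitution would change the string length and the stored offsets of the second finding would point at the wrong characters.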
Czech Identifiers
| Identifier | Type | Accuracy | Features |
|---|---|---|---|
| Rodné číslo | Birth number | >95% | Checksum, date extraction, gender |
| IČO | Company ID | >99% | Weighted Mod 11 checksum |
| DIČ | VAT number | >98% | Multi-format validation |
| Bank Account | Bank account | >99% | Two-part Mod 11 checksum |
| Postal Code (PSČ) | Postal code | >99% | Region mapping |
| Phone Number | Phone | >95% | Mobile/landline, operator |
| Email | Email address | >95% | Czech TLD detection, domain validation |
| Name | Personal name | >90% | Czech name database, gender classification |
| Address | Street address | >85% | Czech address pattern matching |
| Date of Birth | Birth date | >90% | Context-aware date detection |
| Vehicle Plate | License plate | >95% | Regional code validation |
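As an example of the checksums listed in the table, the weighted modulo-11 rule for an IČO can be sketched as follows. This is a standalone illustration of the publicly documented algorithm; `ico_is_valid` is a hypothetical helper, not part of the FastPII API.

```python
def ico_is_valid(ico: str) -> bool:
    """Check the weighted modulo-11 check digit of an 8-digit IČO."""
    if len(ico) != 8 or not ico.isdecimal():
        return False
    # The first seven digits are weighted 8, 7, 6, 5, 4, 3, 2.
    weights = range(8, 1, -1)
    total = sum(int(digit) * weight for digit, weight in zip(ico[:7], weights))
    # The eighth digit is the check digit.
    expected = (11 - total % 11) % 10
    return expected == int(ico[7])


print(ico_is_valid("25596641"))  # the IČO used in the Quick Start
print(ico_is_valid("25596642"))  # same number with an altered check digit
```

For `25596641` the weighted sum is 176, which is divisible by 11, so the expected check digit is `(11 - 0) % 10 = 1`, matching the last digit.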
Integrations
FastAPI
from fastapi import FastAPI
from fastpii.integrations.fastapi import create_app
app = create_app()
# Run: uvicorn fastpii.integrations.fastapi:app --reload
LangChain
from fastpii.integrations.langchain import PIIAnonymizer
anonymizer = PIIAnonymizer(regions=["cz"])
safe_text = anonymizer("Jan Novák, RČ: 8001011238")
# Output: "Jan Novák, [REDACTED]"
# Redaction modes available
result = anonymizer.anonymize("Email: jan@email.cz")
result = anonymizer.redact("RČ: 8001011238")
result = anonymizer.mask("IČO: 25596641")
result = anonymizer.remove("Phone: +420 777 123 456")
CLI
fastpii detect "Jan Novák, RČ: 8001011238"
fastpii validate 8001011238 --detector rodne_cislo
fastpii list-detectors
Documentation
- Installation - Setup guide
- Quick Start - 5-minute tutorial
- Detectors - All 11 detectors explained
- API Reference - Core SDK docs
- Usage Guide - Complete usage examples
- Integrations - FastAPI, LangChain, MCP, CLI
Contributing
Contributions welcome! See Contributing Guide.
License
Apache 2.0 - See LICENSE for details.
Built for the FastAPI ecosystem