FastPII
Privacy infrastructure for AI applications handling Czech and European data
FastPII detects, validates, anonymizes, and protects sensitive data before it reaches LLMs, RAG systems, vector databases, AI agents, or third-party AI providers.
Built for AI-native applications. Designed for privacy-first architectures.
Quick Start · Privacy Modes · AI Use Cases · Benchmarks · Documentation
Why FastPII
Most PII tools are built for generic text processing. FastPII is built for AI workflows.
Modern applications increasingly send documents, prompts, support tickets, contracts, medical records, and business data directly into LLMs and AI systems. FastPII acts as the privacy layer between your data and your AI.
Without FastPII: Document → LLM (sensitive data exposed)
With FastPII: Document → FastPII → LLM (sensitive data protected)
Czech-Native Detection
Unlike generic PII tools, FastPII understands Czech identifiers and validation rules:
- Rodné číslo · IČO · DIČ · Bank Accounts
- Phone Numbers · Postal Codes (PSČ) · Addresses
- Personal Names · Vehicle Plates · Dates of Birth · Email Addresses
Checksum Validation
FastPII validates identifiers instead of relying solely on pattern matching. Rodné číslo (Mod 11), IČO (weighted Mod 11), Czech bank accounts (two-part Mod 11), and DIČ (multi-format, with IČO validation) are checked against their validation rules, so structurally invalid identifiers are rejected before classification.
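As a rough illustration of what this kind of checksum validation involves, the sketch below applies the publicly documented Mod 11 rules for rodné číslo and IČO. It is not FastPII's internal code: the function names `is_valid_rodne_cislo` and `is_valid_ico` are hypothetical, and the rodné číslo check covers only the 10-digit form with a simplified date check.

```python
def is_valid_rodne_cislo(value: str) -> bool:
    """Simplified check for a 10-digit rodné číslo (hypothetical helper)."""
    digits = value.replace("/", "").strip()
    if len(digits) != 10 or not (digits.isascii() and digits.isdigit()):
        return False
    # The month may carry a +50 offset (women) and/or a +20 offset.
    month = int(digits[2:4])
    if month > 50:
        month -= 50
    if month > 20:
        month -= 20
    day = int(digits[4:6])
    if not (1 <= month <= 12 and 1 <= day <= 31):
        return False
    # Standard rule: the whole 10-digit number is divisible by 11.
    if int(digits) % 11 == 0:
        return True
    # Historical exception: first nine digits leave remainder 10, check digit is 0.
    return int(digits[:9]) % 11 == 10 and digits[9] == "0"


def is_valid_ico(value: str) -> bool:
    """Weighted Mod 11 check for an 8-digit IČO (hypothetical helper)."""
    digits = value.strip()
    if len(digits) != 8 or not (digits.isascii() and digits.isdigit()):
        return False
    # Weights 8..2 are applied to the first seven digits.
    weighted = sum(int(d) * w for d, w in zip(digits[:7], range(8, 1, -1)))
    expected = (11 - weighted % 11) % 10
    return expected == int(digits[7])


print(is_valid_rodne_cislo("800101/1238"))  # True
print(is_valid_ico("25596641"))             # True
```

Both sample values come from the Quick Start text in this README; changing the last digit of either makes the check fail.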
Context-Aware Detection
Detection is not based on regex alone. Phone numbers require context words or a +420 prefix. Postal codes require address proximity. Dates of birth need birth-related keywords nearby. Addresses use component scoring. This significantly reduces false positives.
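The idea of context gating can be sketched for phone numbers as follows. This is a minimal illustration under stated assumptions, not FastPII's detector: the regular expression, the `CONTEXT_WORDS` list, the `WINDOW` size, and the `find_phones` name are all hypothetical.

```python
import re

# Nine digits in groups of three, optionally preceded by the +420 prefix.
PHONE_RE = re.compile(r"(?<!\d)(\+420 ?)?(\d{3} ?\d{3} ?\d{3})(?!\d)")
# Assumed context words; a real detector would use a richer list.
CONTEXT_WORDS = ("tel", "telefon", "mobil", "phone")
WINDOW = 30  # characters inspected before the candidate


def find_phones(text: str) -> list:
    """Return candidates that have a +420 prefix or a nearby context word."""
    hits = []
    for match in PHONE_RE.finditer(text):
        has_prefix = match.group(1) is not None
        before = text[max(0, match.start() - WINDOW):match.start()].lower()
        has_context = any(word in before for word in CONTEXT_WORDS)
        if has_prefix or has_context:
            hits.append(match.group(0))
    return hits


print(find_phones("Zavolejte mi na tel. 777 123 456"))   # ['777 123 456']
print(find_phones("Objednávka 777 123 456 kusů"))        # []
print(find_phones("+420 777 123 456"))                   # ['+420 777 123 456']
```

The second call shows the point of the gate: a bare nine-digit number with no prefix and no context word is not reported. Plain substring matching is used here for brevity, so a word such as "hotel" would also count as context in this sketch.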
Built for AI Workflows
FastPII integrates directly into RAG pipelines, LangChain applications, MCP servers, AI agents, FastAPI applications, and enterprise AI systems.
Features
| Feature | Description |
|---|---|
| Detection | Identify sensitive Czech and European data |
| Validation | Validate identifiers using official checksum rules |
| Privacy Protection | Four modes: anonymize, redact, mask, remove |
| Framework-Independent SDK | Use as a standalone Python package |
| Integrations | FastAPI, LangChain, MCP, CLI |
| Local First | No cloud, no LLM, no external API calls required |
Installation
pip install fastpii
Quick Start
from fastpii import PrivacyGuard
guard = PrivacyGuard(regions=["cz"])
text = "Jan Novák, RČ: 800101/1238, IČO: 25596641"
result = guard.detect(text)
for finding in result.findings:
    print(f"{finding.type}: {finding.value}")
# name: Jan Novák
# rodne_cislo: 8001011238
# ico: 25596641
Privacy Modes
Anonymize — Replace with [REDACTED]
guard.anonymize("Jan Novák, RČ: 800101/1238")
# → "[REDACTED], RČ: [REDACTED]"
Redact — Replace with PII type label
guard.redact("Jan Novák, RČ: 800101/1238")
# → "[NAME], RČ: [RODNE_CISLO]"
Mask — Replace with asterisks
guard.mask("Jan Novák")
# → "*********"
Remove — Delete PII entirely
guard.remove("Jan Novák")
# → ""
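To make the difference between the four modes concrete, here is a small standalone sketch that applies each mode to already-detected character spans. It does not use FastPII itself; `apply_mode` and the `(start, end, pii_type)` span format are assumptions made for illustration, and spans are assumed not to overlap.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end, pii_type), hypothetical format


def apply_mode(text: str, spans: List[Span], mode: str) -> str:
    """Rewrite each span according to the chosen privacy mode."""
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, pii_type in sorted(spans, reverse=True):
        if mode == "anonymize":
            replacement = "[REDACTED]"
        elif mode == "redact":
            replacement = f"[{pii_type.upper()}]"
        elif mode == "mask":
            replacement = "*" * (end - start)
        elif mode == "remove":
            replacement = ""
        else:
            raise ValueError(f"unknown mode: {mode}")
        out = out[:start] + replacement + out[end:]
    return out


text = "Jan Novák, RČ: 800101/1238"
rc_start = text.index("800101/1238")
spans = [(0, len("Jan Novák"), "name"), (rc_start, rc_start + 11, "rodne_cislo")]

print(apply_mode(text, spans, "anonymize"))  # [REDACTED], RČ: [REDACTED]
print(apply_mode(text, spans, "redact"))     # [NAME], RČ: [RODNE_CISLO]
```

Processing spans from the end of the string backwards avoids having to recompute offsets when a replacement is longer or shorter than the original value.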
AI Use Cases
Protect RAG Pipelines
safe_document = guard.anonymize(document)
embeddings = embed_model.embed(safe_document)
Protect LLM Prompts
safe_prompt = guard.anonymize(prompt)
response = llm.invoke(safe_prompt)
Protect MCP Tools
safe_input = guard.anonymize(user_input)
result = tool.execute(safe_input)
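The three snippets above share one pattern: sanitize the text, then hand it to the downstream component. A minimal sketch of that pattern as a reusable wrapper is shown below. The `with_privacy` helper is hypothetical and not part of FastPII, and the anonymizer and LLM are stand-in functions so the example runs on its own; in practice you would presumably pass something like `guard.anonymize` and `llm.invoke`.

```python
from typing import Callable


def with_privacy(anonymize: Callable[[str], str],
                 call: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a text-consuming callable so its input is always sanitized first."""
    def guarded(text: str) -> str:
        return call(anonymize(text))
    return guarded


# Stand-ins for illustration only.
def fake_anonymize(text: str) -> str:
    return text.replace("Jan Novák", "[REDACTED]")


def fake_llm(prompt: str) -> str:
    return f"echo: {prompt}"


safe_llm = with_privacy(fake_anonymize, fake_llm)
print(safe_llm("Shrň smlouvu pro: Jan Novák"))  # echo: Shrň smlouvu pro: [REDACTED]
```

Wrapping the call site once makes it harder to forget the sanitizing step when new code paths reach the same model or tool.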
Supported Czech Entities
| Entity | Detection Method | Checksum |
|---|---|---|
| Rodné číslo | Mod 11 checksum + date validation | ✓ |
| IČO | Weighted Mod 11 checksum | ✓ |
| DIČ | Multi-format + IČO validation | ✓ |
| Bank Account | Two-part Mod 11 checksum | ✓ |
| Postal Code | Context-gated (PSČ label, city, address proximity) | — |
| Phone Number | Context-gated (+420 prefix or context words) | — |
| Date of Birth | Context-gated (birth keywords, intervening date blocking) | — |
| Address | Component scoring (street + number + city + postal) | — |
| Name | Czech name dictionary + gender classification | — |
| Email | Czech TLD detection, markdown mailto handling | — |
| Vehicle Plate | Regional code validation | — |
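The "two-part Mod 11 checksum" for bank accounts in the table above can be sketched as follows, based on the publicly documented Czech account number format: the prefix and the base number are each checked with their own weight vector. This is an illustration, not FastPII's implementation; `is_valid_cz_account` and the parsing regex are hypothetical, and the four-digit bank code is parsed but not checked against any registry.

```python
import re

# Optional prefix (up to 6 digits), base number (2 to 10 digits), bank code (4 digits).
ACCOUNT_RE = re.compile(r"^(?:(\d{1,6})-)?(\d{2,10})/(\d{4})$", re.ASCII)
BASE_WEIGHTS = (6, 3, 7, 9, 10, 5, 8, 4, 2, 1)
PREFIX_WEIGHTS = BASE_WEIGHTS[-6:]  # (10, 5, 8, 4, 2, 1)


def _mod11_ok(digits: str, weights: tuple) -> bool:
    """Left-pad with zeros, apply the weights, require a sum divisible by 11."""
    padded = digits.zfill(len(weights))
    return sum(int(d) * w for d, w in zip(padded, weights)) % 11 == 0


def is_valid_cz_account(value: str) -> bool:
    match = ACCOUNT_RE.match(value.strip())
    if not match:
        return False
    prefix, base, _bank_code = match.groups()
    if int(base) == 0:
        return False
    # A missing prefix is treated as zero, which trivially passes.
    return _mod11_ok(prefix or "0", PREFIX_WEIGHTS) and _mod11_ok(base, BASE_WEIGHTS)


print(is_valid_cz_account("19-2000145399/0800"))  # True
print(is_valid_cz_account("19-2000145398/0800"))  # False
```

The first value is a commonly cited sample account number; the second differs only in its last digit and fails the base-number check.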
Benchmarks
Evaluated on Czech-focused datasets containing contracts, medical records, business registries, support tickets, and adversarial false-positive scenarios.
v0.2.4 overall:
| Metric | Score |
|---|---|
| Precision | 84.2% |
| Recall | 80.0% |
| F1 | 82.1% |
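For readers who want to relate the three numbers, F1 is the harmonic mean of precision and recall. The short sketch below recomputes it from the rounded percentages in the table; `f1_score` is a hypothetical helper, and because the inputs are already rounded, the result can differ from the reported figure in the last decimal place.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(84.2, 80.0)
print(f"{f1:.2f}")  # about 82.05
```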
Per-detector:
| Detector | Precision | Recall | Notes |
|---|---|---|---|
| IČO | 100% | 100% | Checksum-validated, no FPs |
| DIČ | 100% | 100% | Multi-format detection |
| Email | 100% | 100% | Markdown mailto handled |
| Date | 100% | 100% | Non-birth dates detected separately |
| Phone | 100% | 100% | Context or +420 prefix required |
| Vehicle Plate | 100% | 100% | Regional code validation |
| Date of Birth | 100% | 86% | Context-gated; rejects generic dates |
| Postal Code | 100% | 71% | Subsumed by address in overlaps |
| Name | 80% | 100% | Dict-matched; corporate name FPs |
| Address | 71% | 63% | Component scoring; partial matches |
| Rodné číslo | 67% | 50% | Invalid checksums correctly rejected |
| Bank Account | 100% | 0% | Requires labeled context (v0.2.5) |
Roadmap
Current — Core SDK, Czech Detectors, Validation Engine, CLI, FastAPI Integration, LangChain Integration
Next (v0.2.5) — Strict Mode, MCP Integration, RAG Middleware, Improved Address & Bank Account Detection
Future — FastPII Gateway, Policy Engine, Audit Logging, Enterprise Features, Additional European Regions
Documentation
Contributing
Contributions welcome! See Contributing Guide.
License
Apache 2.0 — See LICENSE for details.
Built for privacy-first AI applications