FastPII

Privacy infrastructure for AI applications handling Czech and European data

FastPII detects, validates, anonymizes, and protects sensitive data before it reaches LLMs, RAG systems, vector databases, AI agents, or third-party AI providers.

Built for AI-native applications. Designed for privacy-first architectures.

Quick Start · Privacy Modes · AI Use Cases · Benchmarks · Documentation


Why FastPII

Most PII tools are built for generic text processing. FastPII is built for AI workflows.

Modern applications increasingly send documents, prompts, support tickets, contracts, medical records, and business data directly into LLMs and AI systems. FastPII acts as the privacy layer between your data and your AI.

Without FastPII          With FastPII

Document                 Document
  ↓                        ↓
LLM                      FastPII
                           ↓
                         LLM

Sensitive data exposed   Sensitive data protected

Czech-Native Detection

Unlike generic PII tools, FastPII understands Czech identifiers and validation rules:

  • Rodné číslo · IČO · DIČ · Bank Accounts
  • Phone Numbers · Postal Codes (PSČ) · Addresses
  • Personal Names · Vehicle Plates · Dates of Birth · Email Addresses

Checksum Validation

FastPII validates identifiers instead of relying solely on pattern matching. Rodné číslo (Mod 11), IČO (weighted Mod 11), Czech bank account numbers, and DIČ values must pass their checksum rules; structurally invalid identifiers are rejected before classification.
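
As a rough illustration of the checksum rules named above, the sketch below checks an IČO and a ten-digit rodné číslo using the publicly documented Mod 11 rules. The function names `is_valid_ico` and `is_valid_rodne_cislo` are hypothetical and are not part of the FastPII API; FastPII's own validators may differ, for example in how they treat nine-digit numbers issued before 1954.

```python
import re
from datetime import date


def is_valid_ico(value: str) -> bool:
    """Check an 8-digit IČO with the weighted Mod 11 rule."""
    if not re.fullmatch(r"[0-9]{8}", value):
        return False
    digits = [int(ch) for ch in value]
    # Weights 8, 7, ..., 2 apply to the first seven digits.
    total = sum(d * w for d, w in zip(digits[:7], range(8, 1, -1)))
    expected = (11 - total % 11) % 10
    return digits[7] == expected


def is_valid_rodne_cislo(value: str) -> bool:
    """Check a ten-digit rodné číslo (the format used from 1954 onward)."""
    digits = value.replace("/", "")
    if not re.fullmatch(r"[0-9]{10}", digits):
        return False
    # Standard rule: the whole number is divisible by 11. Historical
    # exception: first nine digits mod 11 == 10 with a check digit of 0.
    divisible = int(digits) % 11 == 0
    exception = int(digits[:9]) % 11 == 10 and digits[9] == "0"
    if not (divisible or exception):
        return False
    yy, mm, dd = int(digits[:2]), int(digits[2:4]), int(digits[4:6])
    # The month may carry an offset: +50 for women, and +20 or +70
    # when the daily serial numbers have run out (since 2004).
    for offset in (70, 50, 20, 0):
        if mm > offset:
            mm -= offset
            break
    year = 1900 + yy if yy >= 54 else 2000 + yy
    try:
        date(year, mm, dd)
    except ValueError:
        return False
    return True


print(is_valid_ico("25596641"))             # True
print(is_valid_rodne_cislo("800101/1238"))  # True
```

Both sample values are the ones used in the Quick Start example; changing their last digit makes each check fail.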

Context-Aware Detection

Detection is not based on regex alone. Phone numbers require context words or a +420 prefix. Postal codes require address proximity. Dates of birth need birth-related keywords nearby. Addresses use component scoring. These gates are intended to cut false positives: in the v0.2.4 benchmarks, the phone, postal code, and date-of-birth detectors each report 100% precision.
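
To make the idea of context gating concrete, here is a minimal sketch for phone numbers. It accepts a nine-digit match only when it carries a +420 prefix or a context word appears shortly before it. The word list, the 30-character window, and the name `find_phones` are assumptions made for illustration; FastPII's actual rules are not shown in this README.

```python
import re

# Hypothetical context words; FastPII's real list may differ.
CONTEXT_WORDS = ("tel", "telefon", "mobil", "phone")
# Optional +420 prefix, then nine digits in groups of three.
PHONE_RE = re.compile(r"(?<![0-9])(\+420 ?)?([0-9]{3} ?[0-9]{3} ?[0-9]{3})(?![0-9])")
WINDOW = 30  # characters of left context to inspect (assumed value)


def find_phones(text: str) -> list[str]:
    """Return phone-like matches that have a +420 prefix or nearby context."""
    found = []
    for match in PHONE_RE.finditer(text):
        has_prefix = match.group(1) is not None
        left = text[max(0, match.start() - WINDOW):match.start()].lower()
        has_context = any(word in left for word in CONTEXT_WORDS)
        if has_prefix or has_context:
            found.append(match.group(0))
    return found


print(find_phones("Tel: 777 123 456"))             # ['777 123 456']
print(find_phones("Objednávka 777 123 456 kusů"))  # []
```

The second call returns nothing because the digits look like a phone number but nothing around them suggests one.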

Built for AI Workflows

FastPII integrates directly into RAG pipelines, LangChain applications, MCP servers, AI agents, FastAPI applications, and enterprise AI systems.


Features

  • Detection: Identify sensitive Czech and European data
  • Validation: Validate identifiers using official checksum rules
  • Privacy Protection: Four modes (anonymize, redact, mask, remove)
  • Framework-Independent SDK: Use as a standalone Python package
  • Integrations: FastAPI, LangChain, MCP, CLI
  • Local First: No cloud, no LLM, no external API calls required

Installation

pip install fastpii

Quick Start

from fastpii import PrivacyGuard

guard = PrivacyGuard(regions=["cz"])

text = "Jan Novák, RČ: 800101/1238, IČO: 25596641"
result = guard.detect(text)

for finding in result.findings:
    print(f"{finding.type}: {finding.value}")

# Expected output:
# name: Jan Novák
# rodne_cislo: 8001011238
# ico: 25596641

Privacy Modes

Anonymize — Replace with [REDACTED]

guard.anonymize("Jan Novák, RČ: 800101/1238")
# → "[REDACTED], RČ: [REDACTED]"

Redact — Replace with PII type label

guard.redact("Jan Novák, RČ: 800101/1238")
# → "[NAME], RČ: [RODNE_CISLO]"

Mask — Replace with asterisks

guard.mask("Jan Novák")
# → "*********"

Remove — Delete PII entirely

guard.remove("Jan Novák")
# → ""
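
Conceptually, the four modes differ only in what replaces each detected span. The sketch below shows that idea on a list of character offsets. `Finding` and `apply_mode` are hypothetical names for illustration, not FastPII's result type or API, and the offsets are written by hand rather than detected.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical detected span; not FastPII's actual result type."""
    type: str
    start: int
    end: int


def apply_mode(text: str, findings: list[Finding], mode: str) -> str:
    """Rewrite each detected span according to the chosen privacy mode."""
    # Work right to left so earlier offsets stay valid after each edit.
    for f in sorted(findings, key=lambda item: item.start, reverse=True):
        original = text[f.start:f.end]
        if mode == "anonymize":
            replacement = "[REDACTED]"
        elif mode == "redact":
            replacement = f"[{f.type.upper()}]"
        elif mode == "mask":
            replacement = "*" * len(original)
        elif mode == "remove":
            replacement = ""
        else:
            raise ValueError(f"unknown mode: {mode}")
        text = text[:f.start] + replacement + text[f.end:]
    return text


text = "Jan Novák, RČ: 800101/1238"
findings = [Finding("name", 0, 9), Finding("rodne_cislo", 15, 26)]
print(apply_mode(text, findings, "redact"))  # [NAME], RČ: [RODNE_CISLO]
```

Replacing from the end of the string backwards avoids recomputing offsets when a replacement is longer or shorter than the original span.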

AI Use Cases

Protect RAG Pipelines

safe_document = guard.anonymize(document)
embeddings = embed_model.embed(safe_document)

Protect LLM Prompts

safe_prompt = guard.anonymize(prompt)
response = llm.invoke(safe_prompt)

Protect MCP Tools

safe_input = guard.anonymize(user_input)
result = tool.execute(safe_input)
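
The three snippets above share one pattern: anonymize first, then call the downstream component. One way to avoid repeating it is a small wrapper, sketched below. `with_privacy`, `Anonymizer`, and `FakeGuard` are hypothetical helpers, not part of FastPII, and the sketch assumes `guard.anonymize` returns a string, as the Privacy Modes examples suggest.

```python
from typing import Callable, Protocol


class Anonymizer(Protocol):
    """Anything exposing anonymize(text) -> str, such as a PrivacyGuard."""

    def anonymize(self, text: str) -> str: ...


def with_privacy(guard: Anonymizer, call: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a text-consuming callable so it only receives anonymized input."""

    def wrapper(text: str) -> str:
        return call(guard.anonymize(text))

    return wrapper


class FakeGuard:
    """Stand-in used here so the sketch runs without FastPII installed."""

    def anonymize(self, text: str) -> str:
        return text.replace("Jan Novák", "[REDACTED]")


echo_llm = with_privacy(FakeGuard(), lambda prompt: f"LLM saw: {prompt}")
print(echo_llm("Summarize the contract for Jan Novák"))
# LLM saw: Summarize the contract for [REDACTED]
```

In real use, `FakeGuard()` would be replaced by the `guard` object from the Quick Start, and the lambda by an LLM, embedding, or tool call.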

Supported Czech Entities

Entity          Detection method and checksum
Rodné číslo     Mod 11 checksum + date validation
IČO             Weighted Mod 11 checksum
DIČ             Multi-format + IČO validation
Bank Account    Two-part Mod 11 checksum
Postal Code     Context-gated (PSČ label, city, address proximity)
Phone Number    Context-gated (+420 prefix or context words)
Date of Birth   Context-gated (birth keywords, intervening date blocking)
Address         Component scoring (street + number + city + postal)
Name            Czech name dictionary + gender classification
Email           Czech TLD detection, markdown mailto handling
Vehicle Plate   Regional code validation
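
The "two-part Mod 11 checksum" for bank accounts refers to the standard Czech rule: the optional prefix and the base number are each weighted and must each sum to a multiple of 11. The sketch below implements that public rule. The name `is_valid_cz_account` is hypothetical, the bank code is only checked for shape, and this is not necessarily how FastPII implements the check.

```python
import re

# Standard weights for Czech account numbers, applied left to right
# after zero-padding each part to full length.
PREFIX_WEIGHTS = (10, 5, 8, 4, 2, 1)
BASE_WEIGHTS = (6, 3, 7, 9, 10, 5, 8, 4, 2, 1)


def _weighted_mod11_ok(part: str, weights: tuple[int, ...]) -> bool:
    """True when the weighted digit sum of the padded part is divisible by 11."""
    padded = part.zfill(len(weights))
    total = sum(int(d) * w for d, w in zip(padded, weights))
    return total % 11 == 0


def is_valid_cz_account(value: str) -> bool:
    """Validate 'prefix-base/bankcode' or 'base/bankcode'."""
    m = re.fullmatch(r"(?:([0-9]{1,6})-)?([0-9]{2,10})/([0-9]{4})", value)
    if m is None:
        return False
    prefix = m.group(1) or "0"
    base = m.group(2)
    return _weighted_mod11_ok(prefix, PREFIX_WEIGHTS) and _weighted_mod11_ok(base, BASE_WEIGHTS)


# Sample number chosen so that both weighted sums are multiples of 11.
print(is_valid_cz_account("19-2000145399/0800"))  # True
print(is_valid_cz_account("19-2000145398/0800"))  # False
```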

Benchmarks

Evaluated on Czech-focused datasets containing contracts, medical records, business registries, support tickets, and adversarial false-positive scenarios.

v0.2.4 overall:

Metric     Score
Precision  84.2%
Recall     80.0%
F1         82.1%
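
F1 is the harmonic mean of precision and recall. As a quick sanity check, the snippet below recomputes it from the rounded figures above. The result is about 82.0%, which agrees with the reported 82.1% to within rounding of the inputs. The helper name `f1_score` is just for this example.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


overall = f1_score(0.842, 0.800)
print(f"{overall:.3f}")  # 0.820
```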

Per-detector:

Detector       Precision  Recall  Notes
IČO            100%       100%    Checksum-validated, no FPs
DIČ            100%       100%    Multi-format detection
Email          100%       100%    Markdown mailto handled
Date           100%       100%    Non-birth dates detected separately
Phone          100%       100%    Context or +420 prefix required
Vehicle Plate  100%       100%    Regional code validation
Date of Birth  100%       86%     Context-gated; rejects generic dates
Postal Code    100%       71%     Subsumed by address in overlaps
Name           80%        100%    Dict-matched; corporate name FPs
Address        71%        63%     Component scoring; partial matches
Rodné číslo    67%        50%     Invalid checksums correctly rejected
Bank Account   100%       0%      Requires labeled context (v0.2.5)

Roadmap

Current — Core SDK, Czech Detectors, Validation Engine, CLI, FastAPI Integration, LangChain Integration

Next (v0.2.5) — Strict Mode, MCP Integration, RAG Middleware, Improved Address & Bank Account Detection

Future — FastPII Gateway, Policy Engine, Audit Logging, Enterprise Features, Additional European Regions


Documentation


Contributing

Contributions welcome! See Contributing Guide.


License

Apache 2.0 — See LICENSE for details.


Built for privacy-first AI applications

