Self-hosted PII redaction step for LLM observability pipelines.

any-lang-anonymizer

Drop-in PII redaction for teams that already use Langfuse.

pip install any-lang-anonymizer

from langfuse import Langfuse
from pii_redactor.integrations.langfuse import make_mask

langfuse = Langfuse(mask=make_mask())

make_mask() is the main API. It recursively scans every string that Langfuse passes through the mask callback, including nested input, output, metadata, messages, tool calls, and custom fields.
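The recursive scan can be pictured with a small, self-contained sketch. It is not the package's implementation: `redact_text`, `walk`, and `example_mask` are hypothetical names, a single e-mail regex stands in for the ONNX model, and the keyword-style `data` argument is an assumption about the callback shape your Langfuse version expects.

```python
import re
from typing import Any, Callable

# Stand-in redactor: one e-mail regex. The real package runs an ONNX model;
# the regex is here only so the sketch runs without any download.
_EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_text(text: str) -> str:
    """Hypothetical redactor: replace e-mail addresses with a placeholder."""
    return _EMAIL.sub("[EMAIL]", text)


def walk(value: Any, redact: Callable[[str], str] = redact_text) -> Any:
    """Return a copy of value with every nested string passed through redact."""
    if isinstance(value, str):
        return redact(value)
    if isinstance(value, dict):
        # Keys are left untouched; only values are scanned.
        return {key: walk(item, redact) for key, item in value.items()}
    if isinstance(value, list):
        return [walk(item, redact) for item in value]
    if isinstance(value, tuple):
        return tuple(walk(item, redact) for item in value)
    return value  # numbers, booleans, None, and other objects pass through


def example_mask(*, data: Any, **kwargs: Any) -> Any:
    """Hypothetical mask callback (assumed keyword-argument shape)."""
    return walk(data)
```

Because `walk` builds new containers instead of mutating its input, the application's own copy of the data keeps the raw values; only what the callback returns is redacted.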

How It Works

any-lang-anonymizer runs the Bards AI ONNX PII model locally. Raw observability data is redacted in your app before it is sent to Langfuse.

The model is downloaded from Hugging Face on first use and cached locally by huggingface-hub.

Optional model config:

export PII_MODEL_ID=bardsai/eu-pii-anonimization-multilang
export PII_MODEL_CACHE_DIR=/path/to/model-cache
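How the package reads these variables internally is not documented here, so the following is only a sketch of one plausible pattern. `resolve_model_config` and `fetch_model` are hypothetical helpers; `snapshot_download` is the real huggingface-hub function, imported lazily so the first helper works without the library or network access.

```python
import os
from typing import Mapping

DEFAULT_MODEL_ID = "bardsai/eu-pii-anonimization-multilang"


def resolve_model_config(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper: read both variables, falling back to defaults."""
    return {
        "model_id": env.get("PII_MODEL_ID") or DEFAULT_MODEL_ID,
        # None lets huggingface-hub choose its own default cache location.
        "cache_dir": env.get("PII_MODEL_CACHE_DIR") or None,
    }


def fetch_model(config: dict) -> str:
    """Download the model repo, or reuse the cached copy; returns a local path."""
    # Imported lazily: needs huggingface-hub installed and network on first call.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=config["model_id"], cache_dir=config["cache_dir"])
```

Treating an empty variable the same as an unset one avoids passing an empty string as a cache directory.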

To restrict scanning to certain JSON paths, or to skip some, pass include_paths and exclude_paths:

langfuse = Langfuse(
    mask=make_mask(
        include_paths=["input", "output", "metadata", "messages.*.content", "tool_calls.*.args"],
        exclude_paths=["metadata.trace_id", "metadata.model", "usage"],
    )
)

By default, no path config is needed. Everything text-like is scanned.
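The exact matching rules are not spelled out above, so the sketch below shows one plausible reading rather than the library's behavior. It assumes dotted paths with list indices as segments, `*` matching exactly one segment, a pattern also covering everything nested below it, and excludes taking precedence over includes. `should_scan` is a hypothetical name.

```python
from typing import Optional, Sequence


def _matches(pattern: str, path: str) -> bool:
    """True if path equals the pattern or is nested below it ('*' = one segment)."""
    pat, segs = pattern.split("."), path.split(".")
    if len(segs) < len(pat):
        return False
    return all(p == "*" or p == s for p, s in zip(pat, segs))


def should_scan(
    path: str,
    include: Optional[Sequence[str]] = None,
    exclude: Sequence[str] = (),
) -> bool:
    """Decide whether the string found at a dotted path should be redacted."""
    if any(_matches(p, path) for p in exclude):
        return False  # assumed: excludes win over includes
    if include is None:
        return True  # no include list: everything is scanned
    return any(_matches(p, path) for p in include)
```

Under these assumptions, and with the lists from the example above, `messages.0.content` and `metadata.user.email` would be scanned, while `messages.0.role`, `metadata.trace_id`, and anything under `usage` would be left alone.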

Model

The default model is bardsai/eu-pii-anonimization-multilang. The project is Apache-2.0 licensed, matching the model license. The model files are not bundled in this package.

Local Playground

The FastAPI playground is a demo tool, not part of the core library.

python3 -m pip install -e '.[server]'
python3 -m uvicorn playground.app:app --reload

Open:

http://127.0.0.1:8000

Examples

Runnable demos live in examples/.

python3 -m pip install -e '.[demo]'
PII_LLM_PROVIDER=demo python3 examples/langfuse_demo.py

To send demo traces from a real LLM:

export LANGFUSE_PUBLIC_KEY=pk-lf-...
export LANGFUSE_SECRET_KEY=sk-lf-...
export LANGFUSE_BASE_URL=https://cloud.langfuse.com
export GROQ_API_KEY=gsk_...
python3 examples/langfuse_txt_demo.py --allow-leaks

The long TXT demo uses Groq llama-3.1-8b-instant by default when GROQ_API_KEY is set. It asks the LLM to append " | checked" to each non-empty line, then verifies locally whether any known raw values remain after masking.
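A local leak check of this kind can be sketched as a substring search over the serialized, masked payload. This is not the demo script's actual code: `find_leaks` is a hypothetical helper and the placeholder tokens in the usage example are made up.

```python
import json
from typing import Any, Iterable, List


def find_leaks(masked: Any, known_raw_values: Iterable[str]) -> List[str]:
    """Return the known raw values that still appear in the masked payload."""
    # Serializing once lets a substring search cover every nested field.
    haystack = json.dumps(masked, ensure_ascii=False, default=str)
    leaks = []
    for value in known_raw_values:
        # Encode the needle the same way so quotes and newlines compare equal.
        needle = json.dumps(value, ensure_ascii=False)[1:-1]
        if needle and needle in haystack:
            leaks.append(value)
    return leaks


masked_trace = {
    "output": "Jan [PERSON] | checked\nmail: jan@example.com | checked",
    "metadata": ["[PHONE]"],
}
leaks = find_leaks(masked_trace, ["Jan Kowalski", "jan@example.com", "+48 600 100 200"])
```

Here `leaks` contains only the e-mail address, because it is the one known value that survived masking in the made-up payload.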
