Self-hosted PII redaction step for LLM observability pipelines.
any-lang-anonymizer
Drop-in PII redaction for teams that already use Langfuse.
pip install any-lang-anonymizer
from langfuse import Langfuse
from pii_redactor.integrations.langfuse import make_mask
langfuse = Langfuse(mask=make_mask())
That is the main API. make_mask() recursively scans every string Langfuse sends through the mask callback, including nested input, output, metadata, messages, tool calls, and custom fields.
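The recursive scan described above can be illustrated with a small sketch. This is not the library's implementation: `mask_recursive` and `redact_text` are hypothetical names, and a simple e-mail regex stands in for the ONNX model so the example runs without downloading anything.

```python
import re
from typing import Any, Callable

# Hypothetical stand-in for the model-based redactor: it only masks e-mail addresses.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_text(text: str) -> str:
    """Replace e-mail addresses with a placeholder (assumed placeholder format)."""
    return EMAIL_RE.sub("[EMAIL]", text)


def mask_recursive(data: Any, redact: Callable[[str], str] = redact_text) -> Any:
    """Return a copy of data with every string value passed through redact."""
    if isinstance(data, str):
        return redact(data)
    if isinstance(data, dict):
        # Keys are left untouched; only values are scanned.
        return {key: mask_recursive(value, redact) for key, value in data.items()}
    if isinstance(data, list):
        return [mask_recursive(item, redact) for item in data]
    if isinstance(data, tuple):
        return tuple(mask_recursive(item, redact) for item in data)
    # Numbers, booleans, None and other non-text values pass through unchanged.
    return data
```

Called on a nested payload such as `{"messages": [{"content": "bob@example.org wrote"}]}`, the sketch rewrites only the string leaves and keeps the structure intact, which is the behavior a mask callback needs.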
How It Works
any-lang-anonymizer runs the Bards AI ONNX PII model locally. Raw observability data is redacted in your app before it is sent to Langfuse.
The model is downloaded from Hugging Face on first use and cached locally by huggingface-hub.
Optional model config:
export PII_MODEL_ID=bardsai/eu-pii-anonimization-multilang
export PII_MODEL_CACHE_DIR=/path/to/model-cache
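As a rough illustration of how these two variables could be resolved in Python, here is a minimal sketch. `ModelConfig` and `load_model_config` are hypothetical names, not part of the package, and the fallback when `PII_MODEL_CACHE_DIR` is unset (deferring to the huggingface-hub default cache) is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default model ID as stated in the Model section of this README.
DEFAULT_MODEL_ID = "bardsai/eu-pii-anonimization-multilang"


@dataclass(frozen=True)
class ModelConfig:
    model_id: str
    cache_dir: Optional[str]  # None is assumed to mean "use the default cache"


def load_model_config(env: Mapping[str, str] = os.environ) -> ModelConfig:
    """Read the optional model settings, treating empty values as unset."""
    model_id = env.get("PII_MODEL_ID") or DEFAULT_MODEL_ID
    cache_dir = env.get("PII_MODEL_CACHE_DIR") or None
    return ModelConfig(model_id=model_id, cache_dir=cache_dir)
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.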
You can narrow or exclude JSON paths if needed:
langfuse = Langfuse(
    mask=make_mask(
        include_paths=["input", "output", "metadata", "messages.*.content", "tool_calls.*.args"],
        exclude_paths=["metadata.trace_id", "metadata.model", "usage"],
    )
)
No path configuration is needed by default: every string value is scanned.
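To make the path patterns concrete, the sketch below shows one plausible way such rules could be evaluated. The semantics are assumptions, not documented behavior: `*` matches exactly one path segment (a key or list index), a pattern also covers everything nested beneath it, and exclusions win over inclusions. `should_scan` is a hypothetical helper.

```python
from typing import Optional, Sequence


def _matches(pattern: str, path: str) -> bool:
    """True if pattern equals path or is a prefix of it, segment by segment."""
    pattern_segments = pattern.split(".")
    path_segments = path.split(".")
    if len(pattern_segments) > len(path_segments):
        return False
    # zip stops at the shorter sequence, so the pattern acts as a prefix.
    return all(p == "*" or p == s for p, s in zip(pattern_segments, path_segments))


def should_scan(
    path: str,
    include_paths: Optional[Sequence[str]] = None,
    exclude_paths: Sequence[str] = (),
) -> bool:
    """Decide whether the string at a dotted path should be redacted."""
    if any(_matches(pattern, path) for pattern in exclude_paths):
        return False  # assumed: exclusions take priority
    if not include_paths:
        return True  # no include list: scan everything
    return any(_matches(pattern, path) for pattern in include_paths)
```

Under these assumptions, with the configuration shown above, `messages.0.content` is scanned, `messages.0.role` is not, and `metadata.trace_id` is skipped even though `metadata` is included.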
Model
The default model is bardsai/eu-pii-anonimization-multilang. The project is Apache-2.0 licensed, matching the model license. The model files are not bundled in this package.
Local Playground
The FastAPI playground is a demo tool, not part of the core library.
python3 -m pip install -e '.[server]'
python3 -m uvicorn playground.app:app --reload
Open:
http://127.0.0.1:8000
Examples
Runnable demos live in examples/.
python3 -m pip install -e '.[demo]'
PII_LLM_PROVIDER=demo python3 examples/langfuse_demo.py
For real LLM demo traces:
export LANGFUSE_PUBLIC_KEY=pk-lf-...
export LANGFUSE_SECRET_KEY=sk-lf-...
export LANGFUSE_BASE_URL=https://cloud.langfuse.com
export GROQ_API_KEY=gsk_...
python3 examples/langfuse_txt_demo.py --allow-leaks
The long TXT demo uses Groq llama-3.1-8b-instant by default when GROQ_API_KEY is set. It asks the LLM to append the suffix " | checked" to each non-empty line, then checks locally whether any known raw values remain after masking.
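The local verification step can be pictured as a plain substring check. The sketch below is an illustration under assumptions, not the demo's actual code: `find_leaks` is a hypothetical helper, and the case-insensitive comparison is a design choice made here.

```python
from typing import Iterable, List


def find_leaks(masked_text: str, known_raw_values: Iterable[str]) -> List[str]:
    """Return the known raw values that still appear in the masked text."""
    haystack = masked_text.casefold()
    # Empty strings are skipped because they would match any text.
    return [value for value in known_raw_values if value and value.casefold() in haystack]
```

An empty result means none of the listed values survived masking. It does not prove the text is free of PII, only that the values known in advance are gone.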
Download files
File details
Details for the file any_lang_anonymizer-2026.6.5.post1.tar.gz.
File metadata
- Download URL: any_lang_anonymizer-2026.6.5.post1.tar.gz
- Upload date:
- Size: 10.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1e5daec974fafadbd1d317786579bc86d74e6bb9a24eb06054493c1d0e9ac18f |
| MD5 | 239c77890a520a4db8f3a003a6ff6b0c |
| BLAKE2b-256 | 8849475e07f153f59d3d080976a3165c4d7fc5127243b2a6beeb480d6547186f |
File details
Details for the file any_lang_anonymizer-2026.6.5.post1-py3-none-any.whl.
File metadata
- Download URL: any_lang_anonymizer-2026.6.5.post1-py3-none-any.whl
- Upload date:
- Size: 12.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 355aa8661fd3aebf3500bf94cb23979fe7dcee215d4befa60d862549a54801bf |
| MD5 | 8809bff7dcc0b31c2299e9ac9aa09e1b |
| BLAKE2b-256 | 90f4d2fe303a84322e4384ba2b9d36487e1c71f8c852c1de286566370b172014 |