
Self-hosted PII redaction step for LLM observability pipelines.


any-lang-anonymizer

Drop-in PII redaction for teams that already use Langfuse.

pip install any-lang-anonymizer

from langfuse import Langfuse
from pii_redactor.integrations.langfuse import make_mask

langfuse = Langfuse(mask=make_mask())

That is the main API. make_mask() recursively scans every string Langfuse sends through the mask callback, including nested input, output, metadata, messages, tool calls, and custom fields.
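To make the recursive scan concrete, here is a minimal sketch of how a mask can walk nested data and rewrite only the strings. It is not the library's implementation: `mask_recursive` and `redact_text` are hypothetical names, and the email regex is only a stand-in for the real PII model.

```python
import re
from typing import Any, Callable

# Hypothetical stand-in for the PII model: it only recognizes email addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_text(text: str) -> str:
    """Replace every email address in text with a placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


def mask_recursive(data: Any, redact: Callable[[str], str] = redact_text) -> Any:
    """Return a copy of data with every string value passed through redact."""
    if isinstance(data, str):
        return redact(data)
    if isinstance(data, dict):
        # Keys are kept as-is; only values are scanned.
        return {key: mask_recursive(value, redact) for key, value in data.items()}
    if isinstance(data, list):
        return [mask_recursive(item, redact) for item in data]
    if isinstance(data, tuple):
        return tuple(mask_recursive(item, redact) for item in data)
    # Numbers, booleans, None, and unknown types pass through unchanged.
    return data
```

Because the function builds new containers instead of mutating its input, the original payload in your application stays untouched and only the copy handed to the tracing client is redacted.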

Supported Integrations

| Framework | Status | How to use |
| --- | --- | --- |
| Langfuse SDK | Ready | Langfuse(mask=make_mask()) |
| LangChain / LangGraph | Ready | make_langfuse_callback() with config={"callbacks": [...]} |
| Pydantic AI | Coming soon | Planned example |
| OpenAI SDK | Coming soon | Planned example |

How It Works

any-lang-anonymizer runs the Bards AI ONNX PII model locally. Raw observability data is redacted in your app before it is sent to Langfuse.

The model is downloaded from Hugging Face on first use and cached locally by huggingface-hub.

Optional model config:

export PII_MODEL_ID=bardsai/eu-pii-anonimization-multilang
export PII_MODEL_CACHE_DIR=/path/to/model-cache
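As an illustration of how these two variables might be resolved, the sketch below reads them with a fallback to the default model ID. `ModelConfig` and `load_model_config` are hypothetical names, and the fallback behavior (empty value means "use the default") is an assumption, not the package's documented logic.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Default model ID as named in this README.
DEFAULT_MODEL_ID = "bardsai/eu-pii-anonimization-multilang"


@dataclass(frozen=True)
class ModelConfig:
    model_id: str
    cache_dir: Optional[str]  # None means "use the huggingface-hub default cache"


def load_model_config(env: Mapping[str, str] = os.environ) -> ModelConfig:
    """Resolve model settings from environment variables (assumed semantics)."""
    # Treat unset and blank values the same way: fall back to the default.
    model_id = env.get("PII_MODEL_ID", "").strip() or DEFAULT_MODEL_ID
    cache_dir = env.get("PII_MODEL_CACHE_DIR", "").strip() or None
    return ModelConfig(model_id=model_id, cache_dir=cache_dir)
```

Values resolved this way could be passed to `huggingface_hub.snapshot_download(repo_id=..., cache_dir=...)` to pre-fetch the model in a build step, so the first request in production does not wait on a download.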

You can narrow or exclude JSON paths if needed:

langfuse = Langfuse(
    mask=make_mask(
        include_paths=["input", "output", "metadata", "messages.*.content", "tool_calls.*.args"],
        exclude_paths=["metadata.trace_id", "metadata.model", "usage"],
    )
)

By default, no path config is needed. Everything text-like is scanned.
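The path patterns can be read as dot-separated segments where `*` stands for any single key or list index. The sketch below shows one plausible way to evaluate them. `path_matches` and `should_scan` are hypothetical helpers, and two behaviors are assumptions rather than documented facts: a pattern also covers everything nested under it, and an exclude match wins over an include match.

```python
from typing import Optional, Sequence


def path_matches(path: str, pattern: str) -> bool:
    """True if pattern matches path or one of its ancestors, segment by segment."""
    path_parts = path.split(".")
    pattern_parts = pattern.split(".")
    if len(pattern_parts) > len(path_parts):
        return False
    # zip stops at the shorter pattern, so the pattern acts as a prefix.
    return all(p == "*" or p == s for p, s in zip(pattern_parts, path_parts))


def should_scan(
    path: str,
    include_paths: Optional[Sequence[str]] = None,
    exclude_paths: Sequence[str] = (),
) -> bool:
    """Decide whether the string stored at path should be scanned for PII."""
    if any(path_matches(path, pattern) for pattern in exclude_paths):
        return False  # assumption: exclusion takes precedence
    if include_paths is None:
        return True  # no include list: everything text-like is scanned
    return any(path_matches(path, pattern) for pattern in include_paths)
```

Under these assumptions, `messages.0.content` is scanned, `messages.0.role` is not, and `metadata.trace_id` is skipped even though `metadata` is included.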

LangChain / LangGraph

Langfuse traces LangChain and LangGraph through a callback handler. Use the same callback, but create the Langfuse client with the anonymizer mask first.

pip install 'any-lang-anonymizer[langchain]'

from langchain.agents import create_agent
from pii_redactor.integrations.langchain import make_langfuse_callback

langfuse_handler = make_langfuse_callback()
agent = create_agent(model="groq:llama-3.1-8b-instant", tools=[])

agent.invoke(
    {"messages": [{"role": "user", "content": "Jan Kowalski, jan.kowalski@example.com"}]},
    config={"callbacks": [langfuse_handler]},
)

create_agent runs on LangGraph internally, so this is the shortest LangGraph-backed agent path. A runnable example is in examples/langchain_langgraph_langfuse.py.

Model

The default model is bardsai/eu-pii-anonimization-multilang. The project is Apache-2.0 licensed, matching the model license. The model files are not bundled in this package.

Local Playground

The FastAPI playground is a demo tool, not part of the core library.

python3 -m pip install -e '.[server]'
python3 -m uvicorn playground.app:app --reload

Open:

http://127.0.0.1:8000

Examples

Runnable demos live in examples/.

python3 -m pip install -e '.[demo]'
PII_LLM_PROVIDER=demo python3 examples/langfuse_demo.py

For real LLM demo traces:

export LANGFUSE_PUBLIC_KEY=pk-lf-...
export LANGFUSE_SECRET_KEY=sk-lf-...
export LANGFUSE_BASE_URL=https://cloud.langfuse.com
export GROQ_API_KEY=gsk_...
python3 examples/langfuse_txt_demo.py --allow-leaks

The long TXT demo uses Groq llama-3.1-8b-instant by default when GROQ_API_KEY is set. It asks the LLM to append "| checked" to each non-empty line, then checks locally whether any known raw values remain after masking.
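The local verification step amounts to searching the masked text for values you know were in the raw input. A minimal sketch of that idea follows. `find_leaks` is a hypothetical helper, not the demo's actual code, and the case-insensitive comparison is an assumption.

```python
from typing import Iterable, List


def find_leaks(masked_text: str, known_raw_values: Iterable[str]) -> List[str]:
    """Return the known raw values that still appear in the masked text."""
    haystack = masked_text.casefold()
    # Empty values are skipped because "" is a substring of every string.
    return [value for value in known_raw_values if value and value.casefold() in haystack]


masked = "[PERSON], [EMAIL] | checked\nCall 600 100 200 | checked"
known = ["Jan Kowalski", "jan.kowalski@example.com", "600 100 200"]
leaks = find_leaks(masked, known)  # only the phone number survived masking
```

A plain substring check like this only catches verbatim survivors. A value the LLM reformatted (for example, a phone number with different spacing) would need normalization before comparison.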
