
Lightweight taint tracking for LLM pipelines — label secrets at entry, block them at unsafe sinks

Project description

llm-taint

Lightweight taint tracking for LLM pipelines.

Label secrets (API keys, tokens, passwords) at the point they enter your system. Any attempt to send a tainted value to an unsafe sink — logs, HTTP responses, tool outputs — raises an exception immediately, before the data ever leaves.

import os
from llm_taint import taint, check_sink, scrub

api_key = taint(os.environ["OPENAI_API_KEY"], label="openai_api_key")

# TaintedStr is a transparent str subclass — works everywhere str does
assert isinstance(api_key, str)
assert api_key == os.environ["OPENAI_API_KEY"]

# Safe representation for logging
print(scrub(api_key))   # "[REDACTED:openai_api_key]"

# This raises TaintViolationError: secret 'openai_api_key' reached sink 'log'
check_sink(api_key, sink="log")

Zero required dependencies. Pure Python stdlib.


Why this matters for LLM applications

LLM applications are uniquely exposed to secret leakage:

  • Tool outputs are injected directly into the model context — a tainted value in a tool result means the key is in the model's input window.
  • Error messages from failed API calls often contain the request headers, including auth tokens.
  • Logging in async agent loops is verbose by necessity; a single f-string is enough to leak a key.
  • Prompt injection attacks may try to exfiltrate secrets by causing them to appear in generated text.

Classical taint tracking from compiler security research, applied to the LLM stack.


Installation

pip install llm-taint

Usage

Labeling secrets

from llm_taint import taint

# At startup / config load — before any processing
openai_key  = taint(os.environ["OPENAI_API_KEY"],  label="openai_api_key")
db_password = taint(config["db_password"],          label="db_password")

Checking sinks

from llm_taint import check_sink

# Before logging any value that might be tainted
user_input = request.json["message"]
check_sink(user_input, sink="log")   # safe if untainted

# Before including values in tool results
check_sink(tool_output, sink="tool_result")  # raises if tainted

# Unsafe sinks (raise on tainted input):
#   "log", "http_response", "tool_result", "error_message", "websocket"

# Safe sinks (always allowed):
#   "llm_prompt", "vault", "encrypted"

Scrubbing for safe output

from llm_taint import scrub, scrub_dict

# Single value
logger.info("Using key: %s", scrub(api_key))  # "Using key: [REDACTED:openai_api_key]"

# Whole config dict — safe to log
safe_config = scrub_dict({"api_key": api_key, "model": "gpt-4"})
logger.debug("Config: %s", safe_config)

Automatic log scrubbing

Install the filter once at startup — all log records are scrubbed automatically from that point on:

from llm_taint.logger import install_taint_filter
install_taint_filter()  # call before any logging

import logging
logger = logging.getLogger("myapp")

api_key = taint("sk-abc123", label="openai_key")
logger.info("Using key: %s", api_key)
# Output: "Using key: [REDACTED:openai_key]"

Environment variable tainting

The POSIX problem: on Linux/macOS, os.environ stores bytes internally and strips the TaintedStr subclass on every read. Use taint_env_secrets + get_tainted_env to work around this:

import os
from llm_taint import TaintedStr, taint_env_secrets, get_tainted_env

# Call once at startup
taint_env_secrets(dict(os.environ))

# Later — use get_tainted_env instead of os.environ for sensitive vars
key = get_tainted_env("OPENAI_API_KEY")
assert isinstance(key, TaintedStr)  # True, even on Linux/macOS
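To see the underlying round-trip in isolation (stdlib only, no llm_taint needed), note what happens to any str subclass written into `os.environ`:

```python
import os

class LabeledStr(str):
    """A str subclass standing in for a tainted string."""

os.environ["DEMO_SECRET"] = LabeledStr("sk-demo")

# On POSIX systems, os.environ encodes the value to bytes on write and
# decodes back to a plain str on read; the subclass does not survive.
value = os.environ["DEMO_SECRET"]
print(type(value))  # <class 'str'> on Linux/macOS, not LabeledStr
```

This is why an in-process registry, rather than the environment itself, has to be the source of truth for tainted env vars.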

taint_env_secrets automatically taints 25+ common secret env var names (OpenAI, Anthropic, AWS, Stripe, database URLs, etc.). Add your own:

from llm_taint import add_secret_env_key
add_secret_env_key("MY_COMPANY_API_KEY")

Registering custom sinks

from llm_taint import add_safe_sink, add_unsafe_sink

add_unsafe_sink("kafka_topic")   # treat as unsafe
add_safe_sink("hsm_module")      # treat as safe

How it works

TaintedStr is a str subclass that carries a _taint_label attribute. It is transparent to all normal string operations — isinstance, equality, concatenation, formatting — but the label travels with it.

os.environ["API_KEY"] ──taint()──▶ TaintedStr("sk-...", label="api_key")
                                            │
                        ┌───────────────────┼───────────────────┐
                        ▼                   ▼                   ▼
                  safe sink             unsafe sink         scrub()
               (vault/encrypted)         (log/http)       "[REDACTED:...]"
                   ✓ allowed           ✗ TaintViolation
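A minimal sketch of how such a label-carrying str subclass can be built in pure Python. This illustrates the general technique, not llm-taint's actual source; in this simple version, propagation through operations must be wired up explicitly (here only `__add__`), whereas the real library may cover more cases:

```python
class TaintedStrSketch(str):
    """str subclass carrying a taint label (illustrative sketch)."""

    def __new__(cls, value, label):
        self = super().__new__(cls, value)
        self._taint_label = label
        return self

    # Plain str operations return plain str, so label propagation
    # through e.g. concatenation has to be done explicitly:
    def __add__(self, other):
        return TaintedStrSketch(str.__add__(self, other), self._taint_label)

key = TaintedStrSketch("sk-abc123", "openai_api_key")
assert isinstance(key, str)            # transparent to isinstance checks
assert key == "sk-abc123"              # transparent to equality
assert (key + "!")._taint_label == "openai_api_key"  # label survives __add__
assert type("prefix: " + key) is str   # but this sketch drops it on str + tainted
```

Because `str` is immutable, the label has to be attached in `__new__` rather than `__init__`.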

The POSIX env registry (_env_taint_registry) is an in-process dict that survives the os.environ bytes round-trip — it's the authoritative source for tainted env vars on Linux/macOS.


Built-in unsafe sinks

Sink            Rationale
log             Secrets must never appear in log files
http_response   Secrets must never be returned to callers
tool_result     Tool outputs are injected into model context
error_message   Error strings often end up in logs or responses
websocket       Streaming output to clients

Running tests

pip install llm-taint[dev]
pytest

License

MIT



Download files

Download the file for your platform.

Source Distribution

llm_taint-0.1.0.tar.gz (15.5 kB, source)

Built Distribution


llm_taint-0.1.0-py3-none-any.whl (10.1 kB, Python 3)

File details

Details for the file llm_taint-0.1.0.tar.gz.

File metadata

  • Download URL: llm_taint-0.1.0.tar.gz
  • Size: 15.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: Hatch/1.16.5 cpython/3.12.3 HTTPX/0.28.1

File hashes

Hashes for llm_taint-0.1.0.tar.gz
Algorithm     Hash digest
SHA256        bfd2aa99bd6b1e441c0b1e10d79ae94d78c03c84855a70bc6f51e9bb6c40a1aa
MD5           17070399c02c1ea953fcc1577cf8eebc
BLAKE2b-256   f057d4306f098ce5bb935acd8e8b76ecf0cd27562d5393384ab0287122e8b3cd


File details

Details for the file llm_taint-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: llm_taint-0.1.0-py3-none-any.whl
  • Size: 10.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: Hatch/1.16.5 cpython/3.12.3 HTTPX/0.28.1

File hashes

Hashes for llm_taint-0.1.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        922705480d7365cb9c046289bd2e775c597d326fba16df16ed4742e2ad6fffdf
MD5           19c579606ed31299dcb40be3a9c7a782
BLAKE2b-256   8046cd5f6e391850de46f0d01ce85995b394f14692d673f2071b44caca19f797

