Automatic prompt version control for LLM applications
promptvc · Python SDK
Drop in two lines of code and every LLM call is captured, versioned, and observable from the PromptVC dashboard.
Table of contents
- How it works
- Installation
- Quick start
- Configuration
- Integrations
- Context & metadata
- Conversations
- Named prompt assets
- Custom spans (no instrumentor)
- PII redaction
- Testing
- Environment variables reference
How it works
PromptVC is built on OpenTelemetry and the OpenInference semantic conventions.
- promptvc.configure_otel() registers a custom OTel span exporter that forwards LLM traces to the PromptVC ingest API.
- An OpenInference instrumentor (one per framework) monkey-patches your LLM client and emits standardised OTel spans automatically, so no manual instrumentation is needed.
- The backend clusters spans into versioned prompt assets, tracks drift, and surfaces the diff view in the dashboard.
Installation
pip install promptvc-sdk
Requires Python ≥ 3.10.
Install the OpenInference instrumentor for the framework(s) you use:
# OpenAI
pip install openinference-instrumentation-openai
# Anthropic
pip install openinference-instrumentation-anthropic
# LiteLLM (covers 100+ providers)
pip install openinference-instrumentation-litellm
# LangChain / LangGraph
pip install openinference-instrumentation-langchain
# Google ADK
pip install openinference-instrumentation-google-adk
Optional — PII redaction:
pip install 'promptvc-sdk[privacy]'
python -m spacy download en_core_web_md # or your preferred model
Quick start
import promptvc
from openinference.instrumentation.openai import OpenAIInstrumentor
# 1. Wire up OTel → PromptVC
promptvc.configure_otel(
api_key="pvc_live_xxx", # or set PROMPTVC_API_KEY
service="my-app",
env="production",
)
# 2. Instrument your LLM client (call once at startup)
OpenAIInstrumentor().instrument()
# 3. Use your client as normal — calls are captured automatically
import openai
client = openai.OpenAI()
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
That's it. Every subsequent client.chat.completions.create call — in any file, any function — is captured without further changes.
Configuration
promptvc.configure_otel()
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | PROMPTVC_API_KEY env var | Your PromptVC API key |
| service | str | "default" | Logical name for this application |
| env | str | "development" | "development" · "staging" · "production" |
| backend_url | str | https://ingest.promptvc.io | Override ingest endpoint |
| debug | bool | False | Print exporter activity to stderr |
Call configure_otel() before calling any instrumentor's .instrument().
Integrations
OpenAI
Works with openai.OpenAI, openai.AsyncOpenAI, and any OpenAI-compatible client (Azure OpenAI, OpenRouter, etc.).
import promptvc
from openinference.instrumentation.openai import OpenAIInstrumentor
promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")
OpenAIInstrumentor().instrument()
import openai
client = openai.OpenAI()
# Non-streaming
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "You are a concise assistant."},
{"role": "user", "content": "What is a vector database?"},
],
)
print(response.choices[0].message.content)
# Streaming — fully supported
stream = client.chat.completions.create(
model="gpt-4o-mini",
stream=True,
messages=[{"role": "user", "content": "Tell me a joke."}],
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Async clients work identically — just use openai.AsyncOpenAI() and await.
Anthropic
Works with anthropic.Anthropic and anthropic.AsyncAnthropic.
import promptvc
from openinference.instrumentation.anthropic import AnthropicInstrumentor
promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")
AnthropicInstrumentor().instrument()
import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-sonnet-4-5",
max_tokens=1024,
system="You are a helpful assistant.",
messages=[{"role": "user", "content": "Explain transformers briefly."}],
)
print(response.content[0].text)
LiteLLM
Instruments every litellm.completion / litellm.acompletion call, covering 100+ providers through a single integration.
import promptvc
from openinference.instrumentation.litellm import LiteLLMInstrumentor
promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")
LiteLLMInstrumentor().instrument()
import litellm
response = litellm.completion(
model="openai/gpt-4o-mini",
messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
LangChain
No callbacks needed — LangChainInstrumentor auto-patches every LangChain provider, chain type, and invocation pattern (invoke, stream, ainvoke, astream, batch).
import promptvc
from openinference.instrumentation.langchain import LangChainInstrumentor
promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")
LangChainInstrumentor().instrument()
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage
llm = ChatOpenAI(model="gpt-4o-mini")
response = llm.invoke([
SystemMessage(content="You are a concise assistant."),
HumanMessage(content="What is a binary search tree?"),
])
print(response.content)
LCEL chains are captured transparently:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
chain = (
ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant."),
("human", "{question}"),
])
| ChatOpenAI(model="gpt-4o-mini")
| StrOutputParser()
)
result = chain.invoke({"question": "What is RAG?"})
Google ADK
Instruments every model call made by an ADK agent. No callbacks needed.
import promptvc
from openinference.instrumentation.google_adk import GoogleADKInstrumentor
promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")
GoogleADKInstrumentor().instrument()
from google.adk.agents import Agent
root_agent = Agent(
name="my-agent",
model="gemini-2.0-flash",
instruction="You are a helpful assistant.",
)
Works with any ADK-supported model backend including LiteLLM-proxied models (gpt-4o-mini, claude-*, etc.).
Context & metadata
Attach arbitrary metadata to every LLM call made within a block — useful for user-level analytics, A/B testing, and multi-tenant tracing.
with promptvc.context(user_id="u_123", tier="pro", feature="chat"):
    response = client.chat.completions.create(...)
Contexts nest — inner keys override outer keys for the same name:
with promptvc.context(user_id="u_123"):
    with promptvc.context(feature="summarizer"):
        response = client.chat.completions.create(...)
        # captured with user_id="u_123", feature="summarizer"
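The nesting rule can be illustrated with a stdlib-only sketch. This is not PromptVC's implementation: `metadata_context` and `current_metadata` are hypothetical names, and using `contextvars` is an assumption about one reasonable way to get this behaviour.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical stand-in for the SDK's internal metadata store.
_metadata = contextvars.ContextVar("metadata", default={})


@contextmanager
def metadata_context(**kwargs):
    """Merge kwargs over the enclosing metadata for the duration of the block."""
    merged = {**_metadata.get(), **kwargs}  # inner keys win on collision
    token = _metadata.set(merged)
    try:
        yield merged
    finally:
        _metadata.reset(token)  # restore the outer metadata on exit


def current_metadata():
    """Return a copy of the metadata active at the call site."""
    return dict(_metadata.get())


with metadata_context(user_id="u_123", feature="chat"):
    with metadata_context(feature="summarizer"):
        inner = current_metadata()
    outer = current_metadata()

print(inner)  # {'user_id': 'u_123', 'feature': 'summarizer'}
print(outer)  # {'user_id': 'u_123', 'feature': 'chat'}
```

Because the outer value is restored when the inner block exits, an override never leaks into sibling calls.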
Conversations
Group multi-turn calls under a shared conversation_id so the full dialogue is linked in the dashboard.
with promptvc.conversation() as conv_id:
    r1 = client.chat.completions.create(...)
    r2 = client.chat.completions.create(...)
    # r1 and r2 share the same conversation_id
Pass an explicit ID to resume an existing conversation:
with promptvc.conversation(conversation_id="existing-id"):
    ...
Named prompt assets
Automatic call-site capture
By default, PromptVC walks the call stack on every LLM span and records:
- File — the source file that initiated the call
- Function — the enclosing Python function name
- Line — the exact line number
- Fingerprint — a stable hash of file + function + source text used for version tracking
Different callers of the same shared LLM wrapper appear as separate entries in the dashboard automatically, with no decorators required.
def generate_summary(text: str) -> str:
    # Call site captured automatically
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarise: {text}"}],
    )
    return response.choices[0].message.content
Tip: Put LLM calls inside named Python functions rather than at module level so the call site shows a meaningful function name in the dashboard.
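For intuition about what call-site capture involves, the sketch below walks the stack with the standard `inspect` module and hashes the result. It is an illustration only: `capture_call_site` is a hypothetical helper, and the fingerprint here hashes the calling line's text, which may differ from the source text PromptVC actually uses.

```python
import hashlib
import inspect


def capture_call_site(depth=1):
    """Describe the frame `depth` levels above this function."""
    frame = inspect.stack()[depth]
    source = "".join(frame.code_context or [])  # calling line, if available
    digest = hashlib.sha256(
        f"{frame.filename}:{frame.function}:{source.strip()}".encode()
    ).hexdigest()
    return {
        "file": frame.filename,
        "function": frame.function,
        "line": frame.lineno,
        "fingerprint": digest[:16],
    }


def shared_llm_wrapper():
    # depth=2 skips this wrapper and reports whoever called it
    return capture_call_site(depth=2)


def generate_summary():
    return shared_llm_wrapper()


def parse_invoice():
    return shared_llm_wrapper()


summary_site = generate_summary()
invoice_site = parse_invoice()
print(summary_site["function"], invoice_site["function"])
# generate_summary parse_invoice
```

Both functions go through the same wrapper, yet each yields its own function name and fingerprint, which is the behaviour described above for shared wrappers.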
@promptvc.observe — explicit asset names
Give a prompt a stable, human-readable name in the dashboard:
@promptvc.observe(name="invoice-parser")
def parse_invoice(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": INVOICE_PROMPT},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content
Version tracking
PromptVC automatically identifies prompt versions using your system prompt as the version signal. Two calls with the same system prompt — even with different user messages — are grouped under the same version. When you change the system prompt, a new version is created and the diff is surfaced in the dashboard.
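The grouping rule can be sketched with a hash over the system prompt. This is a simplification under stated assumptions: `version_id` is a hypothetical helper, and the real backend clusters spans server-side, possibly with fuzzier matching than an exact hash.

```python
import difflib
import hashlib


def version_id(messages):
    """Hash the system prompt only, so user messages do not affect the version."""
    system_text = "\n".join(
        m["content"] for m in messages if m["role"] == "system"
    )
    return hashlib.sha256(system_text.encode("utf-8")).hexdigest()[:12]


call_a = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is a vector database?"},
]
call_b = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is RAG?"},
]
call_c = [
    {"role": "system", "content": "You are a concise, friendly assistant."},
    {"role": "user", "content": "What is RAG?"},
]

print(version_id(call_a) == version_id(call_b))  # True: same system prompt
print(version_id(call_a) == version_id(call_c))  # False: system prompt changed

# A line-level diff between the two system prompts
diff = list(difflib.unified_diff(
    [call_a[0]["content"]],
    [call_c[0]["content"]],
    fromfile="v1",
    tofile="v2",
    lineterm="",
))
print("\n".join(diff))
```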
Custom spans (no instrumentor)
If there is no OpenInference instrumentor for your framework or HTTP client, use promptvc.generation() — a clean context manager that handles all OTel span creation for you.
import promptvc
promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")
SYSTEM_PROMPT = "You are a concise assistant."
USER_MESSAGE = "What is a hash table?"
with promptvc.generation(
    model="gpt-4o-mini",
    provider="openai",
    system=SYSTEM_PROMPT,
    user=USER_MESSAGE,
) as gen:
    reply, prompt_tokens, completion_tokens = my_llm_client(SYSTEM_PROMPT, USER_MESSAGE)
    gen.set_output(reply, input_tokens=prompt_tokens, output_tokens=completion_tokens)
set_output() records the response text and optional token counts. Call it before the with block exits. If you don't call it, the span is still closed cleanly — just without output attributes.
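The "closed cleanly either way" behaviour is what a `try`/`finally` inside a context manager gives you. The sketch below is a stdlib stand-in, not the SDK's code; `GenerationRecord` and `generation_sketch` are hypothetical names.

```python
from contextlib import contextmanager


class GenerationRecord:
    """Hypothetical stand-in for the object yielded by the context manager."""

    def __init__(self, model):
        self.attributes = {"model": model}
        self.closed = False

    def set_output(self, text, input_tokens=None, output_tokens=None):
        self.attributes["output"] = text
        if input_tokens is not None:
            self.attributes["input_tokens"] = input_tokens
        if output_tokens is not None:
            self.attributes["output_tokens"] = output_tokens


@contextmanager
def generation_sketch(model):
    record = GenerationRecord(model)
    try:
        yield record
    finally:
        record.closed = True  # runs whether or not set_output() was called


with generation_sketch("gpt-4o-mini") as with_output:
    with_output.set_output(
        "A hash table maps keys to values.", input_tokens=12, output_tokens=9
    )

with generation_sketch("gpt-4o-mini") as without_output:
    pass  # set_output() never called

print(with_output.attributes)
print(without_output.closed, "output" in without_output.attributes)  # True False
```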
Multi-turn conversations
Pass the full message list via messages instead of the system/user shorthand:
with promptvc.generation(
    model="claude-sonnet-4-5",
    provider="anthropic",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": turn_1},
        {"role": "assistant", "content": reply_1},
        {"role": "user", "content": turn_2},
    ],
) as gen:
    reply = my_client.call(...)
    gen.set_output(reply)
Parameters
| Parameter | Type | Description |
|---|---|---|
| model | str | Model identifier, e.g. "gpt-4o-mini" |
| system | str | System prompt shorthand |
| user | str | User message shorthand |
| messages | list[dict] | Full message list; overrides system/user if provided |
| provider | str | Provider name, e.g. "openai" / "anthropic" |
| name | str | OTel span name (default "promptvc.generation") |
| metadata | dict | Arbitrary key/value pairs attached as promptvc.* span attributes |
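The precedence between messages and the system/user shorthand can be pictured as a small normalisation step. The helper below is hypothetical (`resolve_messages` is not an SDK function) and only illustrates the documented rule that a full message list wins.

```python
def resolve_messages(system=None, user=None, messages=None):
    """Return the message list a span would record."""
    if messages is not None:
        return list(messages)  # full list wins over the shorthand
    resolved = []
    if system is not None:
        resolved.append({"role": "system", "content": system})
    if user is not None:
        resolved.append({"role": "user", "content": user})
    return resolved


shorthand = resolve_messages(
    system="You are a concise assistant.",
    user="What is a hash table?",
)
explicit = resolve_messages(
    system="ignored when messages is given",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(shorthand)
print(explicit)  # [{'role': 'user', 'content': 'Hello!'}]
```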
set_output() parameters
| Parameter | Type | Description |
|---|---|---|
| text | str | The model's plain-text response |
| input_tokens | int | Prompt token count (optional, for cost tracking) |
| output_tokens | int | Completion token count (optional, for cost tracking) |
See examples/custom_span.py for a complete runnable example.
PII redaction
PromptVC can strip sensitive data from prompt and response content before it leaves your process: detected entities are replaced with placeholders, so they are never sent to the PromptVC backend in plain text. Redaction runs on the OTel export path using Microsoft Presidio, with spaCy as the NLP engine.
Installation
pip install 'promptvc-sdk[privacy]'
python -m spacy download en_core_web_md # or en_core_web_sm for a smaller footprint
Enabling redaction
Pass redact_pii=True to configure_otel(). That's all that's required — a sensible set of entity types is included by default.
promptvc.configure_otel(
api_key="pvc_live_xxx",
service="my-app",
redact_pii=True,
)
Each detected value is replaced with an <ENTITY_TYPE> placeholder in the span before it is serialised and POSTed to the ingest API, so the detected values never travel over the network.
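The ordering matters: redaction happens first, serialisation second. The sketch below shows that ordering with a toy redactor. It is an assumption-laden illustration: a single email regex stands in for Presidio, and `redact` and `build_payload` are hypothetical names, not SDK functions.

```python
import json
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact(text):
    """Toy redactor: one email pattern stands in for Presidio here."""
    return EMAIL.sub("<EMAIL_ADDRESS>", text)


def build_payload(span_attributes):
    """Redact every string attribute, then serialise the result."""
    cleaned = {
        key: redact(value) if isinstance(value, str) else value
        for key, value in span_attributes.items()
    }
    return json.dumps(cleaned)


span_attributes = {
    "input": "Reply to user@example.com about the refund.",
    "output": "Sure, I will email user@example.com today.",
    "input_tokens": 11,
}
payload = build_payload(span_attributes)
print(payload)
print("user@example.com" in payload)  # False
```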
Default entity types
| Entity | Example input | Placeholder |
|---|---|---|
| PERSON | Sarah Johnson | <PERSON> |
| EMAIL_ADDRESS | user@example.com | <EMAIL_ADDRESS> |
| PHONE_NUMBER | 800-555-0199 | <PHONE_NUMBER> |
| US_SSN | 123-45-6789 | <US_SSN> |
| CREDIT_CARD | 4111 1111 1111 1111 | <CREDIT_CARD> |
| IP_ADDRESS | 192.168.0.1 | <IP_ADDRESS> |
| LOCATION | 42 Elm Street | <LOCATION> |
Customising entity types
Replace the default list entirely by passing pii_entities:
promptvc.configure_otel(
api_key="pvc_live_xxx",
redact_pii=True,
pii_entities=[
"CREDIT_CARD",
"US_SSN",
"US_BANK_NUMBER",
"IBAN_CODE",
"EMAIL_ADDRESS",
"PHONE_NUMBER",
"PERSON",
"LOCATION",
"IP_ADDRESS",
"URL",
"US_PASSPORT",
"US_DRIVER_LICENSE",
"MEDICAL_LICENSE",
"DATE_TIME",
],
)
Full list of supported entity types: Presidio supported entities.
Confidence threshold
Presidio assigns each detection a confidence score (0–1). Detections below pii_score_threshold are ignored. Lower values catch more — at the cost of more false positives.
promptvc.configure_otel(
api_key="pvc_live_xxx",
redact_pii=True,
pii_score_threshold=0.4, # default is 0.5
)
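To see the trade-off, the sketch below filters a hand-written list of detections at two thresholds. The `Detection` class and the scores are invented for illustration; they are not Presidio objects or real Presidio output.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    entity: str
    start: int
    end: int
    score: float


def apply_threshold(detections, threshold):
    """Keep detections scoring at or above the threshold."""
    return [d for d in detections if d.score >= threshold]


def redact_spans(text, detections):
    # Replace from the end of the string so earlier offsets stay valid
    for d in sorted(detections, key=lambda d: d.start, reverse=True):
        text = text[:d.start] + f"<{d.entity}>" + text[d.end:]
    return text


text = "Call Jordan at 800-555-0199"
detections = [
    Detection("PERSON", 5, 11, 0.45),         # low-confidence name match
    Detection("PHONE_NUMBER", 15, 27, 0.75),  # high-confidence pattern match
]

print(redact_spans(text, apply_threshold(detections, 0.5)))
# Call Jordan at <PHONE_NUMBER>
print(redact_spans(text, apply_threshold(detections, 0.4)))
# Call <PERSON> at <PHONE_NUMBER>
```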
Custom regex patterns
Add extra patterns (e.g. internal IDs, account numbers) via redact_patterns. Each entry is a Python regex string; any match is replaced with <REDACTED>.
promptvc.configure_otel(
api_key="pvc_live_xxx",
redact_pii=True,
redact_patterns=[r"EMP-\d{6}", r"ACC-[A-Z0-9]{8}"],
)
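It can help to try patterns against sample text with plain `re` before adding them to the configuration. The helper below is a hypothetical stand-in (`apply_custom_patterns` is not an SDK function) that mimics the documented behaviour of replacing every match with <REDACTED>.

```python
import re


def apply_custom_patterns(text, patterns, placeholder="<REDACTED>"):
    """Replace every match of every pattern with the placeholder."""
    for pattern in patterns:
        text = re.sub(pattern, placeholder, text)
    return text


patterns = [r"EMP-\d{6}", r"ACC-[A-Z0-9]{8}"]
raw = "Employee EMP-004217 moved funds from ACC-9F3K22ZB."
print(apply_custom_patterns(raw, patterns))
# Employee <REDACTED> moved funds from <REDACTED>.

# Without word boundaries a pattern can match part of a longer token
print(apply_custom_patterns("ref EMP-0042179", [r"EMP-\d{6}"]))
# ref <REDACTED>9
print(apply_custom_patterns("ref EMP-0042179", [r"\bEMP-\d{6}\b"]))
# ref EMP-0042179
```

Adding `\b` anchors is a design choice: it avoids partial matches at the cost of missing IDs embedded in longer strings.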
Using a lighter spaCy model
en_core_web_md (the default) generally gives better recall than en_core_web_sm. Swap to en_core_web_sm if memory is constrained:
python -m spacy download en_core_web_sm
promptvc.configure_otel(
api_key="pvc_live_xxx",
redact_pii=True,
pii_spacy_model="en_core_web_sm",
)
Previewing redaction locally
Before sending any traffic, you can verify what will be redacted by calling redact_text directly:
from promptvc.privacy import redact_text
from promptvc.config import get_config
cfg = get_config()
raw = "Hi, my name is Sarah Johnson. Email: sarah.johnson@example.com"
redacted = redact_text(
text=raw,
entities=cfg.pii_entities,
language=cfg.pii_language,
threshold=cfg.pii_score_threshold,
extra_patterns=cfg.redact_patterns,
spacy_model=cfg.pii_spacy_model,
)
print(redacted)
# Hi, my name is <PERSON>. Email: <EMAIL_ADDRESS>
See examples/pii_redaction.py for a complete runnable example.
Testing
Disable the SDK entirely in test environments so no spans are exported:
PROMPTVC_DISABLED=1 pytest
Or in code:
import os
os.environ["PROMPTVC_DISABLED"] = "1"
import promptvc # configure_otel becomes a no-op
Run the SDK's own tests:
poetry run pytest # unit tests only
poetry run pytest -m integration # integration tests (requires API keys)
Environment variables reference
| Variable | configure_otel() param | Description |
|---|---|---|
| PROMPTVC_API_KEY | api_key | Your API key |
| PROMPTVC_SERVICE | service | Service name |
| PROMPTVC_ENV | env | Deployment environment |
| PROMPTVC_BACKEND_URL | backend_url | Override ingest endpoint |
| PROMPTVC_DEBUG | debug | Set to 1 to enable debug logging |
| PROMPTVC_DISABLED | (none) | Set to 1 to disable the SDK entirely |
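One plausible way to read the table is as a lookup order: explicit argument first, then environment variable, then default. The sketch below encodes that order. The precedence is an assumption (the README only states that api_key defaults to PROMPTVC_API_KEY), and `resolve_setting` and `sdk_disabled` are hypothetical helpers.

```python
import os

# Defaults taken from the Configuration table; api_key has no default.
DEFAULTS = {
    "service": "default",
    "env": "development",
    "backend_url": "https://ingest.promptvc.io",
    "debug": False,
}
ENV_VARS = {
    "api_key": "PROMPTVC_API_KEY",
    "service": "PROMPTVC_SERVICE",
    "env": "PROMPTVC_ENV",
    "backend_url": "PROMPTVC_BACKEND_URL",
    "debug": "PROMPTVC_DEBUG",
}


def resolve_setting(name, explicit=None, environ=None):
    """Assumed order: explicit argument, then env var, then default."""
    environ = os.environ if environ is None else environ
    if explicit is not None:
        return explicit
    raw = environ.get(ENV_VARS[name])
    if raw is not None:
        return raw == "1" if name == "debug" else raw
    return DEFAULTS.get(name)


def sdk_disabled(environ=None):
    """True when PROMPTVC_DISABLED is set to 1."""
    environ = os.environ if environ is None else environ
    return environ.get("PROMPTVC_DISABLED") == "1"


fake_env = {"PROMPTVC_ENV": "staging", "PROMPTVC_DEBUG": "1"}
print(resolve_setting("env", environ=fake_env))                         # staging
print(resolve_setting("env", explicit="production", environ=fake_env))  # production
print(resolve_setting("service", environ=fake_env))                     # default
print(resolve_setting("debug", environ=fake_env))                       # True
```

Passing a plain dict as `environ` keeps the helper easy to test without touching the real process environment.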
License
MIT