promptvc · Python SDK

Automatic prompt version control for LLM applications.
Drop in two lines of code and every LLM call is captured, versioned, and observable from the PromptVC dashboard.


How it works

PromptVC is built on OpenTelemetry and the OpenInference semantic conventions.

  1. promptvc.configure_otel() registers a custom OTel span exporter that forwards LLM traces to the PromptVC ingest API.
  2. An OpenInference instrumentor (one per framework) monkey-patches your LLM client and emits standardised OTel spans automatically — no manual instrumentation needed.
  3. The backend clusters spans into versioned prompt assets, tracks drift, and surfaces the diff view in the dashboard.
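
To make step 2 concrete, here is a minimal sketch of the monkey-patching idea in plain Python. It is not the OpenInference implementation: `ToyClient`, `instrument`, and `captured_spans` are hypothetical names invented for this illustration, and a plain list stands in for the OTel span exporter.

```python
import functools
import time

captured_spans = []  # stand-in for an OTel span exporter (assumption)


class ToyClient:
    """Hypothetical LLM client used only for this illustration."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


def instrument(cls, method_name: str) -> None:
    """Replace cls.method_name with a wrapper that records one span per call."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        result = original(self, *args, **kwargs)
        captured_spans.append({
            "name": f"{cls.__name__}.{method_name}",
            "input": args[0] if args else kwargs.get("prompt"),
            "output": result,
            "duration_s": time.perf_counter() - start,
        })
        return result

    setattr(cls, method_name, wrapper)


# Patch once at startup; existing call sites stay unchanged.
instrument(ToyClient, "complete")
reply = ToyClient().complete("hello")
```

Because the patch is applied to the class, every instance created afterwards is covered, which is why a single `.instrument()` call at startup is enough.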

Installation

pip install promptvc-sdk

Requires Python ≥ 3.10.

Install the OpenInference instrumentor for the framework(s) you use:

# OpenAI
pip install openinference-instrumentation-openai

# Anthropic
pip install openinference-instrumentation-anthropic

# LiteLLM (covers 100+ providers)
pip install openinference-instrumentation-litellm

# LangChain / LangGraph
pip install openinference-instrumentation-langchain

# Google ADK
pip install openinference-instrumentation-google-adk

Optional — PII redaction:

pip install 'promptvc-sdk[privacy]'
python -m spacy download en_core_web_md   # or your preferred model

Quick start

import promptvc
from openinference.instrumentation.openai import OpenAIInstrumentor

# 1. Wire up OTel → PromptVC
promptvc.configure_otel(
    api_key="pvc_live_xxx",   # or set PROMPTVC_API_KEY
    service="my-app",
    env="production",
)

# 2. Instrument your LLM client (call once at startup)
OpenAIInstrumentor().instrument()

# 3. Use your client as normal — calls are captured automatically
import openai

client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

That's it. Every subsequent client.chat.completions.create call — in any file, any function — is captured without further changes.


Configuration

promptvc.configure_otel()

Parameter   | Type | Default                    | Description
api_key     | str  | PROMPTVC_API_KEY env var   | Your PromptVC API key
service     | str  | "default"                  | Logical name for this application
env         | str  | "development"              | One of "development", "staging", "production"
backend_url | str  | https://ingest.promptvc.io | Override ingest endpoint
debug       | bool | False                      | Print exporter activity to stderr

Call configure_otel() before calling any instrumentor's .instrument().


Integrations

OpenAI

Works with openai.OpenAI, openai.AsyncOpenAI, and any OpenAI-compatible client (Azure OpenAI, OpenRouter, etc.).

import promptvc
from openinference.instrumentation.openai import OpenAIInstrumentor

promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")
OpenAIInstrumentor().instrument()

import openai

client = openai.OpenAI()

# Non-streaming
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What is a vector database?"},
    ],
)
print(response.choices[0].message.content)

# Streaming — fully supported
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    stream=True,
    messages=[{"role": "user", "content": "Tell me a joke."}],
)
for chunk in stream:
    # Some chunks (e.g. a trailing usage chunk) carry no choices; skip them
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Async clients work identically — just use openai.AsyncOpenAI() and await.


Anthropic

Works with anthropic.Anthropic and anthropic.AsyncAnthropic.

import promptvc
from openinference.instrumentation.anthropic import AnthropicInstrumentor

promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")
AnthropicInstrumentor().instrument()

import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Explain transformers briefly."}],
)
print(response.content[0].text)

LiteLLM

Instruments every litellm.completion / litellm.acompletion call, covering 100+ providers through a single integration.

import promptvc
from openinference.instrumentation.litellm import LiteLLMInstrumentor

promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")
LiteLLMInstrumentor().instrument()

import litellm

response = litellm.completion(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

LangChain

No callbacks needed — LangChainInstrumentor auto-patches every LangChain provider, chain type, and invocation pattern (invoke, stream, ainvoke, astream, batch).

import promptvc
from openinference.instrumentation.langchain import LangChainInstrumentor

promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")
LangChainInstrumentor().instrument()

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

llm = ChatOpenAI(model="gpt-4o-mini")
response = llm.invoke([
    SystemMessage(content="You are a concise assistant."),
    HumanMessage(content="What is a binary search tree?"),
])
print(response.content)

LCEL chains are captured transparently:

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

chain = (
    ChatPromptTemplate.from_messages([
        ("system", "You are a helpful assistant."),
        ("human", "{question}"),
    ])
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)

result = chain.invoke({"question": "What is RAG?"})

Google ADK

Instruments every model call made by an ADK agent. No callbacks needed.

import promptvc
from openinference.instrumentation.google_adk import GoogleADKInstrumentor

promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")
GoogleADKInstrumentor().instrument()

from google.adk.agents import Agent

root_agent = Agent(
    name="my_agent",  # ADK agent names must be valid Python identifiers (no hyphens)
    model="gemini-2.0-flash",
    instruction="You are a helpful assistant.",
)

Works with any ADK-supported model backend including LiteLLM-proxied models (gpt-4o-mini, claude-*, etc.).


Context & metadata

Attach arbitrary metadata to every LLM call made within a block — useful for user-level analytics, A/B testing, and multi-tenant tracing.

with promptvc.context(user_id="u_123", tier="pro", feature="chat"):
    response = client.chat.completions.create(...)

Contexts nest — inner keys override outer keys for the same name:

with promptvc.context(user_id="u_123"):
    with promptvc.context(feature="summarizer"):
        response = client.chat.completions.create(...)
        # captured with user_id="u_123", feature="summarizer"
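
The nesting rule can be pictured with a small sketch built on the standard library's `contextvars`. This is an illustration of the merge semantics only, not the SDK's code; `context` here is a local stand-in and `current_metadata` is a hypothetical helper.

```python
import contextvars
from contextlib import contextmanager

# Holds the metadata active for the current block (empty outside any block).
_metadata = contextvars.ContextVar("metadata", default={})


@contextmanager
def context(**attrs):
    """Merge attrs over the active metadata for the duration of the block."""
    token = _metadata.set({**_metadata.get(), **attrs})
    try:
        yield
    finally:
        _metadata.reset(token)  # restore whatever the outer block had


def current_metadata() -> dict:
    return dict(_metadata.get())


with context(user_id="u_123", feature="chat"):
    with context(feature="summarizer"):
        inner = current_metadata()  # inner "feature" wins
    outer = current_metadata()      # outer value is restored on exit
after = current_metadata()          # nothing leaks out of the blocks
```

A context variable, rather than a module-level dict, keeps metadata separate across threads and asyncio tasks, which matters for multi-tenant tracing.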

Conversations

Group multi-turn calls under a shared conversation_id so the full dialogue is linked in the dashboard.

with promptvc.conversation() as conv_id:
    r1 = client.chat.completions.create(...)
    r2 = client.chat.completions.create(...)
    # r1 and r2 share the same conversation_id

Pass an explicit ID to resume an existing conversation:

with promptvc.conversation(conversation_id="existing-id"):
    ...

Named prompt assets

Automatic call-site capture

By default, PromptVC walks the call stack on every LLM span and records:

  • File — the source file that initiated the call
  • Function — the enclosing Python function name
  • Line — the exact line number
  • Fingerprint — a stable hash of file + function + source text used for version tracking

Different callers of the same shared LLM wrapper appear as separate entries in the dashboard automatically, with no decorators required.

def generate_summary(text: str) -> str:
    # Call site captured automatically
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarise: {text}"}],
    )
    return response.choices[0].message.content

Tip: Put LLM calls inside named Python functions rather than at module level so the call site shows a meaningful function name in the dashboard.
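
For intuition, the four recorded fields can be approximated with the standard `inspect` and `hashlib` modules. This sketch only inspects the immediate caller rather than walking the whole stack, and the exact fingerprint recipe (SHA-256 over file, function, and source text, truncated to 16 hex characters) is an assumption; `capture_call_site` is a hypothetical helper, not an SDK function.

```python
import hashlib
import inspect


def capture_call_site() -> dict:
    """Describe the function that called this helper (CPython only)."""
    frame = inspect.currentframe().f_back  # the caller's frame
    info = inspect.getframeinfo(frame)
    try:
        source = inspect.getsource(frame.f_code)
    except (OSError, TypeError):
        source = ""  # source unavailable, e.g. code run from a REPL or exec()
    digest = hashlib.sha256(
        f"{info.filename}\n{info.function}\n{source}".encode("utf-8")
    ).hexdigest()
    return {
        "file": info.filename,
        "function": info.function,
        "line": info.lineno,
        "fingerprint": digest[:16],
    }


def generate_summary() -> dict:
    return capture_call_site()


def generate_title() -> dict:
    return capture_call_site()


first = generate_summary()
second = generate_summary()
other = generate_title()
```

Two calls from the same function produce the same fingerprint, while a different caller produces a different one, which is the property that lets separate callers of a shared wrapper show up as separate entries.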

@promptvc.observe — explicit asset names

Give a prompt a stable, human-readable name in the dashboard:

@promptvc.observe(name="invoice-parser")
def parse_invoice(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": INVOICE_PROMPT},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

Version tracking

PromptVC automatically identifies prompt versions using your system prompt as the version signal. Two calls with the same system prompt — even with different user messages — are grouped under the same version. When you change the system prompt, a new version is created and the diff is surfaced in the dashboard.
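
The grouping rule can be sketched as hashing only the system messages of a call. The backend's real clustering is not documented here, so treat this as an illustration of the rule rather than the actual algorithm; `version_key` is a hypothetical helper.

```python
import hashlib


def version_key(messages: list) -> str:
    """Derive a version key from the system prompt only (assumed rule)."""
    system_text = "\n".join(
        m["content"] for m in messages if m["role"] == "system"
    )
    return hashlib.sha256(system_text.encode("utf-8")).hexdigest()[:12]


call_a = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is RAG?"},
]
call_b = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What is a vector database?"},
]
call_c = [
    {"role": "system", "content": "You are a verbose assistant."},
    {"role": "user", "content": "What is RAG?"},
]

same_version = version_key(call_a) == version_key(call_b)  # user text ignored
new_version = version_key(call_a) != version_key(call_c)   # system text changed
```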


Custom spans (no instrumentor)

If there is no OpenInference instrumentor for your framework or HTTP client, use promptvc.generation() — a clean context manager that handles all OTel span creation for you.

import promptvc

promptvc.configure_otel(api_key="pvc_live_xxx", service="my-app")

SYSTEM_PROMPT = "You are a concise assistant."
USER_MESSAGE  = "What is a hash table?"

with promptvc.generation(
    model="gpt-4o-mini",
    provider="openai",
    system=SYSTEM_PROMPT,
    user=USER_MESSAGE,
) as gen:
    # my_llm_client is a placeholder for your own client call
    reply, prompt_tokens, completion_tokens = my_llm_client(SYSTEM_PROMPT, USER_MESSAGE)
    gen.set_output(reply, input_tokens=prompt_tokens, output_tokens=completion_tokens)

set_output() records the response text and optional token counts. Call it before the with block exits. If you don't call it, the span is still closed cleanly — just without output attributes.
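
The lifecycle described above (the span closes on exit whether or not set_output() was called) can be modelled with a small context manager. This is a sketch of the behaviour only: the `Generation` class, the attribute names, and the `finished_spans` list are hypothetical stand-ins, not the SDK's internals.

```python
from contextlib import contextmanager
from typing import Optional

finished_spans = []  # stand-in for the span exporter (assumption)


class Generation:
    """Hypothetical handle yielded inside the with block."""

    def __init__(self, attributes: dict) -> None:
        self.attributes = attributes

    def set_output(
        self,
        text: str,
        input_tokens: Optional[int] = None,
        output_tokens: Optional[int] = None,
    ) -> None:
        self.attributes["output"] = text
        if input_tokens is not None:
            self.attributes["input_tokens"] = input_tokens
        if output_tokens is not None:
            self.attributes["output_tokens"] = output_tokens


@contextmanager
def generation(model: str, provider: Optional[str] = None,
               system: Optional[str] = None, user: Optional[str] = None):
    gen = Generation({"model": model, "provider": provider,
                      "system": system, "user": user})
    try:
        yield gen
    finally:
        # The span is closed on exit even if set_output() was never called.
        finished_spans.append(gen.attributes)


with generation(model="gpt-4o-mini", provider="openai",
                system="Be brief.", user="Hi") as gen:
    gen.set_output("Hello!", input_tokens=7, output_tokens=2)

with generation(model="gpt-4o-mini", user="No output recorded"):
    pass
```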

Multi-turn conversations

Pass the full message list via messages instead of the system/user shorthand:

with promptvc.generation(
    model="claude-sonnet-4-5",
    provider="anthropic",
    messages=[
        {"role": "system",    "content": SYSTEM_PROMPT},
        {"role": "user",      "content": turn_1},
        {"role": "assistant", "content": reply_1},
        {"role": "user",      "content": turn_2},
    ],
) as gen:
    reply = my_client.call(...)
    gen.set_output(reply)

Parameters

Parameter | Type       | Description
model     | str        | Model identifier, e.g. "gpt-4o-mini"
system    | str        | System prompt shorthand
user      | str        | User message shorthand
messages  | list[dict] | Full message list; overrides system/user if provided
provider  | str        | Provider name, e.g. "openai" / "anthropic"
name      | str        | OTel span name (default "promptvc.generation")
metadata  | dict       | Arbitrary key/value pairs attached as promptvc.* span attributes

set_output() parameters

Parameter     | Type | Description
text          | str  | The model's plain-text response
input_tokens  | int  | Prompt token count (optional, for cost tracking)
output_tokens | int  | Completion token count (optional, for cost tracking)

See examples/custom_span.py for a complete runnable example.


PII redaction

PromptVC can strip sensitive data from prompt and response content before it leaves your process: detected entities are replaced with placeholders, so their original values are not sent to the PromptVC backend. Redaction runs on the OTel export path using Microsoft Presidio, with spaCy as the NLP engine.

Installation

pip install 'promptvc-sdk[privacy]'
python -m spacy download en_core_web_md   # or en_core_web_sm for a smaller footprint

Enabling redaction

Pass redact_pii=True to configure_otel(). That's all that's required — a sensible set of entity types is included by default.

promptvc.configure_otel(
    api_key="pvc_live_xxx",
    service="my-app",
    redact_pii=True,
)

Each detected value is replaced with an <ENTITY_TYPE> placeholder in the span before it is serialised and POSTed to the ingest API, so the redacted values never travel over the network.

Default entity types

Entity        | Example input       | Placeholder
PERSON        | Sarah Johnson       | <PERSON>
EMAIL_ADDRESS | user@example.com    | <EMAIL_ADDRESS>
PHONE_NUMBER  | 800-555-0199        | <PHONE_NUMBER>
US_SSN        | 123-45-6789         | <US_SSN>
CREDIT_CARD   | 4111 1111 1111 1111 | <CREDIT_CARD>
IP_ADDRESS    | 192.168.0.1         | <IP_ADDRESS>
LOCATION      | 42 Elm Street       | <LOCATION>

Customising entity types

Replace the default list entirely by passing pii_entities:

promptvc.configure_otel(
    api_key="pvc_live_xxx",
    redact_pii=True,
    pii_entities=[
        "CREDIT_CARD",
        "US_SSN",
        "US_BANK_NUMBER",
        "IBAN_CODE",
        "EMAIL_ADDRESS",
        "PHONE_NUMBER",
        "PERSON",
        "LOCATION",
        "IP_ADDRESS",
        "URL",
        "US_PASSPORT",
        "US_DRIVER_LICENSE",
        "MEDICAL_LICENSE",
        "DATE_TIME",
    ],
)

Full list of supported entity types: Presidio supported entities.

Confidence threshold

Presidio assigns each detection a confidence score (0–1). Detections below pii_score_threshold are ignored. Lower values catch more — at the cost of more false positives.

promptvc.configure_otel(
    api_key="pvc_live_xxx",
    redact_pii=True,
    pii_score_threshold=0.4,   # default is 0.5
)
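
To see what the threshold changes, the sketch below applies it to two hand-written detections. The scores and offsets are made up for illustration (Presidio would compute them), and `Detection` and `apply_redactions` are hypothetical names, not part of Presidio or the SDK.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    entity: str
    start: int
    end: int
    score: float


def apply_redactions(text: str, detections: List[Detection],
                     threshold: float = 0.5) -> str:
    """Replace detections scoring at or above threshold with <ENTITY> tags."""
    kept = [d for d in detections if d.score >= threshold]
    # Replace from the end of the string so earlier offsets stay valid.
    for d in sorted(kept, key=lambda d: d.start, reverse=True):
        text = text[:d.start] + f"<{d.entity}>" + text[d.end:]
    return text


text = "Call Sam at 800-555-0199"
detections = [
    Detection("PERSON", 5, 8, 0.45),          # low-confidence name match
    Detection("PHONE_NUMBER", 12, 24, 0.75),  # high-confidence pattern match
]

strict = apply_redactions(text, detections, threshold=0.5)
loose = apply_redactions(text, detections, threshold=0.4)
# strict == "Call Sam at <PHONE_NUMBER>"
# loose  == "Call <PERSON> at <PHONE_NUMBER>"
```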

Custom regex patterns

Add extra patterns (e.g. internal IDs, account numbers) via redact_patterns. Each entry is a Python regex string; any match is replaced with <REDACTED>.

promptvc.configure_otel(
    api_key="pvc_live_xxx",
    redact_pii=True,
    redact_patterns=[r"EMP-\d{6}", r"ACC-[A-Z0-9]{8}"],
)
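
It can help to try patterns against sample text with the standard `re` module before deploying them. The sketch below assumes each pattern is applied with a plain substitution; how the SDK applies them internally is not shown here, and `apply_patterns` is a hypothetical helper.

```python
import re
from typing import List


def apply_patterns(text: str, patterns: List[str]) -> str:
    """Replace every match of every pattern with <REDACTED>."""
    for pattern in patterns:
        text = re.sub(pattern, "<REDACTED>", text)
    return text


sample = "Ticket from EMP-004217 about account ACC-9F3K2L0Q."
result = apply_patterns(sample, [r"EMP-\d{6}", r"ACC-[A-Z0-9]{8}"])
# result == "Ticket from <REDACTED> about account <REDACTED>."

# Without a word boundary, a pattern also matches the start of a longer ID:
partial = apply_patterns("EMP-1234567", [r"EMP-\d{6}"])
# partial == "<REDACTED>7"
```

If partial matches like the last one are a concern, anchoring the pattern with `\b` on both sides (for example `r"\bEMP-\d{6}\b"`) restricts it to whole tokens.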

Using a lighter spaCy model

en_core_web_md (the default) gives the best recall. Swap for en_core_web_sm if memory is constrained:

python -m spacy download en_core_web_sm

Then select it with pii_spacy_model:

promptvc.configure_otel(
    api_key="pvc_live_xxx",
    redact_pii=True,
    pii_spacy_model="en_core_web_sm",
)

Previewing redaction locally

Before sending any traffic, you can verify what will be redacted by calling redact_text directly:

from promptvc.privacy import redact_text
from promptvc.config import get_config

cfg = get_config()
raw = "Hi, my name is Sarah Johnson. Email: sarah.johnson@example.com"
redacted = redact_text(
    text=raw,
    entities=cfg.pii_entities,
    language=cfg.pii_language,
    threshold=cfg.pii_score_threshold,
    extra_patterns=cfg.redact_patterns,
    spacy_model=cfg.pii_spacy_model,
)
print(redacted)
# Hi, my name is <PERSON>. Email: <EMAIL_ADDRESS>

See examples/pii_redaction.py for a complete runnable example.


Testing

Disable the SDK entirely in test environments so no spans are exported:

PROMPTVC_DISABLED=1 pytest

Or in code:

import os
os.environ["PROMPTVC_DISABLED"] = "1"
import promptvc  # configure_otel becomes a no-op

Run the SDK's own tests:

poetry run pytest                   # unit tests only
poetry run pytest -m integration    # integration tests (requires API keys)

Environment variables reference

Variable             | configure_otel() param | Description
PROMPTVC_API_KEY     | api_key                | Your API key
PROMPTVC_SERVICE     | service                | Service name
PROMPTVC_ENV         | env                    | Deployment environment
PROMPTVC_BACKEND_URL | backend_url            | Override ingest endpoint
PROMPTVC_DEBUG       | debug                  | Set to 1 to enable debug logging
PROMPTVC_DISABLED    | (none)                 | Set to 1 to disable the SDK entirely
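
The mapping between parameters and environment variables can be pictured as a small resolution step. The sketch below assumes explicit arguments take precedence over environment variables, which in turn take precedence over the documented defaults; that ordering is an assumption, and `Settings` and `resolve_settings` are hypothetical names, not SDK functions.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: Optional[str]
    service: str
    env: str
    backend_url: str
    debug: bool
    disabled: bool


def resolve_settings(
    environ: Mapping[str, str],
    *,
    api_key: Optional[str] = None,
    service: Optional[str] = None,
    env: Optional[str] = None,
    backend_url: Optional[str] = None,
    debug: Optional[bool] = None,
) -> Settings:
    def pick(explicit, var, default):
        # Assumed precedence: explicit argument, then env var, then default.
        if explicit is not None:
            return explicit
        return environ.get(var, default)

    return Settings(
        api_key=pick(api_key, "PROMPTVC_API_KEY", None),
        service=pick(service, "PROMPTVC_SERVICE", "default"),
        env=pick(env, "PROMPTVC_ENV", "development"),
        backend_url=pick(backend_url, "PROMPTVC_BACKEND_URL",
                         "https://ingest.promptvc.io"),
        debug=debug if debug is not None
        else environ.get("PROMPTVC_DEBUG") == "1",
        disabled=environ.get("PROMPTVC_DISABLED") == "1",
    )


settings = resolve_settings(
    {"PROMPTVC_API_KEY": "pvc_live_xxx", "PROMPTVC_ENV": "staging"},
    service="my-app",
)
```

Passing the environment in as a mapping (for example `os.environ`) keeps the function easy to exercise in tests without touching the real process environment.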

License

MIT
