Version, diff, and A/B test your LLM prompts — like git for prompts.

promptlab

Git for your prompts. Version, diff, validate, and A/B test LLM prompts with confidence.

pip install promptlab-ai

The Problem

Your prompts are the most important code you write, but you manage them as raw strings:

  • ❌ Edited inline, no version history
  • ❌ "Did that last prompt change work?" → No way to know
  • ❌ Typo in a variable → silent hallucination
  • ❌ A/B testing prompts → custom scripts every time
  • ❌ Deploying a bad prompt → rollback is copy-paste

The Solution

$ promptlab init
Created .prompts/ directory

$ promptlab list
┌──────────────────────┬─────────┬────────────────────────┐
│ Prompt               │ Version │ Last Modified          │
├──────────────────────┼─────────┼────────────────────────┤
│ system_prompt        │ v3      │ 2026-04-28 14:30       │
│ search_tool_prompt   │ v2      │ 2026-04-25 09:15       │
│ summarizer           │ v5      │ 2026-05-01 16:42       │
└──────────────────────┴─────────┴────────────────────────┘

$ promptlab diff system_prompt v2 v3
  You are a helpful assistant.
- Be concise. Maximum 2 sentences.
+ Be thorough. Provide detailed explanations with examples.
+ Always cite sources when making factual claims.

Quick Start

1. Initialize

promptlab init
# Creates .prompts/ directory with schema

2. Create a prompt

from promptlab import Prompt

# Define a typed prompt template
system = Prompt(
    name="order_analyst",
    template="""You are an order analyst assistant.

The user will ask about maintenance order {{order_id}}.
Plant: {{plant}}
Priority: {{priority}}

Rules:
- Be concise and factual
- Always include the order number in your response
- If unsure, say so
""",
    variables={"order_id": str, "plant": str, "priority": str},
    metadata={"author": "team-alpha", "model": "gpt-4o"},
)

# Render with type validation:
rendered = system.render(order_id="4002310", plant="1010", priority="High")

# Each call below raises TypeError on its own (wrong type, or a missing variable):
system.render(order_id=123, plant="1010", priority="High")  # TypeError: 'order_id' must be str, got int
system.render(order_id="4002310")  # TypeError: missing required variable 'plant'
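
To make the validation step concrete, here is a minimal standalone sketch of how `{{variable}}` rendering with type checks could work. It does not use promptlab and is not its implementation; `render_template` and `PLACEHOLDER` are hypothetical names chosen for illustration, and the error messages only mirror the ones shown above.

```python
import re

# Matches {{name}} with optional inner whitespace (assumed syntax for this sketch).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, variables: dict[str, type], **values: object) -> str:
    """Fill {{name}} placeholders after checking names and types."""
    # Every placeholder in the template must be declared.
    undeclared = set(PLACEHOLDER.findall(template)) - set(variables)
    if undeclared:
        raise ValueError(f"undeclared variables in template: {sorted(undeclared)}")
    # Every declared variable must be supplied with the declared type.
    for name, expected in variables.items():
        if name not in values:
            raise TypeError(f"missing required variable '{name}'")
        if not isinstance(values[name], expected):
            got = type(values[name]).__name__
            raise TypeError(f"'{name}' must be {expected.__name__}, got {got}")
    # Reject values that were passed but never declared (likely typos).
    unknown = set(values) - set(variables)
    if unknown:
        raise TypeError(f"unexpected variables: {sorted(unknown)}")
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)


text = render_template(
    "Order {{order_id}} at plant {{plant}}",
    {"order_id": str, "plant": str},
    order_id="4002310",
    plant="1010",
)
print(text)
```

Checking declared names before substituting is what turns a misspelled variable into an immediate error instead of a prompt that silently contains the wrong text.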

3. Version your prompts

from promptlab import PromptStore

store = PromptStore(".prompts")

# Save a new version (auto-increments)
store.save(system)  # → v1

# Edit and save again
system.template += "- Always be polite\n"  # template already ends with a newline
store.save(system)  # → v2

# Load a specific version
v1 = store.load("order_analyst", version=1)
latest = store.load("order_analyst")  # latest version
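
The auto-increment behaviour can be pictured with a small file-backed store. This is only a sketch under stated assumptions: `TinyStore` is a hypothetical stand-in, it writes JSON rather than YAML to stay within the standard library, and it is not how `PromptStore` is actually implemented.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


class TinyStore:
    """Hypothetical versioned store; one v<N>.json file per saved version."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def _versions(self, name: str) -> list[int]:
        # Collect existing version numbers from file names like v3.json.
        folder = self.root / name
        if not folder.is_dir():
            return []
        return sorted(int(p.stem[1:]) for p in folder.glob("v*.json"))

    def save(self, name: str, template: str) -> int:
        # The next version is one past the highest existing version.
        versions = self._versions(name)
        version = versions[-1] + 1 if versions else 1
        folder = self.root / name
        folder.mkdir(parents=True, exist_ok=True)
        payload = {"version": version, "template": template}
        (folder / f"v{version}.json").write_text(json.dumps(payload))
        return version

    def load(self, name: str, version: Optional[int] = None) -> str:
        versions = self._versions(name)
        if not versions:
            raise KeyError(name)
        chosen = versions[-1] if version is None else version
        path = self.root / name / f"v{chosen}.json"
        return json.loads(path.read_text())["template"]


with tempfile.TemporaryDirectory() as tmp:
    demo = TinyStore(tmp)
    first = demo.save("order_analyst", "Be concise.\n")
    second = demo.save("order_analyst", "Be concise.\n- Always be polite\n")
    print(first, second)
    print(demo.load("order_analyst", version=1))
```

Because earlier version files are never rewritten, loading an old version is just reading an old file.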

4. Diff versions

from promptlab import diff_prompts

changes = diff_prompts(store, "order_analyst", v1=1, v2=2)
print(changes)
# + - Always be polite

Or from CLI:

promptlab diff order_analyst v1 v2
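
For intuition, a unified diff of two template strings can be produced with Python's standard `difflib`. This sketch is independent of promptlab; `diff_text` is a hypothetical helper, and whether `diff_prompts` uses `difflib` internally is an assumption, not something this README states.

```python
import difflib


def diff_text(old: str, new: str, old_label: str = "v1", new_label: str = "v2") -> str:
    """Return a unified diff of two prompt templates as a single string."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",  # inputs have no trailing newlines after splitlines()
    )
    return "\n".join(lines)


old = "You are a helpful assistant.\nBe concise.\n"
new = "You are a helpful assistant.\nBe concise.\n- Always be polite\n"
print(diff_text(old, new))
```

Added lines are prefixed with `+`, so a rule that itself starts with `-` shows up as `+- Always be polite`.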

5. A/B test prompts

from promptlab import ABTest

test = ABTest(
    prompt_name="summarizer",
    version_a=3,
    version_b=4,
    dataset="eval/summarize_test.jsonl",
    metric="length",  # or custom function
)

results = test.run()
print(results)
# Version A (v3): avg_length=45.2, avg_latency=1.2s
# Version B (v4): avg_length=32.1, avg_latency=0.9s
# Winner: v4 (shorter, faster)
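
Conceptually, an A/B run scores two variants on the same inputs with the same metric and compares the averages. The sketch below shows that loop without any LLM call: `ab_compare`, `verbose`, `terse`, and `word_count` are hypothetical names, and the two stub functions merely stand in for model calls made with each prompt version.

```python
from statistics import mean
from typing import Callable


def ab_compare(
    generate_a: Callable[[str], str],
    generate_b: Callable[[str], str],
    inputs: list[str],
    metric: Callable[[str], float],
) -> dict[str, float]:
    """Score both variants on the same inputs and return the mean metric for each."""
    score_a = mean(metric(generate_a(x)) for x in inputs)
    score_b = mean(metric(generate_b(x)) for x in inputs)
    return {"a": score_a, "b": score_b}


# Stubs standing in for real model calls with prompt version A and version B.
def verbose(text: str) -> str:
    return f"Here is a detailed summary of the text: {text}"


def terse(text: str) -> str:
    return text[:20]


def word_count(output: str) -> float:
    """Custom metric: number of whitespace-separated words."""
    return float(len(output.split()))


scores = ab_compare(
    verbose,
    terse,
    ["The pump failed twice last week.", "Order closed."],
    word_count,
)
winner = min(scores, key=scores.get)  # lower is better for a length metric
print(scores, winner)
```

Whether lower or higher wins depends on the metric, so the comparison direction is best made explicit rather than assumed.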

6. Deploy

# Promote a version to "production"
store.promote("order_analyst", version=2, env="production")

# In your app:
prompt = store.load("order_analyst", env="production")
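
One way to think about promotion is as a pointer from an environment name to a version number, which makes rollback a matter of re-pointing. The sketch below illustrates only that idea; `EnvPointers` is hypothetical, and how promptlab actually records environment mappings is not shown here.

```python
class EnvPointers:
    """Hypothetical (prompt, env) -> version mapping kept in memory."""

    def __init__(self) -> None:
        self._pointers: dict[tuple[str, str], int] = {}

    def promote(self, name: str, version: int, env: str) -> None:
        # Promotion overwrites the pointer; the version files stay untouched.
        self._pointers[(name, env)] = version

    def resolve(self, name: str, env: str) -> int:
        try:
            return self._pointers[(name, env)]
        except KeyError:
            raise LookupError(f"no version of '{name}' promoted to '{env}'") from None


pointers = EnvPointers()
pointers.promote("order_analyst", version=2, env="production")
pointers.promote("order_analyst", version=1, env="production")  # rollback: re-point to v1
print(pointers.resolve("order_analyst", "production"))
```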

CLI Commands

promptlab init                          # Initialize prompt store
promptlab list                          # List all prompts with versions
promptlab show <name>                   # Show latest prompt content
promptlab show <name> --version 3       # Show specific version
promptlab diff <name> v1 v2             # Diff two versions
promptlab validate                      # Validate all prompts (types, variables)
promptlab promote <name> v3 production  # Promote version to env
promptlab history <name>                # Show version history
promptlab export <name> --format json   # Export prompt as JSON

File Structure

.prompts/
├── prompts.yaml          # Registry of all prompts
├── order_analyst/
│   ├── v1.yaml           # Version 1
│   ├── v2.yaml           # Version 2 (current)
│   └── metadata.yaml     # Author, model, env mappings
├── summarizer/
│   ├── v1.yaml
│   ├── v2.yaml
│   ├── v3.yaml
│   └── metadata.yaml
└── eval/
    └── summarize_test.jsonl  # A/B test datasets

Each version file:

# .prompts/order_analyst/v2.yaml
version: 2
created: "2026-04-28T14:30:00Z"
template: |
  You are an order analyst assistant.
  The user will ask about maintenance order {{order_id}}.
  ...
variables:
  order_id: { type: str, required: true }
  plant: { type: str, required: true }
  priority: { type: str, required: true, default: "Medium" }
metadata:
  author: team-alpha
  model: gpt-4o
  note: "Added politeness rule"

Features

Feature        Description
Versioning     Auto-incrementing versions, full history
Type Safety    Pydantic-validated variables, catches typos
Diffing        Compare any two versions, unified diff format
A/B Testing    Run evaluations with custom metrics
Environments   Promote versions to dev/staging/production
Validation     CI-ready: promptlab validate catches broken prompts
Git-friendly   YAML files, meaningful diffs in PRs
Templates      Jinja2-style {{variable}} with defaults
Export         JSON, YAML, or raw text output
Zero LLM deps  Core has no LLM SDK dependency

CI Integration

# .github/workflows/prompts.yml
- name: Validate prompts
  run: promptlab validate
  # Fails if: missing variables, type errors, broken templates
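
As a rough picture of what a "missing variables" or "broken templates" check involves, the sketch below compares the placeholders used in a template with the declared variable names. It is an assumption-laden illustration, not the logic behind `promptlab validate`; `check_template` and `VAR_PATTERN` are hypothetical names.

```python
import re

# Matches {{name}} with optional inner whitespace (assumed syntax for this sketch).
VAR_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def check_template(template: str, declared: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the template passes."""
    used = set(VAR_PATTERN.findall(template))
    problems = [f"undeclared variable '{n}'" for n in sorted(used - declared)]
    problems += [f"declared but unused variable '{n}'" for n in sorted(declared - used)]
    # A crude well-formedness check: opening and closing delimiters must pair up.
    if template.count("{{") != template.count("}}"):
        problems.append("unbalanced '{{' / '}}' delimiters")
    return problems


problems = check_template("Order {{order_id}} at {{plnat}}", {"order_id", "plant"})
for problem in problems:
    print(problem)
```

In a CI job, a script like this would exit with a non-zero status whenever the list is non-empty, which is what makes the step fail the build.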

Contributing

git clone https://github.com/naveenkumarbaskaran/promptlab.git
cd promptlab
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest

License

MIT
