promptlab
Git for your prompts. Version, diff, validate, and A/B test LLM prompts with confidence.
pip install promptlab
The Problem
Your prompts are the most important code you write, but you manage them as raw strings:
- ❌ Edited inline, no version history
- ❌ "Did that last prompt change work?" → No way to know
- ❌ Typo in a variable → silent hallucination
- ❌ A/B testing prompts → custom scripts every time
- ❌ Deploying a bad prompt → rollback is copy-paste
The Solution
$ promptlab init
Created .prompts/ directory
$ promptlab list
┌──────────────────────┬─────────┬────────────────────────┐
│ Prompt │ Version │ Last Modified │
├──────────────────────┼─────────┼────────────────────────┤
│ system_prompt │ v3 │ 2026-04-28 14:30 │
│ search_tool_prompt │ v2 │ 2026-04-25 09:15 │
│ summarizer │ v5 │ 2026-05-01 16:42 │
└──────────────────────┴─────────┴────────────────────────┘
$ promptlab diff system_prompt v2 v3
You are a helpful assistant.
- Be concise. Maximum 2 sentences.
+ Be thorough. Provide detailed explanations with examples.
+ Always cite sources when making factual claims.
Quick Start
1. Initialize
promptlab init
# Creates .prompts/ directory with schema
2. Create a prompt
from promptlab import Prompt

# Define a typed prompt template
system = Prompt(
    name="order_analyst",
    template="""You are an order analyst assistant.
The user will ask about maintenance order {{order_id}}.
Plant: {{plant}}
Priority: {{priority}}

Rules:
- Be concise and factual
- Always include the order number in your response
- If unsure, say so
""",
    variables={"order_id": str, "plant": str, "priority": str},
    metadata={"author": "team-alpha", "model": "gpt-4o"},
)
# Render with type validation:
rendered = system.render(order_id="4002310", plant="1010", priority="High")
# Raises TypeError if you pass wrong types or miss a variable:
system.render(order_id=123) # TypeError: 'order_id' must be str, got int
system.render(order_id="4002310") # TypeError: missing required variable 'plant'
3. Version your prompts
from promptlab import PromptStore
store = PromptStore(".prompts")
# Save a new version (auto-increments)
store.save(system) # → v1
# Edit and save again
system.template += "\n- Always be polite"
store.save(system) # → v2
# Load a specific version
v1 = store.load("order_analyst", version=1)
latest = store.load("order_analyst") # latest version
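Under the hood, each save lands in a per-prompt directory as `vN.yaml` (see File Structure below). A minimal sketch of how auto-incrementing version numbers can be derived from those files — illustrative only, not promptlab's actual internals:

```python
import pathlib
import tempfile

# Sketch: the next version is one past the highest existing vN.yaml
# (assumes the .prompts/<name>/vN.yaml layout shown in File Structure).
def next_version(prompt_dir: pathlib.Path) -> int:
    versions = [int(p.stem[1:]) for p in prompt_dir.glob("v*.yaml")]
    return max(versions, default=0) + 1

with tempfile.TemporaryDirectory() as d:
    prompt_dir = pathlib.Path(d) / "order_analyst"
    prompt_dir.mkdir()
    for _ in range(2):
        v = next_version(prompt_dir)
        (prompt_dir / f"v{v}.yaml").write_text(f"version: {v}\n")
    print(sorted(p.name for p in prompt_dir.glob("v*.yaml")))
    # → ['v1.yaml', 'v2.yaml']
```

Because versions are plain files, deleting or hand-editing a `vN.yaml` is visible in git history like any other change.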
4. Diff versions
from promptlab import diff_prompts
changes = diff_prompts(store, "order_analyst", v1=1, v2=2)
print(changes)
# + - Always be polite
Or from CLI:
promptlab diff order_analyst v1 v2
5. A/B test prompts
from promptlab import ABTest
test = ABTest(
    prompt_name="summarizer",
    version_a=3,
    version_b=4,
    dataset="eval/summarize_test.jsonl",
    metric="length",  # or a custom function
)
results = test.run()
print(results)
# Version A (v3): avg_length=45.2, avg_latency=1.2s
# Version B (v4): avg_length=32.1, avg_latency=0.9s
# Winner: v4 (shorter, faster)
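The `metric` argument also accepts a custom function. A hypothetical example — the callable's exact signature is an assumption, since promptlab's contract for custom metrics isn't documented above:

```python
# Hypothetical custom metric for ABTest(metric=citation_rate).
# Assumes promptlab calls it with the raw model output string
# (signature is an assumption, not part of the documented API).
def citation_rate(output: str) -> float:
    """Score 1.0 if the output cites a source, else 0.0."""
    return 1.0 if "[source:" in output.lower() else 0.0

print(citation_rate("Paris is in France [Source: CIA Factbook]"))  # → 1.0
print(citation_rate("Paris is in France"))                         # → 0.0
```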
6. Deploy
# Promote a version to "production"
store.promote("order_analyst", version=2, env="production")
# In your app:
prompt = store.load("order_analyst", env="production")
CLI Commands
promptlab init # Initialize prompt store
promptlab list # List all prompts with versions
promptlab show <name> # Show latest prompt content
promptlab show <name> --version 3 # Show specific version
promptlab diff <name> v1 v2 # Diff two versions
promptlab validate # Validate all prompts (types, variables)
promptlab promote <name> v3 production # Promote version to env
promptlab history <name> # Show version history
promptlab export <name> --format json # Export prompt as JSON
File Structure
.prompts/
├── prompts.yaml              # Registry of all prompts
├── order_analyst/
│   ├── v1.yaml               # Version 1
│   ├── v2.yaml               # Version 2 (current)
│   └── metadata.yaml         # Author, model, env mappings
├── summarizer/
│   ├── v1.yaml
│   ├── v2.yaml
│   ├── v3.yaml
│   └── metadata.yaml
└── eval/
    └── summarize_test.jsonl  # A/B test datasets
Each version file:
# .prompts/order_analyst/v2.yaml
version: 2
created: "2026-04-28T14:30:00Z"
template: |
  You are an order analyst assistant.
  The user will ask about maintenance order {{order_id}}.
  ...
variables:
  order_id: { type: str, required: true }
  plant:    { type: str, required: true }
  priority: { type: str, required: false, default: "Medium" }
metadata:
  author: team-alpha
  model: gpt-4o
  note: "Added politeness rule"
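A minimal sketch of how `{{variable}}` substitution with per-variable defaults can work, mirroring the `variables` block above — this is illustrative, not promptlab's actual renderer:

```python
import re

# Sketch of {{variable}} substitution with defaults (illustrative only).
# `variables` mirrors the YAML spec above: each entry may carry a default.
def render(template: str, variables: dict, **values) -> str:
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        spec = variables[name]
        if name in values:
            return str(values[name])
        if "default" in spec:
            return str(spec["default"])
        raise TypeError(f"missing required variable {name!r}")
    return re.sub(r"\{\{(\w+)\}\}", substitute, template)

variables = {
    "order_id": {"type": str, "required": True},
    "priority": {"type": str, "required": False, "default": "Medium"},
}
print(render("Order {{order_id}}, priority {{priority}}", variables,
             order_id="4002310"))
# → Order 4002310, priority Medium
```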
Features
| Feature | Description |
|---|---|
| Versioning | Auto-incrementing versions, full history |
| Type Safety | Pydantic-validated variables, catches typos |
| Diffing | Compare any two versions, unified diff format |
| A/B Testing | Run evaluations with custom metrics |
| Environments | Promote versions to dev/staging/production |
| Validation | CI-ready: promptlab validate catches broken prompts |
| Git-friendly | YAML files, meaningful diffs in PRs |
| Templates | Jinja2-style {{variable}} with defaults |
| Export | JSON, YAML, or raw text output |
| Zero LLM deps | Core has no LLM SDK dependency |
CI Integration
# .github/workflows/prompts.yml
- name: Validate prompts
  run: promptlab validate
  # Fails if: missing variables, type errors, broken templates
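A complete workflow wrapping that step might look like the following — job names, action versions, and the Python version are illustrative, not prescribed by promptlab:

```yaml
# Hypothetical full workflow; step above is the only promptlab-specific part
name: Validate prompts
on: [pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install promptlab
      - name: Validate prompts
        run: promptlab validate
```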
Contributing
git clone https://github.com/naveenkumarbaskaran/promptlab.git
cd promptlab
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
License
MIT