Git for prompts - version, diff, and test your LLM prompts across any model
PromptDiff
Git for prompts. Version your LLM prompts, diff them word-by-word, and run any version against Gemini, Groq, or any other LiteLLM-supported model — all from the terminal.
promptdiff commit summarizer "Summarize: {text}" -m "v1: baseline"
promptdiff commit summarizer "Summarize concisely in one sentence: {text}" -m "v2: tighter constraint"
promptdiff diff summarizer 1 2
promptdiff run summarizer 2 --model gemini/gemini-2.5-flash
Why PromptDiff
If you've ever overwritten a prompt and lost the version that actually worked, or pasted two prompt drafts into a doc just to eyeball what changed — PromptDiff is the tool for that. It treats prompts as versioned, diffable artifacts the same way Git treats code, and adds one thing Git can't: running any version against a real LLM and recording exactly what it produced, so you can trace output back to the exact prompt and model that generated it.
How it's different from other prompt tools
The prompt-tooling space has grown a lot — Promptfoo is a mature, YAML-config-based open-source tool with strong CI/eval integration, and platforms like PromptLayer, Langfuse, and Braintrust offer hosted, full-lifecycle prompt management with branching, RBAC, and observability.
PromptDiff isn't trying to compete with those. It's deliberately smaller: a local-first CLI for a single developer iterating on prompts, with zero config files, zero hosted accounts, and zero cost beyond free-tier LLM API usage. If you outgrow it — multiple collaborators, non-engineer prompt editors, compliance requirements — those other tools are the right next step. If you're a solo developer who just wants git commit-style version control for prompts without standing up infrastructure, PromptDiff is built for exactly that.
Features
- Version control for prompts: every commit is an immutable, numbered version (v1, v2, v3...) scoped per prompt
- Word-level diffs: see exactly which words changed between versions, rendered with color in your terminal (green additions, red strikethrough removals)
- Multi-model runs: execute any prompt version against multiple LLMs in a single command and compare cost, latency, and output side by side
- Output comparison: diff-output shows what two versions actually produced, not just what their text looks like
- Free-tier friendly: built and tested against Gemini and Groq's free tiers; rate-limit and quota failures are recorded as failed runs rather than crashing the tool
- Local-first: everything lives in a local SQLite database, no account or server required
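To make the word-level diff feature concrete, here is a minimal sketch built on Python's difflib. It is not PromptDiff's actual implementation: the function name word_diff and the [-...-] / {+...+} markers are assumptions chosen for illustration, standing in for the colored terminal rendering described above.

```python
import difflib


def word_diff(old: str, new: str) -> str:
    """Return a plain-text word-level diff of two prompt versions.

    Removed words are wrapped as [-...-], added words as {+...+}.
    (Hypothetical helper; the markers stand in for terminal colors.)
    """
    old_words = old.split()
    new_words = new.split()
    # autojunk=False disables difflib's "popular element" heuristic.
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.extend(old_words[i1:i2])
        if tag in ("delete", "replace"):
            parts.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            parts.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(parts)


if __name__ == "__main__":
    v1 = "Summarize this in one sentence: {text}"
    v2 = "Summarize concisely, max 20 words: {text}"
    print(word_diff(v1, v2))
```

Splitting on whitespace keeps punctuation attached to its word, so "sentence:" and "words:" are compared as single tokens. A real renderer would likely need to preserve line breaks as well.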
Installation
pip install promptdiff-cli
(Or, to run from source — see Development below.)
No API key is required to install or to use commit, log, and diff. You'll only need one for run and diff-output, since those actually call a model.
Quick start
The walkthrough below goes from creating a project to comparing outputs. Only the run and diff-output steps need an API key; see API keys below.
# Create a project and switch to it
promptdiff project create my-app
promptdiff use my-app
# Create a prompt and commit your first version
promptdiff add summarizer
promptdiff commit summarizer "Summarize this in one sentence: {text}" -m "v1: baseline"
# Edit and commit a new version
promptdiff commit summarizer "Summarize concisely, max 20 words: {text}" -m "v2: tighter constraint"
# See version history
promptdiff log summarizer
# See exactly what changed
promptdiff diff summarizer 1 2
# Run a version against a model (needs an API key — see below)
promptdiff run summarizer 2 --model gemini/gemini-2.5-flash
# Run the same version against multiple models at once
promptdiff run summarizer 2 --model gemini/gemini-2.5-flash --model groq/llama-3.3-70b-versatile
# Compare what two versions actually produced
promptdiff diff-output summarizer 1 2 --model gemini/gemini-2.5-flash
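If you want to drive the multi-model run from a script, the repeated --model flag can be assembled programmatically. The sketch below is an assumption-laden convenience wrapper, not part of PromptDiff: build_run_command is a hypothetical helper, and it only uses the run syntax shown above.

```python
import subprocess


def build_run_command(prompt: str, version: int, models: list[str]) -> list[str]:
    """Build the argv for `promptdiff run`, repeating --model once per model."""
    if not models:
        # Assumption: this sketch treats at least one --model as mandatory.
        raise ValueError("at least one model is required")
    argv = ["promptdiff", "run", prompt, str(version)]
    for model in models:
        argv += ["--model", model]
    return argv


if __name__ == "__main__":
    cmd = build_run_command(
        "summarizer",
        2,
        ["gemini/gemini-2.5-flash", "groq/llama-3.3-70b-versatile"],
    )
    # check=False: do not raise if the CLI exits with a non-zero status.
    subprocess.run(cmd, check=False)
```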
API keys
PromptDiff uses LiteLLM under the hood, so it works with any LiteLLM-supported provider. It's built and tested primarily against free tiers:
- Gemini — get a free key at aistudio.google.com/apikey
- Groq — get a free key at console.groq.com/keys
Copy .env.example to .env in your project directory and fill in the keys you have:
GEMINI_API_KEY=your_key_here
GROQ_API_KEY=your_key_here
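Before calling run, it can help to check which provider keys are actually filled in. The following is a standalone sketch using only the standard library; how PromptDiff itself loads .env is not described here, so parse_env_file and configured_providers are hypothetical helpers, and only the two variable names come from the example above.

```python
import os
from pathlib import Path

# Variable names taken from the .env example above.
KNOWN_KEYS = {"gemini": "GEMINI_API_KEY", "groq": "GROQ_API_KEY"}


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_providers(env: dict[str, str]) -> list[str]:
    """Return providers whose key is set and is not the placeholder value."""
    return [
        provider
        for provider, var in KNOWN_KEYS.items()
        if env.get(var) and env[var] != "your_key_here"
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    file_values = parse_env_file(env_path.read_text()) if env_path.exists() else {}
    # In this sketch, real environment variables override the file.
    merged = {**file_values, **os.environ}
    print("Providers with keys:", configured_providers(merged) or "none")
```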
Commands
| Command | What it does |
|---|---|
| promptdiff project create <name> | Create a new project |
| promptdiff project list | List all projects |
| promptdiff use <project> | Set the current project (so you don't need --project on every command) |
| promptdiff add <prompt> | Add a new prompt to the current project |
| promptdiff commit <prompt> <content_or_file> -m "<message>" | Commit a new version. Accepts a literal string or a path to a file |
| promptdiff log <prompt> | Show version history |
| promptdiff diff <prompt> <v1> <v2> | Word-level diff between two versions |
| promptdiff run <prompt> <version> --model <model> | Run a version against one or more models (repeat --model to run against several at once) |
| promptdiff diff-output <prompt> <v1> <v2> [--model <model>] | Compare what two versions actually produced. If a version was run against multiple models, --model is required to avoid an ambiguous comparison |
Every command accepts -p / --project to override the current project for that one call.
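The --model rule in the diff-output row can be read as a small decision procedure. The sketch below illustrates that rule under stated assumptions: pick_model is a hypothetical function, and the exact error messages and the handling of a version with no runs are guesses, not PromptDiff's documented behavior.

```python
from __future__ import annotations


def pick_model(models_run: list[str], requested: str | None = None) -> str:
    """Choose which model's output to compare for one prompt version.

    models_run: models this version has recorded runs for.
    requested:  the value of --model, or None if the flag was omitted.
    """
    if not models_run:
        raise ValueError("version has no recorded runs")
    if requested is not None:
        if requested not in models_run:
            raise ValueError(f"no run recorded for model {requested!r}")
        return requested
    distinct = sorted(set(models_run))
    if len(distinct) > 1:
        # Ambiguous: refuse to guess which model's output to compare.
        raise ValueError("multiple models found; pass --model: " + ", ".join(distinct))
    return distinct[0]
```

With a single recorded model the flag can be omitted; with several, the function fails instead of silently comparing outputs from different models.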
Architecture
- Database: SQLite, 5 tables (projects, prompts, versions, runs, outputs) via SQLAlchemy + Alembic migrations
- Diff engine: Python's difflib, line-level structure with word-level precision inside changed lines
- Runner: LiteLLM for provider-agnostic model calls; failures (rate limits, timeouts, bad keys) are recorded as failed runs rather than crashing the CLI
- CLI: Typer + Rich
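The two storage ideas above, per-prompt version numbering and failed runs recorded as data, can be sketched with the standard library's sqlite3. This is a simplified illustration, not the real schema: the project uses SQLAlchemy and five tables, while this sketch collapses them to two, and every column name, the commit and run helpers, and the call_model callable (a stand-in for the LiteLLM call) are assumptions.

```python
import sqlite3

# Simplified, hypothetical schema: two of the five tables, invented columns.
SCHEMA = """
CREATE TABLE versions (
    id INTEGER PRIMARY KEY,
    prompt TEXT NOT NULL,
    number INTEGER NOT NULL,
    content TEXT NOT NULL,
    message TEXT,
    UNIQUE (prompt, number)
);
CREATE TABLE runs (
    id INTEGER PRIMARY KEY,
    version_id INTEGER NOT NULL REFERENCES versions(id),
    model TEXT NOT NULL,
    status TEXT NOT NULL,   -- 'ok' or 'failed'
    output TEXT,
    error TEXT
);
"""


def commit(db, prompt, content, message):
    """Insert the next version number for this prompt; rows are never updated."""
    (latest,) = db.execute(
        "SELECT COALESCE(MAX(number), 0) FROM versions WHERE prompt = ?", (prompt,)
    ).fetchone()
    db.execute(
        "INSERT INTO versions (prompt, number, content, message) VALUES (?, ?, ?, ?)",
        (prompt, latest + 1, content, message),
    )
    return latest + 1


def run(db, prompt, number, model, call_model):
    """Call the model and record the result; errors become 'failed' rows."""
    version_id, content = db.execute(
        "SELECT id, content FROM versions WHERE prompt = ? AND number = ?",
        (prompt, number),
    ).fetchone()
    try:
        output, status, error = call_model(model, content), "ok", None
    except Exception as exc:  # e.g. rate limits, timeouts, bad keys
        output, status, error = None, "failed", str(exc)
    db.execute(
        "INSERT INTO runs (version_id, model, status, output, error) VALUES (?, ?, ?, ?, ?)",
        (version_id, model, status, output, error),
    )
    return status


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    commit(db, "summarizer", "Summarize: {text}", "v1: baseline")
    print(run(db, "summarizer", 1, "fake-model", lambda model, text: text.upper()))
```

Because the exception is caught inside run and stored alongside the version and model, a failed call stays traceable in the same way a successful one is.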
Development
git clone https://github.com/Ragu3175/promptdiff.git
cd promptdiff
python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # macOS/Linux
pip install -r requirements.txt
pip install -e .
pytest -v
Status
Core CLI is complete and tested: commit, log, diff, run, and diff-output all work end-to-end against real Gemini and Groq calls. A REST API layer and web UI are planned but not yet built — the CLI is fully usable on its own today.
Contributing
Issues and PRs welcome. This is an early-stage solo project — if you run into something broken or have an idea, please open an issue.
License
MIT — see LICENSE.
Author
Built by Raguram R.
File details
Details for the file promptdiff_cli-0.1.1.tar.gz.
File metadata
- Download URL: promptdiff_cli-0.1.1.tar.gz
- Upload date:
- Size: 23.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dd4d6e6c32e9ba5ef27ecfaad7fc4cee11701ecc3d61dea27b5adbb0a348b5f8 |
| MD5 | 3330a7b6a241d6599f594be2a7b514fc |
| BLAKE2b-256 | af07ead5053d6708238fa875ecfeca3f31321c05ee9407d5bc180fd15976e8bb |
File details
Details for the file promptdiff_cli-0.1.1-py3-none-any.whl.
File metadata
- Download URL: promptdiff_cli-0.1.1-py3-none-any.whl
- Upload date:
- Size: 18.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 75950845e33607dac37fc5350c6b726fb12d4470b8f24d8f60654c21c272c59c |
| MD5 | 39ac9caab6a16ccef9174d2436836914 |
| BLAKE2b-256 | 27740efef23e5cc5d374d6a6e7d98cc9e80255891397cbed1e7c7b0251f0d123 |