Git for prompts - version, diff, and test your LLM prompts across any model

PromptDiff

Git for prompts. Version your LLM prompts, diff them word-by-word, and run any version against Gemini, Groq, or any other LiteLLM-supported model — all from the terminal.

promptdiff commit summarizer "Summarize: {text}" -m "v1: baseline"
promptdiff commit summarizer "Summarize concisely in one sentence: {text}" -m "v2: tighter constraint"
promptdiff diff summarizer 1 2
promptdiff run summarizer 2 --model gemini/gemini-2.5-flash

Why PromptDiff

If you've ever overwritten a prompt and lost the version that actually worked, or pasted two prompt drafts into a doc just to eyeball what changed — PromptDiff is the tool for that. It treats prompts as versioned, diffable artifacts the same way Git treats code, and adds one thing Git can't: running any version against a real LLM and recording exactly what it produced, so you can trace output back to the exact prompt and model that generated it.

How it's different from other prompt tools

The prompt-tooling space has grown a lot — Promptfoo is a mature, YAML-config-based open-source tool with strong CI/eval integration, and platforms like PromptLayer, Langfuse, and Braintrust offer hosted, full-lifecycle prompt management with branching, RBAC, and observability.

PromptDiff isn't trying to compete with those. It's deliberately smaller: a local-first CLI for a single developer iterating on prompts, with zero config files, zero hosted accounts, and zero cost beyond free-tier LLM API usage. If you outgrow it — multiple collaborators, non-engineer prompt editors, compliance requirements — those other tools are the right next step. If you're a solo developer who just wants git commit-style version control for prompts without standing up infrastructure, PromptDiff is built for exactly that.

Features

  • Version control for prompts: every commit is an immutable, numbered version (v1, v2, v3...) scoped per prompt
  • Word-level diffs: see exactly which words changed between versions, rendered with color in your terminal (green additions, red strikethrough removals)
  • Multi-model runs: execute any prompt version against multiple LLMs in a single command and compare cost, latency, and output side by side
  • Output comparison: diff-output shows what two versions actually produced, not just what their text looks like
  • Free-tier friendly: built and tested against the Gemini and Groq free tiers; rate-limit and quota failures are recorded as failed runs and never crash the tool
  • Local-first: everything lives in a local SQLite database, no account or server required
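
To make "immutable, numbered version scoped per prompt" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The single-table layout and the open_store and commit helpers are hypothetical illustrations, not PromptDiff's actual schema or API.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a SQLite store with a hypothetical single-table layout."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS versions (
            prompt  TEXT    NOT NULL,
            number  INTEGER NOT NULL,
            content TEXT    NOT NULL,
            message TEXT    NOT NULL,
            PRIMARY KEY (prompt, number)
        )
        """
    )
    return conn


def commit(conn, prompt, content, message):
    """Append a new version for one prompt; existing rows are never updated."""
    with conn:  # one transaction: read the latest number, then insert
        row = conn.execute(
            "SELECT COALESCE(MAX(number), 0) FROM versions WHERE prompt = ?",
            (prompt,),
        ).fetchone()
        number = row[0] + 1
        conn.execute(
            "INSERT INTO versions VALUES (?, ?, ?, ?)",
            (prompt, number, content, message),
        )
    return number
```

Because the primary key is (prompt, number), each prompt gets its own v1, v2, v3 sequence, and committing only ever inserts rows.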

Installation

pip install promptdiff-cli

(Or, to run from source — see Development below.)

Quick start

# Create a project and switch to it
promptdiff project create my-app
promptdiff use my-app

# Create a prompt and commit your first version
promptdiff add summarizer
promptdiff commit summarizer "Summarize this in one sentence: {text}" -m "v1: baseline"

# Edit and commit a new version
promptdiff commit summarizer "Summarize concisely, max 20 words: {text}" -m "v2: tighter constraint"

# See version history
promptdiff log summarizer

# See exactly what changed
promptdiff diff summarizer 1 2

# Run a version against a model (needs an API key — see below)
promptdiff run summarizer 2 --model gemini/gemini-2.5-flash

# Run the same version against multiple models at once
promptdiff run summarizer 2 --model gemini/gemini-2.5-flash --model groq/llama-3.3-70b-versatile

# Compare what two versions actually produced
promptdiff diff-output summarizer 1 2 --model gemini/gemini-2.5-flash
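
Conceptually, a multi-model run like the one above is a loop that records one result per model, whether the call succeeds or fails. The sketch below illustrates that idea only; RunRecord, run_against_models, and the injected call_model function are hypothetical names, and in a real setup call_model would wrap a provider call such as LiteLLM's completion function.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RunRecord:
    model: str
    ok: bool
    output: Optional[str]
    error: Optional[str]
    latency_s: float


def run_against_models(
    prompt_text: str,
    models: List[str],
    call_model: Callable[[str, str], str],
) -> List[RunRecord]:
    """Call every model once and record success or failure for each."""
    records = []
    for model in models:
        start = time.perf_counter()
        try:
            output = call_model(model, prompt_text)
            elapsed = time.perf_counter() - start
            records.append(RunRecord(model, True, output, None, elapsed))
        except Exception as exc:  # rate limits, timeouts, bad keys, ...
            elapsed = time.perf_counter() - start
            error = f"{type(exc).__name__}: {exc}"
            records.append(RunRecord(model, False, None, error, elapsed))
    return records
```

A failure on one model becomes a record with ok=False, so the remaining models still run and the caller can display all results side by side.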

API keys

PromptDiff uses LiteLLM under the hood, so it works with any LiteLLM-supported provider. It's built and tested primarily against the Gemini and Groq free tiers.

Copy .env.example to .env in your project directory and fill in the keys you have:

GEMINI_API_KEY=your_key_here
GROQ_API_KEY=your_key_here
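
Before running against a model, it can help to check that the matching key is set. The sketch below is a hypothetical helper, not part of PromptDiff: it assumes the provider is the part of the model name before the slash and covers only the two variables listed above.

```python
import os

# Variable names taken from the .env example; the prefix mapping is an assumption.
PROVIDER_KEYS = {"gemini": "GEMINI_API_KEY", "groq": "GROQ_API_KEY"}


def missing_key(model, env=None):
    """Return the env var name the model needs if it is unset, else None."""
    env = os.environ if env is None else env
    provider = model.split("/", 1)[0]
    var = PROVIDER_KEYS.get(provider)
    if var is None:
        return None  # provider not covered by this sketch
    return None if env.get(var) else var
```

For example, missing_key("groq/llama-3.3-70b-versatile") returns "GROQ_API_KEY" when that variable is empty or absent from the environment.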

Commands

| Command | What it does |
| --- | --- |
| `promptdiff project create <name>` | Create a new project |
| `promptdiff project list` | List all projects |
| `promptdiff use <project>` | Set the current project (so you don't need --project on every command) |
| `promptdiff add <prompt>` | Add a new prompt to the current project |
| `promptdiff commit <prompt> <content_or_file> -m "<message>"` | Commit a new version. Accepts a literal string or a path to a file |
| `promptdiff log <prompt>` | Show version history |
| `promptdiff diff <prompt> <v1> <v2>` | Word-level diff between two versions |
| `promptdiff run <prompt> <version> --model <model>` | Run a version against one or more models (repeat --model to run against several at once) |
| `promptdiff diff-output <prompt> <v1> <v2> [--model <model>]` | Compare what two versions actually produced. If a version was run against multiple models, --model is required to avoid an ambiguous comparison |

Every command accepts -p / --project to override the current project for that one call.
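
The --model rule for diff-output can be read as a small resolution step: decide which model's outputs to compare, and refuse to guess when there is more than one candidate. The function below is a hypothetical sketch of that rule, not PromptDiff's code; in particular, raising an error when the two versions share no model is an assumption.

```python
def pick_model(models_v1, models_v2, requested=None):
    """Choose the model whose outputs are compared across two versions."""
    shared = sorted(set(models_v1) & set(models_v2))
    if requested is not None:
        if requested not in shared:
            raise ValueError(f"both versions need a run against {requested}")
        return requested
    if len(set(models_v1)) > 1 or len(set(models_v2)) > 1:
        # A version was run against several models: do not guess.
        raise ValueError("ambiguous comparison: pass --model")
    if not shared:
        raise ValueError("the two versions have no run against a common model")
    return shared[0]
```

With one run per version on the same model, the model is inferred; as soon as either version has runs on several models, the caller has to name one.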

Architecture

  • Database: SQLite, 5 tables (projects, prompts, versions, runs, outputs) via SQLAlchemy + Alembic migrations
  • Diff engine: Python's difflib, line-level structure with word-level precision inside changed lines
  • Runner: LiteLLM for provider-agnostic model calls; failures (rate limits, timeouts, bad keys) are recorded as failed runs rather than crashing the CLI
  • CLI: Typer + Rich
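
As an illustration of the diff engine bullet, here is a minimal word-level diff built on difflib.SequenceMatcher. It is a sketch under simplifying assumptions: word_diff is a hypothetical name, it splits on whitespace, and it handles a single line of text, whereas the description above says the real engine first aligns lines and then diffs words inside changed lines.

```python
import difflib


def word_diff(old, new):
    """Return (tag, word) pairs: ' ' unchanged, '-' removed, '+' added."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops += [(" ", word) for word in a[i1:i2]]
        else:  # 'replace', 'delete', or 'insert'
            ops += [("-", word) for word in a[i1:i2]]
            ops += [("+", word) for word in b[j1:j2]]
    return ops


if __name__ == "__main__":
    for tag, word in word_diff(
        "Summarize: {text}",
        "Summarize concisely in one sentence: {text}",
    ):
        print(tag, word)
```

A terminal renderer could then map '+' to green and '-' to red strikethrough, as described in the features list.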

Development

git clone https://github.com/Ragu3175/promptdiff.git
cd promptdiff
python -m venv venv
venv\Scripts\activate   # Windows
# source venv/bin/activate   # macOS/Linux
pip install -r requirements.txt
pip install -e .
pytest -v

Status

Core CLI is complete and tested: commit, log, diff, run, and diff-output all work end-to-end against real Gemini and Groq calls. A REST API layer and web UI are planned but not yet built — the CLI is fully usable on its own today.

Contributing

Issues and PRs welcome. This is an early-stage solo project — if you run into something broken or have an idea, please open an issue.

License

MIT — see LICENSE.

Author

Built by Raguram R.
