
prompt-git-manager

Git-native prompt version control & CI guardrail tool



Why prompt-git-manager?

The Problem

Prompt engineering is becoming critical to AI applications, but managing prompts is chaotic:

  • 🔀 No version control: Prompts scattered in code, docs, and chat logs
  • 📊 No metrics: Can't measure if changes improve or degrade performance
  • 🚫 No guardrails: Breaking changes ship without detection
  • 🔍 No diff tools: Text diff is meaningless for structured prompts

The Solution

prompt-git-manager brings software engineering best practices to prompt management:

| Feature | Traditional Approach | prompt-git-manager |
|---|---|---|
| Version Control | Copy-paste in docs | Git-native commits |
| Change Detection | Manual review | Semantic diff |
| Quality Gates | Hope for the best | Automated evaluation |
| Rollback | "What was the old prompt?" | git checkout |

Key Differentiators

  • Zero Infrastructure: No servers, no databases, no SaaS dependencies
  • Git Native: Prompts are files, versions are commits
  • CI First: Built for GitHub Actions, pre-commit, and PR workflows
  • Offline Capable: Works without LLM API access (rule-based evaluation)
  • LLM Enhanced: Optional LLM-as-judge evaluation with multi-provider support (OpenAI, Anthropic, local models)

Documentation

📖 Full Documentation

| Document | Description |
|---|---|
| Quick Start | Get started in 5 minutes |
| CLI Reference | Detailed reference for all commands |
| Prompt Schema | Prompt file format spec |
| Evaluation Guide | Complete evaluation reference |
| Dataset Guide | Create evaluation datasets |
| Configuration | Configuration reference |
| Python API | Programmatic usage |
| Architecture | Internal implementation |
| Best Practices | Prompt management tips |
| Migration Guide | Migrate from other tools |
| Troubleshooting | Common issues & solutions |
| Roadmap | Future plans |

Installation

Using uv (Recommended)

uv pip install prompt-git-manager

Using pip

pip install prompt-git-manager

From Source

git clone https://github.com/ChanChiChoi/prompt-git-manager.git
cd prompt-git-manager
uv sync

Verify Installation

pg --version
# prompt-git-manager 0.2.1

Quick Start

1. Initialize a Project

cd your-project
pg init

This creates:

.prompts/
├── config.json      # Project settings
└── .gitignore       # Internal files

2. Add Your Prompts

# Create a prompt file
cat > qa_prompt.yaml << 'EOF'
name: qa-assistant
version: "1.0.0"
system_prompt: "You are a helpful assistant."
user_template: "Answer: {{question}}"
variables:
  question:
    type: string
    default: "What is Python?"
constraints:
  - Be concise
  - Use examples
metadata:
  author: your-name
EOF

# Add to tracking
pg add qa_prompt.yaml
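
Internally, a tracked prompt's {{variables}} are filled from their declared defaults. A minimal Python sketch of that substitution (illustrative only, not the package's own rendering code):

```python
import re

def render_user_message(prompt: dict, **overrides) -> str:
    """Fill {{variables}} in user_template from declared defaults/overrides."""
    values = {name: spec.get("default", "")
              for name, spec in prompt.get("variables", {}).items()}
    values.update(overrides)
    # Replace each {{name}} with its value; unknown names are left untouched
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: str(values.get(m.group(1), m.group(0))),
                  prompt["user_template"])

prompt = {
    "name": "qa-assistant",
    "user_template": "Answer: {{question}}",
    "variables": {"question": {"type": "string", "default": "What is Python?"}},
}
print(render_user_message(prompt))                        # uses the default
print(render_user_message(prompt, question="What is Git?"))
```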

3. Commit Changes

pg commit -m "Initial QA prompt"

4. Review Changes

# Make some changes to the prompt...
vim .prompts/qa_prompt.yaml

# See semantic diff
pg diff --semantic

5. Evaluate Against Dataset

# Create a test dataset
mkdir -p fixtures
cat > fixtures/dataset.jsonl << 'EOF'
{"input": "What is Python?", "expected_output": "Python is a programming language"}
{"input": "What is Git?", "expected_output": "Git is a version control system"}
EOF

# Run evaluation
pg eval --dataset fixtures/dataset.jsonl --threshold 0.05
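
The dataset file can be sanity-checked before running pg eval. A hedged sketch of a JSONL loader that enforces the required keys (the loader name and error format are assumptions, not part of the tool):

```python
import json

def load_dataset(path: str) -> list[dict]:
    """Read a JSONL dataset, skipping blank lines and checking required keys."""
    samples = []
    with open(path) as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            sample = json.loads(line)
            for key in ("input", "expected_output"):
                if key not in sample:
                    raise ValueError(f"line {lineno}: missing {key!r}")
            samples.append(sample)
    return samples
```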

Commands

pg init

Initialize prompt-git-manager in your repository.

pg init [--dry-run]

pg add

Add a prompt file to version tracking.

pg add <file> [--dry-run]

Supported formats: YAML (.yaml, .yml), JSON (.json)

Required fields:

  • name: Prompt identifier
  • system_prompt: System message
  • user_template: User message template with {{variables}}
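
These requirements amount to a small schema; a hypothetical validator using a plain dataclass (the real package uses Pydantic models, which this does not reproduce):

```python
from dataclasses import dataclass, field

@dataclass
class PromptFile:
    """Minimal shape of a tracked prompt: name, system_prompt and
    user_template are required, everything else is optional."""
    name: str
    system_prompt: str
    user_template: str
    version: str = "1.0.0"
    variables: dict = field(default_factory=dict)
    constraints: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

def validate(data: dict) -> PromptFile:
    missing = [k for k in ("name", "system_prompt", "user_template")
               if k not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return PromptFile(**data)
```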

pg commit

Commit prompt changes with structured metadata.

pg commit -m "message" [--dry-run]

Generates a commit record:

{
  "hash": "abc123",
  "timestamp": "2026-05-04T10:30:00",
  "changed_files": [".prompts/qa_prompt.yaml"],
  "validation_status": "pass",
  "message": "Update QA prompt"
}

pg diff

Show differences between prompt versions.

pg diff [file] [--semantic] [--json]

Semantic Analysis:

  • Variable changes ({{old_var}} → {{new_var}})
  • Constraint changes (added/removed rules)
  • Tone shifts (formal ↔ casual)
  • Role shifts (assistant persona changes)

Risk Levels:

  • 🟢 LOW: Minor changes, no semantic impact
  • 🟡 MEDIUM: Constraint or tone changes
  • 🔴 HIGH: Role or variable removal
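
The tiers above can be thought of as a priority map over detected change types; a sketch with illustrative change-type names (not the tool's internal vocabulary):

```python
def risk_level(changes: set[str]) -> str:
    """Map detected semantic change types to a risk tier (highest wins)."""
    high = {"role_shift", "variable_removed"}
    medium = {"constraint_added", "constraint_removed", "tone_shift"}
    if changes & high:
        return "HIGH"
    if changes & medium:
        return "MEDIUM"
    return "LOW"
```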

pg eval

Evaluate prompts against a dataset.

# Rule-based evaluation (offline, no LLM dependency)
pg eval --dataset <file.jsonl> [--threshold 0.05] [--json]

# LLM-enhanced evaluation
pg eval --dataset <file.jsonl> --provider openai --model gpt-4

# LLM-as-judge evaluation (more accurate scoring)
pg eval --dataset <file.jsonl> --provider openai --model gpt-4 --judge

# Compare multiple models
pg eval --dataset <file.jsonl> --compare-models gpt-3.5,gpt-4

Dataset Format:

{"input": "question", "expected_output": "answer", "metadata": {}}

Metrics:

  • accuracy_delta: Change in accuracy (-1 to +1)
  • token_cost_delta: Change in token usage
  • consistency_score: Agreement between versions (0-1)
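
The first and third metrics are straightforward to compute from paired runs; a sketch under the assumption that each sample yields a pass/fail flag and a raw output string:

```python
def accuracy_delta(old: list[bool], new: list[bool]) -> float:
    """Change in pass rate between two runs over the same samples (-1..+1)."""
    return sum(new) / len(new) - sum(old) / len(old)

def consistency_score(old: list[str], new: list[str]) -> float:
    """Fraction of samples where both versions produce identical output (0..1)."""
    return sum(a == b for a, b in zip(old, new)) / len(old)
```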

LLM Providers:

  • OpenAI (gpt-3.5, gpt-4, etc.)
  • Anthropic (claude-2, claude-3, etc.)
  • Ollama (local models)
  • Any LiteLLM-supported provider

pg ci init

Generate CI/CD configuration files.

pg ci init [--dry-run]

Generates:

  • .github/workflows/prompt-guard.yml - GitHub Actions workflow
  • .pre-commit-config.yaml - Pre-commit hooks
  • scripts/bump_version.sh - Version management

CI Integration

GitHub Actions

Automatic Setup

pg ci init

Manual Setup

Create .github/workflows/prompt-guard.yml:

name: Prompt Guard

on:
  pull_request:
    paths:
      - '.prompts/**'

jobs:
  prompt-guard:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: actions/setup-python@v5
        with:
          python-version: '3.10'

      - name: Install prompt-git-manager
        run: pip install prompt-git-manager

      - name: Run diff
        run: pg diff --semantic --json > diff.json

      - name: Run evaluation
        run: pg eval --dataset fixtures/dataset.jsonl --threshold 0.05

      - name: Comment PR
        if: failure()
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const diff = fs.readFileSync('diff.json', 'utf8');
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: `## ❌ Prompt Guard Failed\n\n\`\`\`json\n${diff}\n\`\`\``
            });

Pre-commit Hooks

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: prompt-diff
        name: Prompt Diff Check
        entry: pg diff --fail-on=high
        language: system
        files: '\.prompts/.*\.ya?ml$'
        pass_filenames: false

Install hooks:

pre-commit install

Local CI Script

#!/bin/bash
# scripts/ci_check.sh

set -e

echo "Running prompt checks..."

# Run diff
pg diff --semantic --json > diff.json

# Run evaluation
pg eval --dataset fixtures/dataset.jsonl --threshold 0.05 --json > eval.json

echo "All checks passed!"
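
To make the script fail the build on a regression, the JSON report can be gated against the threshold. A hypothetical sketch (the accuracy_delta key is assumed from the metrics above; the actual eval.json layout may differ):

```python
import json

def check_eval(path: str, threshold: float = 0.05) -> bool:
    """Return True if the accuracy regression stays within the threshold."""
    with open(path) as f:
        report = json.load(f)
    delta = report.get("accuracy_delta", 0.0)  # assumed key name
    if delta < -threshold:
        print(f"FAIL: accuracy dropped {abs(delta):.1%} (> {threshold:.0%})")
        return False
    print(f"PASS: accuracy_delta={delta:+.1%}")
    return True

# In CI: sys.exit(0 if check_eval("eval.json") else 1)
```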

Configuration

Project Config

.prompts/config.json:

{
  "version": "0.1.0",
  "eval_threshold": 0.05,
  "model_provider": "openai",
  "default_model": "gpt-3.5-turbo",
  "auto_validate": true
}

Environment Variables

| Variable | Description | Default |
|---|---|---|
| PROMPT_GIT_MODEL | LLM model for evaluation | none |
| PROMPT_GIT_THRESHOLD | Default eval threshold | 0.05 |
| OPENAI_API_KEY | OpenAI API key | - |
| ANTHROPIC_API_KEY | Anthropic API key | - |
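
A plausible way these sources combine is defaults < config.json < environment, with the environment winning; a sketch of that precedence (an assumption, not documented behavior):

```python
import json
import os

DEFAULTS = {"eval_threshold": 0.05, "default_model": None}

def load_config(path: str = ".prompts/config.json") -> dict:
    """Merge built-in defaults, project config, and environment overrides."""
    config = dict(DEFAULTS)
    if os.path.exists(path):
        with open(path) as f:
            config.update(json.load(f))
    # Environment variables override the project config
    if os.environ.get("PROMPT_GIT_THRESHOLD"):
        config["eval_threshold"] = float(os.environ["PROMPT_GIT_THRESHOLD"])
    if os.environ.get("PROMPT_GIT_MODEL"):
        config["default_model"] = os.environ["PROMPT_GIT_MODEL"]
    return config
```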

Benchmark

Performance

| Operation | Time | Notes |
|---|---|---|
| pg init | <100ms | Creates directory structure |
| pg add | <200ms | Validates + copies file |
| pg commit | <500ms | Git commit + record |
| pg diff | <300ms | Structured diff analysis |
| pg eval (20 samples) | <1s | Rule-based evaluation |
| pg eval (100 samples) | <5s | Rule-based evaluation |

Test Coverage

| Module | Coverage |
|---|---|
| cli.py | 42% |
| schema.py | 90% |
| diff_engine.py | 90% |
| evaluator.py | 99% |
| ci_gen.py | - |
| Total | 74% |

Test Count

| Suite | Tests |
|---|---|
| test_cli.py | 14 |
| test_diff.py | 29 |
| test_eval.py | 33 |
| test_ci_gen.py | 40+ |
| Total | 116+ |

Architecture

prompt-git-manager/
├── src/promptgit/
│   ├── __init__.py          # Version
│   ├── cli.py               # Typer CLI entry point
│   ├── schema.py            # Pydantic models
│   ├── diff_engine.py       # Semantic diff engine
│   ├── evaluator.py         # Dataset evaluation
│   ├── ci_gen.py            # CI/CD generator
│   └── utils.py             # Git + Rich helpers
├── tests/
│   ├── conftest.py          # Fixtures
│   ├── test_cli.py
│   ├── test_diff.py
│   ├── test_eval.py
│   └── test_ci_gen.py
├── fixtures/
│   ├── dataset.jsonl        # Test dataset
│   └── prompts/             # Edge case prompts
├── examples/
│   ├── customer_service.yaml
│   ├── code_generation.yaml
│   └── data_extraction.yaml
├── docs/
│   ├── cli_reference.md
│   └── architecture.md
└── .github/
    └── workflows/
        ├── prompt-guard.yml
        └── publish.yml

Contributing

Development Setup

# Clone repository
git clone https://github.com/ChanChiChoi/prompt-git-manager.git
cd prompt-git-manager

# Install with dev dependencies
uv sync --extra dev

# Run tests
uv run pytest

# Run with coverage
uv run pytest --cov=promptgit --cov-report=html

Code Style

  • Python 3.10+ with type hints
  • Pydantic for data validation
  • Typer for CLI
  • Rich for terminal output
  • pytest for testing

Pull Request Process

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Running Checks

# Type checking
uv run mypy src/

# Linting
uv run ruff check src/

# Format
uv run ruff format src/

# All tests
uv run pytest -v

Releasing

# Bump version
./scripts/bump_version.sh patch  # or minor, major

# Push with tags
git push && git push --tags

# GitHub Action will publish to PyPI automatically

Quick PR Demo with gh CLI

Create a PR with Prompt Changes

# 1. Create feature branch
git checkout -b feature/update-qa-prompt

# 2. Make prompt changes
vim .prompts/qa_prompt.yaml

# 3. Commit with prompt-git-manager
pg commit -m "Improve QA prompt accuracy"

# 4. Push branch
git push -u origin feature/update-qa-prompt

# 5. Create PR with gh CLI
gh pr create \
  --title "Improve QA prompt accuracy" \
  --body "$(cat <<EOF
## Summary
- Updated system prompt for better context understanding
- Added few-shot examples to user template
- Adjusted constraints for more consistent outputs

## Prompt Diff
$(pg diff --semantic)

## Evaluation Results
$(pg eval --dataset fixtures/dataset.jsonl --json)

## Checklist
- [x] Semantic diff reviewed
- [x] Evaluation passed (threshold: 5%)
- [ ] Team review
EOF
)"

# 6. View PR
gh pr view --web

Check PR Status

# List open PRs
gh pr list

# View specific PR
gh pr view 42

# Check CI status
gh pr checks 42

# Merge when ready
gh pr merge 42 --squash

FAQ

Q: Why not use Langfuse/Weights & Biases?

A: Those are great runtime monitoring tools. prompt-git-manager focuses on development-time workflow:

  • Git-native (no new tool to learn)
  • CI-first (catches issues before deploy)
  • Zero infrastructure (no servers to maintain)

Q: Can I use this with private prompts?

A: Yes! Prompts stay in your private Git repo. No data is sent externally unless you enable LLM evaluation.

Q: How does rule-based evaluation work?

A: Without LLM APIs, we use keyword matching and text similarity as heuristics. It's less accurate but:

  • Works offline
  • No API costs
  • Fast execution
  • Deterministic results
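
One way to picture these heuristics is a blend of keyword overlap and character-level similarity; a sketch of the general approach (the 50/50 weighting is illustrative, not the package's actual scoring):

```python
from difflib import SequenceMatcher

def rule_based_score(output: str, expected: str) -> float:
    """Blend keyword overlap with character-level similarity (0..1)."""
    out_words = set(output.lower().split())
    exp_words = set(expected.lower().split())
    keyword_overlap = len(out_words & exp_words) / max(len(exp_words), 1)
    char_similarity = SequenceMatcher(None, output.lower(),
                                      expected.lower()).ratio()
    return 0.5 * keyword_overlap + 0.5 * char_similarity
```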

Q: What prompt formats are supported?

A: YAML and JSON with this structure:

name: string
version: string
system_prompt: string
user_template: string  # with {{variables}}
messages: []           # optional, multi-turn history [{role, content}]
variables: {}
constraints: []
metadata: {}

License

MIT License - see LICENSE for details.


Acknowledgments


Made with ❤️ for the AI engineering community
