# prompt-git-manager

Git-native prompt version control & CI guardrail tool
Installation • Quick Start • Commands • CI Integration • Contributing • 中文
## Why prompt-git-manager?

### The Problem

Prompt engineering is becoming critical to AI applications, but managing prompts is chaotic:
- 🔀 No version control: Prompts scattered in code, docs, and chat logs
- 📊 No metrics: Can't measure if changes improve or degrade performance
- 🚫 No guardrails: Breaking changes ship without detection
- 🔍 No diff tools: Text diff is meaningless for structured prompts
### The Solution
prompt-git-manager brings software engineering best practices to prompt management:
| Feature | Traditional Approach | prompt-git-manager |
|---|---|---|
| Version Control | Copy-paste in docs | Git-native commits |
| Change Detection | Manual review | Semantic diff |
| Quality Gates | Hope for the best | Automated evaluation |
| Rollback | "What was the old prompt?" | git checkout |
### Key Differentiators
- Zero Infrastructure: No servers, no databases, no SaaS dependencies
- Git Native: Prompts are files, versions are commits
- CI First: Built for GitHub Actions, pre-commit, and PR workflows
- Offline Capable: Works without LLM API access (rule-based evaluation)
## Installation

### Using uv (Recommended)

```bash
uv pip install prompt-git-manager
```

### Using pip

```bash
pip install prompt-git-manager
```

### From Source

```bash
git clone https://github.com/ChanChiChoi/prompt-git-manager.git
cd prompt-git-manager
uv sync
```

### Verify Installation

```bash
pg --version
# prompt-git-manager 0.1.0
```
## Quick Start

### 1. Initialize a Project

```bash
cd your-project
pg init
```

This creates:

```text
.prompts/
├── config.json   # Project settings
└── .gitignore    # Internal files
```
### 2. Add Your Prompts

```bash
# Create a prompt file
cat > qa_prompt.yaml << 'EOF'
name: qa-assistant
version: "1.0.0"
system_prompt: "You are a helpful assistant."
user_template: "Answer: {{question}}"
variables:
  question:
    type: string
    default: "What is Python?"
constraints:
  - Be concise
  - Use examples
metadata:
  author: your-name
EOF

# Add to tracking
pg add qa_prompt.yaml
```
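Under the hood, `pg add` validates the prompt before tracking it. The gist can be sketched in a few lines of Python; this is an illustrative check, not the tool's actual Pydantic schema, and `validate_prompt` is a hypothetical helper (it uses the JSON prompt format so only the standard library is needed):

```python
import json

REQUIRED_FIELDS = ("name", "system_prompt", "user_template")

def validate_prompt(data: dict) -> list:
    """Collect validation errors for a parsed prompt document."""
    errors = [f"missing required field: {f}" for f in REQUIRED_FIELDS if f not in data]
    template = data.get("user_template", "")
    # Every declared variable should appear in user_template as {{name}}
    for var in data.get("variables", {}):
        if "{{%s}}" % var not in template:
            errors.append(f"variable not used in user_template: {var}")
    return errors

# JSON prompts are also supported, so the sketch parses JSON from the stdlib
prompt = json.loads("""{
  "name": "qa-assistant",
  "system_prompt": "You are a helpful assistant.",
  "user_template": "Answer: {{question}}",
  "variables": {"question": {"type": "string"}}
}""")
print(validate_prompt(prompt))  # []
```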
### 3. Commit Changes

```bash
pg commit -m "Initial QA prompt"
```

### 4. Review Changes

```bash
# Make some changes to the prompt...
vim .prompts/qa_prompt.yaml

# See the semantic diff
pg diff --semantic
```

### 5. Evaluate Against Dataset

```bash
# Create a test dataset
cat > fixtures/dataset.jsonl << 'EOF'
{"input": "What is Python?", "expected_output": "Python is a programming language"}
{"input": "What is Git?", "expected_output": "Git is a version control system"}
EOF

# Run evaluation
pg eval --dataset fixtures/dataset.jsonl --threshold 0.05
```
## Commands

### pg init

Initialize prompt-git-manager in your repository.

```bash
pg init [--dry-run]
```

### pg add

Add a prompt file to version tracking.

```bash
pg add <file> [--dry-run]
```

Supported formats: YAML (`.yaml`, `.yml`), JSON (`.json`)

Required fields:
- `name`: Prompt identifier
- `system_prompt`: System message
- `user_template`: User message template with `{{variables}}`
### pg commit

Commit prompt changes with structured metadata.

```bash
pg commit -m "message" [--dry-run]
```

Generates a commit record:

```json
{
  "hash": "abc123",
  "timestamp": "2024-01-15T10:30:00",
  "changed_files": [".prompts/qa_prompt.yaml"],
  "validation_status": "pass",
  "message": "Update QA prompt"
}
```
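A record of this shape needs nothing beyond the standard library; here is a minimal sketch of how one might be assembled (`make_commit_record` is a hypothetical helper for illustration, not part of the CLI):

```python
import hashlib
import json
from datetime import datetime

def make_commit_record(message: str, changed_files: list, contents: bytes) -> dict:
    """Assemble a commit record shaped like the example above."""
    return {
        # Short content hash, similar in spirit to an abbreviated git hash
        "hash": hashlib.sha256(contents).hexdigest()[:7],
        "timestamp": datetime.now().isoformat(timespec="seconds"),
        "changed_files": changed_files,
        "validation_status": "pass",
        "message": message,
    }

record = make_commit_record("Update QA prompt", [".prompts/qa_prompt.yaml"], b"prompt body")
print(json.dumps(record, indent=2))
```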
### pg diff

Show differences between prompt versions.

```bash
pg diff [file] [--semantic] [--json]
```

Semantic analysis:
- Variable changes (`{{old_var}}` → `{{new_var}}`)
- Constraint changes (added/removed rules)
- Tone shifts (formal ↔ casual)
- Role shifts (assistant persona changes)

Risk levels:
- 🟢 LOW: Minor changes, no semantic impact
- 🟡 MEDIUM: Constraint or tone changes
- 🔴 HIGH: Role or variable removal
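Variable-removal detection is easy to reason about: extract the `{{…}}` names from both versions and compare the sets. A toy classifier for the scale above, assuming (this part is a guess, not the actual diff engine's rule) that an added variable also warrants MEDIUM:

```python
import re

VAR_RE = re.compile(r"\{\{(\w+)\}\}")

def template_risk(old_template: str, new_template: str) -> str:
    """Map a template change onto the LOW/MEDIUM/HIGH scale above (toy heuristic)."""
    old_vars = set(VAR_RE.findall(old_template))
    new_vars = set(VAR_RE.findall(new_template))
    if old_vars - new_vars:
        return "HIGH"    # a variable disappeared: callers will break
    if new_vars - old_vars:
        return "MEDIUM"  # a new variable: callers must now supply it
    return "LOW"         # wording-only change

print(template_risk("Answer: {{question}}", "Reply: {{question}}"))  # LOW
print(template_risk("{{question}} {{context}}", "{{question}}"))     # HIGH
```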
### pg eval

Evaluate prompts against a dataset.

```bash
pg eval --dataset <file.jsonl> [--threshold 0.05] [--json]
```

Dataset format:

```json
{"input": "question", "expected_output": "answer", "metadata": {}}
```

Metrics:
- `accuracy_delta`: Change in accuracy (-1 to +1)
- `token_cost_delta`: Change in token usage
- `consistency_score`: Agreement between versions (0-1)
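`accuracy_delta` is simply the difference of per-dataset accuracies. The sketch below also shows one way a `--threshold` gate could work; the gating rule (fail when accuracy drops by more than the threshold) is an assumption about the CLI's semantics:

```python
def accuracy(results: list) -> float:
    """Fraction of dataset samples judged correct."""
    return sum(results) / len(results)

def accuracy_delta(old: list, new: list) -> float:
    """Change in accuracy between prompt versions, in [-1, +1]."""
    return accuracy(new) - accuracy(old)

baseline  = [True, True, False, False]  # old prompt: 2/4 correct
candidate = [True, True, True, False]   # new prompt: 3/4 correct

delta = accuracy_delta(baseline, candidate)
print(delta)  # 0.25

# A threshold gate fails the check when accuracy drops by more than 0.05
assert delta >= -0.05, "accuracy regression exceeds threshold"
```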
### pg ci init

Generate CI/CD configuration files.

```bash
pg ci init [--dry-run]
```

Generates:
- `.github/workflows/prompt-guard.yml` - GitHub Actions workflow
- `.pre-commit-config.yaml` - Pre-commit hooks
- `scripts/bump_version.sh` - Version management
## CI Integration

### GitHub Actions

#### Automatic Setup

```bash
pg ci init
```

#### Manual Setup

Create `.github/workflows/prompt-guard.yml`:
```yaml
name: Prompt Guard

on:
  pull_request:
    paths:
      - '.prompts/**'

jobs:
  prompt-guard:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - name: Install prompt-git-manager
        run: pip install prompt-git-manager
      - name: Run diff
        run: pg diff --semantic --json > diff.json
      - name: Run evaluation
        run: pg eval --dataset fixtures/dataset.jsonl --threshold 0.05
      - name: Comment PR
        if: failure()
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const diff = fs.readFileSync('diff.json', 'utf8');
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: `## ❌ Prompt Guard Failed\n\n\`\`\`json\n${diff}\n\`\`\``
            });
```
### Pre-commit Hooks

```yaml
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: prompt-diff
        name: Prompt Diff Check
        entry: pg diff --fail-on=high
        language: system
        files: '\.prompts/.*\.ya?ml$'
        pass_filenames: false
```

Install the hooks:

```bash
pre-commit install
```
### Local CI Script

```bash
#!/bin/bash
# scripts/ci_check.sh
set -e

echo "Running prompt checks..."

# Run diff
pg diff --semantic --json > diff.json

# Run evaluation
pg eval --dataset fixtures/dataset.jsonl --threshold 0.05 --json > eval.json

echo "All checks passed!"
```
## Configuration

### Project Config

`.prompts/config.json`:

```json
{
  "version": "0.1.0",
  "eval_threshold": 0.05,
  "model_provider": "openai",
  "default_model": "gpt-3.5-turbo",
  "auto_validate": true
}
```
### Environment Variables

| Variable | Description | Default |
|---|---|---|
| `PROMPT_GIT_MODEL` | LLM model for evaluation | none |
| `PROMPT_GIT_THRESHOLD` | Default eval threshold | 0.05 |
| `OPENAI_API_KEY` | OpenAI API key | - |
| `ANTHROPIC_API_KEY` | Anthropic API key | - |
## Benchmark

### Performance

| Operation | Time | Notes |
|---|---|---|
| `pg init` | <100ms | Creates directory structure |
| `pg add` | <200ms | Validates + copies file |
| `pg commit` | <500ms | Git commit + record |
| `pg diff` | <300ms | Structured diff analysis |
| `pg eval` (20 samples) | <1s | Rule-based evaluation |
| `pg eval` (100 samples) | <5s | Rule-based evaluation |
### Test Coverage
| Module | Coverage |
|---|---|
| cli.py | 42% |
| schema.py | 90% |
| diff_engine.py | 90% |
| evaluator.py | 99% |
| ci_gen.py | - |
| Total | 74% |
### Test Count
| Suite | Tests |
|---|---|
| test_cli.py | 14 |
| test_diff.py | 29 |
| test_eval.py | 33 |
| test_ci_gen.py | 40+ |
| Total | 116+ |
## Architecture

```text
prompt-git-manager/
├── src/promptgit/
│   ├── __init__.py        # Version
│   ├── cli.py             # Typer CLI entry point
│   ├── schema.py          # Pydantic models
│   ├── diff_engine.py     # Semantic diff engine
│   ├── evaluator.py       # Dataset evaluation
│   ├── ci_gen.py          # CI/CD generator
│   └── utils.py           # Git + Rich helpers
├── tests/
│   ├── conftest.py        # Fixtures
│   ├── test_cli.py
│   ├── test_diff.py
│   ├── test_eval.py
│   └── test_ci_gen.py
├── fixtures/
│   ├── dataset.jsonl      # Test dataset
│   └── prompts/           # Edge case prompts
├── examples/
│   ├── customer_service.yaml
│   ├── code_generation.yaml
│   └── data_extraction.yaml
├── docs/
│   ├── cli_reference.md
│   └── architecture.md
└── .github/
    └── workflows/
        ├── prompt-guard.yml
        └── publish.yml
```
## Contributing

### Development Setup

```bash
# Clone the repository
git clone https://github.com/ChanChiChoi/prompt-git-manager.git
cd prompt-git-manager

# Install with dev dependencies
uv sync --extra dev

# Run tests
uv run pytest

# Run with coverage
uv run pytest --cov=promptgit --cov-report=html
```
### Code Style
- Python 3.10+ with type hints
- Pydantic for data validation
- Typer for CLI
- Rich for terminal output
- pytest for testing
### Pull Request Process

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
### Running Checks

```bash
# Type checking
uv run mypy src/

# Linting
uv run ruff check src/

# Formatting
uv run ruff format src/

# All tests
uv run pytest -v
```

### Releasing

```bash
# Bump the version
./scripts/bump_version.sh patch  # or minor, major

# Push with tags
git push && git push --tags

# The GitHub Action will publish to PyPI automatically
```
## Quick PR Demo with gh CLI

### Create a PR with Prompt Changes

```bash
# 1. Create a feature branch
git checkout -b feature/update-qa-prompt

# 2. Make prompt changes
vim .prompts/qa_prompt.yaml

# 3. Commit with prompt-git-manager
pg commit -m "Improve QA prompt accuracy"

# 4. Push the branch
git push -u origin feature/update-qa-prompt

# 5. Create a PR with the gh CLI
# (unquoted EOF so the embedded $(pg ...) substitutions actually expand)
gh pr create \
  --title "Improve QA prompt accuracy" \
  --body "$(cat <<EOF
## Summary
- Updated system prompt for better context understanding
- Added few-shot examples to user template
- Adjusted constraints for more consistent outputs

## Prompt Diff
$(pg diff --semantic)

## Evaluation Results
$(pg eval --dataset fixtures/dataset.jsonl --json)

## Checklist
- [x] Semantic diff reviewed
- [x] Evaluation passed (threshold: 5%)
- [ ] Team review
EOF
)"

# 6. View the PR
gh pr view --web
```
### Check PR Status

```bash
# List open PRs
gh pr list

# View a specific PR
gh pr view 42

# Check CI status
gh pr checks 42

# Merge when ready
gh pr merge 42 --squash
```
## FAQ

**Q: Why not use Langfuse/Weights & Biases?**
A: Those are great runtime monitoring tools. prompt-git-manager focuses on development-time workflow:
- Git-native (no new tool to learn)
- CI-first (catches issues before deploy)
- Zero infrastructure (no servers to maintain)
**Q: Can I use this with private prompts?**
A: Yes! Prompts stay in your private Git repo. No data is sent externally unless you enable LLM evaluation.
**Q: How does rule-based evaluation work?**
A: Without LLM APIs, we use keyword matching and text similarity as heuristics. It's less accurate but:
- Works offline
- No API costs
- Fast execution
- Deterministic results
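A plausible shape for these heuristics, using only the standard library (an illustrative sketch, not the evaluator's actual scoring code — `text_similarity` and `keywords_covered` are hypothetical helpers):

```python
from difflib import SequenceMatcher

def text_similarity(expected: str, actual: str) -> float:
    """Deterministic similarity score in [0, 1] via difflib."""
    return SequenceMatcher(None, expected.lower(), actual.lower()).ratio()

def keywords_covered(expected: str, actual: str) -> bool:
    """Keyword matching: do all of the expected answer's longer tokens appear?"""
    keywords = {w for w in expected.lower().split() if len(w) > 3}
    return all(k in actual.lower() for k in keywords)

expected = "Python is a programming language"
actual = "Python is a popular programming language"
print(round(text_similarity(expected, actual), 2))
print(keywords_covered(expected, actual))  # True
```

Because both functions are pure text operations, the same inputs always score the same, which is what makes the offline mode deterministic.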
**Q: What prompt formats are supported?**

A: YAML and JSON with this structure:

```yaml
name: string
version: string
system_prompt: string
user_template: string  # with {{variables}}
variables: {}
constraints: []
metadata: {}
```
## License

MIT License - see LICENSE for details.

## Acknowledgments
- Typer - CLI framework
- Pydantic - Data validation
- GitPython - Git integration
- Rich - Terminal formatting
Made with ❤️ for the AI engineering community