promptsmith

Structured prompt builder and version manager for LLM engineers. Typed variables, Git-friendly versioning, human-readable diffs, A/B testing, and full audit trails. Works with any LLM. Zero dependencies.


The Problem

Every LLM engineer ends up with prompts scattered across f-strings, Notion docs, and constants files. No versioning. No way to diff what changed. No audit trail of which prompt produced which output.

# the reality
prompt = f"Summarize this in {n} words: {text}"  # in utils.py
SYSTEM = "You are helpful..."                      # in constants.py  
prompt2 = "Summarize this concisely: " + text     # in api.py

promptsmith gives your prompts the same discipline as your code.


Installation

pip install promptsmith

No dependencies. Requires Python 3.8+.


Quick Start

from promptsmith import Prompt, PromptRegistry

# Define a typed prompt
prompt = Prompt(
    name="summarizer",
    template="Summarize this {content_type} in {max_words} words:\n\n{content}",
    variables={"content_type": str, "max_words": int, "content": str},
    description="General purpose summarizer",
)

# Render it — validates types before rendering
text = prompt.render(content_type="article", max_words=100, content="...")

# Save to registry
registry = PromptRegistry("./prompts")
registry.save(prompt)

# Load anywhere in your codebase
p = registry.load("summarizer")
text = p.render(content_type="email", max_words=50, content="...")

Core Concepts

Typed Variables

Variables are typed and validated before rendering — catch bugs before the LLM call:

prompt = Prompt(
    name="classifier",
    template="Classify this text as {label_a} or {label_b}:\n{text}",
    variables={
        "label_a": str,
        "label_b": str,
        "text": str,
    }
)

# Type errors caught early
prompt.render(label_a="positive", label_b="negative", text=42)
# PromptRenderError: Variable 'text' expected str, got int
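To make the validation step concrete, here is a minimal sketch of how a type check over a `variables` mapping could work. It is not promptsmith's implementation: `check_types` is a hypothetical helper, and it raises a plain `TypeError` rather than the library's `PromptRenderError`.

```python
from typing import Any, Dict


def check_types(variables: Dict[str, type], inputs: Dict[str, Any]) -> None:
    """Raise TypeError if an input is missing or has the wrong type.

    Hypothetical helper for illustration; not part of promptsmith.
    """
    missing = sorted(set(variables) - set(inputs))
    if missing:
        raise TypeError(f"Missing variables: {missing}")
    for name, expected in variables.items():
        value = inputs[name]
        if not isinstance(value, expected):
            raise TypeError(
                f"Variable '{name}' expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )


declared = {"label_a": str, "label_b": str, "text": str}

# Passes silently: every declared variable is present and is a str.
check_types(declared, {"label_a": "positive", "label_b": "negative", "text": "ok"})

# Fails before any LLM call would be made.
try:
    check_types(declared, {"label_a": "positive", "label_b": "negative", "text": 42})
except TypeError as err:
    error_message = str(err)
```

Checking before string formatting means a bad input surfaces as a clear error instead of a subtly wrong prompt.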

Versioning

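Every change creates a new version automatically. update() returns a new prompt with the version bumped (1.0.0 to 1.0.1 in the example below), and both versions remain loadable: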

p1 = registry.load("summarizer")  # 1.0.0

p2 = p1.update(
    template="Summarize this {content_type} concisely in under {max_words} words:\n\n{content}",
    changelog="Added 'concisely' — tighter outputs"
)
registry.save(p2)  # saves as 1.0.1

# Load specific version
old = registry.load("summarizer", version="1.0.0")
new = registry.load("summarizer", version="1.0.1")
new = registry.load("summarizer")  # latest
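The example above goes from 1.0.0 to 1.0.1, which looks like a patch-level bump. A minimal sketch of such a bump, assuming plain `major.minor.patch` version strings (`bump_patch` is a hypothetical name, not a promptsmith function):

```python
def bump_patch(version: str) -> str:
    """Return the next patch version for a 'major.minor.patch' string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


next_version = bump_patch("1.0.0")
```

Parsing the parts as integers matters once a component reaches two digits: the bump after 1.0.9 should be 1.0.10, not 1.1.0 or a string-sorted surprise.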

Human-Readable Diffs

print(registry.diff("summarizer", "1.0.0", "1.0.1"))
── Template ─────────────────────────────────────────
--- template (1.0.0)
+++ template (1.0.1)
@@ -1 +1 @@
-Summarize this {content_type} in {max_words} words:
+Summarize this {content_type} concisely in under {max_words} words:

── Metadata ─────────────────────────────────────────
  1.0.0 → 1.0.1
  changelog: Added 'concisely' — tighter outputs
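The template section of that output resembles a standard unified diff. As a rough sketch of how a similar diff can be produced with the standard library's `difflib` (the `template_diff` helper and its labels are assumptions for illustration, not promptsmith's code):

```python
import difflib


def template_diff(old: str, new: str, old_label: str, new_label: str) -> str:
    """Return a unified diff of two templates with no context lines."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=f"template ({old_label})",
        tofile=f"template ({new_label})",
        n=0,          # show only changed lines
        lineterm="",  # inputs have no trailing newlines
    )
    return "\n".join(lines)


old_template = "Summarize this {content_type} in {max_words} words:\n\n{content}"
new_template = (
    "Summarize this {content_type} concisely in under {max_words} words:"
    "\n\n{content}"
)

diff_text = template_diff(old_template, new_template, "1.0.0", "1.0.1")
print(diff_text)
```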

A/B Testing

import openai  # third-party package, needed only for this runner

result = registry.ab_test(
    name="summarizer",
    version_a="1.0.0",
    version_b="1.0.1",
    inputs={"content_type": "article", "max_words": 100, "content": article_text},
    runner=lambda prompt: openai.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    ).choices[0].message.content,
    scorer=lambda a, b: len(a) - len(b),  # positive = B wins (B's output is shorter)
)

result.print_comparison()
print(f"Winner: {result.winner}")
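The runner and scorer are ordinary callables, so the comparison logic can be tried offline with a stub runner. The sketch below is an assumption about how such a comparison might be wired, not promptsmith's implementation; `pick_winner` and `echo_runner` are hypothetical names, and treating a score of zero as a tie is a choice made here.

```python
from typing import Callable, Dict


def pick_winner(
    template_a: str,
    template_b: str,
    inputs: Dict[str, object],
    runner: Callable[[str], str],
    scorer: Callable[[str, str], float],
) -> str:
    """Run both templates on the same inputs and return "A", "B", or "tie"."""
    output_a = runner(template_a.format(**inputs))
    output_b = runner(template_b.format(**inputs))
    score = scorer(output_a, output_b)  # positive favours B
    if score > 0:
        return "B"
    if score < 0:
        return "A"
    return "tie"


def echo_runner(prompt: str) -> str:
    """Stub standing in for an LLM call: returns the rendered prompt."""
    return prompt


winner = pick_winner(
    "Summarize this {kind} in {n} words: {text}",
    "Summarize: {text}",
    {"kind": "article", "n": 100, "text": "..."},
    runner=echo_runner,
    scorer=lambda a, b: len(a) - len(b),  # shorter output scores higher
)
```

A stub like this is useful in unit tests: it exercises the scorer and the win/lose logic without network calls or API costs.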

Chat Models (System + User)

prompt = Prompt(
    name="assistant",
    template="Answer this question: {question}",
    system="You are a helpful assistant. Be concise.",
    variables={"question": str},
)

messages = prompt.render_messages(question="What is BPE tokenization?")
# [{"role": "system", "content": "You are..."}, {"role": "user", "content": "Answer..."}]

response = openai.chat.completions.create(model="gpt-4", messages=messages)
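For readers unfamiliar with the messages format, this is roughly the structure that the comment above describes. The sketch is a stand-alone illustration with a hypothetical `build_messages` function; it does not show how `render_messages` is implemented.

```python
from typing import Dict, List, Optional


def build_messages(
    template: str, system: Optional[str], **values: object
) -> List[Dict[str, str]]:
    """Build an OpenAI-style messages list from a template and system prompt."""
    messages: List[Dict[str, str]] = []
    if system:
        # The system message, when present, comes first.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": template.format(**values)})
    return messages


messages = build_messages(
    "Answer this question: {question}",
    "You are a helpful assistant. Be concise.",
    question="What is BPE tokenization?",
)
```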

Version History & Audit Trail

# Full history
for entry in registry.history("summarizer"):
    print(f"v{entry['version']}: {entry['changelog']} ({entry['created_at'][:10]})")

# Past A/B results
for run in registry.ab_history("summarizer"):
    print(f"{run['version_a']} vs {run['version_b']} → winner: {run['winner']}")

Storage (Git-Friendly)

prompts/
├── promptsmith.db          ← SQLite index for fast queries
├── summarizer/
│   ├── 1.0.0.json          ← full prompt definition
│   └── 1.0.1.json
└── classifier/
    └── 1.0.0.json

Commit the prompts/ directory to Git — every prompt change is tracked just like code.
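Because each version is a separate JSON file, the layout can also be inspected without the library. The sketch below finds the newest version file in a prompt's directory, assuming file names are plain `major.minor.patch` versions as in the tree above. The helper names and the JSON contents written here are made up for the demonstration.

```python
import json
import tempfile
from pathlib import Path
from typing import Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '1.0.10' into (1, 0, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_version_file(prompt_dir: Path) -> Path:
    """Return the JSON file with the highest version in prompt_dir."""
    files = list(prompt_dir.glob("*.json"))
    if not files:
        raise FileNotFoundError(f"No version files in {prompt_dir}")
    return max(files, key=lambda path: version_key(path.stem))


# Demonstration in a temporary directory that mimics prompts/summarizer/.
with tempfile.TemporaryDirectory() as root:
    prompt_dir = Path(root) / "summarizer"
    prompt_dir.mkdir()
    for version in ("1.0.0", "1.0.9", "1.0.10"):
        payload = {"name": "summarizer", "version": version}  # illustrative fields
        (prompt_dir / f"{version}.json").write_text(json.dumps(payload))
    latest_name = latest_version_file(prompt_dir).name
```

Sorting by the parsed tuple rather than by file name avoids the classic mistake where "1.0.9" sorts after "1.0.10" as a string.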


API Reference

Prompt

Prompt(
    name,           # Unique identifier
    template,       # Text with {variable} placeholders
    variables=None, # dict of name → type or PromptVariable
    version="1.0.0",
    description="",
    changelog="",
    tags=[],
    system=None,    # System prompt for chat models
    metadata={},
)
Method                      Description
render(**kwargs)            Render prompt, raises on type errors
render_messages(**kwargs)   Returns OpenAI-style messages list
update(template, ...)       Create new version with changes
validate(**kwargs)          Check inputs without rendering
to_dict() / from_dict()     Serialization
to_json() / from_json()     JSON serialization

PromptRegistry

Method                                            Description
save(prompt)                                      Save to disk + index
load(name, version=None)                          Load latest or specific version
history(name)                                     All versions with changelogs
diff(name, v_a, v_b)                              Human-readable diff
ab_test(name, v_a, v_b, inputs, runner, scorer)   A/B test two versions
list(tag=None)                                    List all prompts
names()                                           All prompt names
delete(name, version=None)                        Delete version(s)
export_all(path)                                  Export all prompts to JSON

Running Tests

pip install pytest
pytest tests/ -v

License

MIT © prabhay759
