FlowPrompt

Type-safe prompt management with automatic optimization for LLMs.

Stop guessing which prompt works. Measure it.

The only LLM framework with built-in A/B testing for prompts.



Why FlowPrompt?

Every LLM framework gives you structured outputs. Only FlowPrompt tells you which prompt actually works better.

  • A/B Testing - Statistical significance testing for prompt variants
  • Type safety - Define prompts as Python classes with full IDE support
  • Structured outputs - Automatic validation with Pydantic models
  • Multi-provider - OpenAI, Anthropic, Google, or local models via LiteLLM
  • Production-ready - Caching, tracing, cost tracking built-in
A minimal example:

from flowprompt import Prompt
from pydantic import BaseModel

class ExtractUser(Prompt):
    system: str = "Extract user info from text."
    user: str = "Text: {text}"

    class Output(BaseModel):
        name: str
        age: int

result = ExtractUser(text="John is 25").run(model="gpt-4o")
print(result.name)  # "John"
print(result.age)   # 25

Installation

pip install flowprompt-ai

Note: the package is installed as flowprompt-ai but imported as flowprompt.
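A quick way to confirm the import name after installing:

python -c "import flowprompt"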

Optional extras:

pip install flowprompt-ai[all]        # Everything
pip install flowprompt-ai[cli]        # CLI tools
pip install flowprompt-ai[tracing]    # OpenTelemetry support
pip install flowprompt-ai[multimodal] # Images, PDFs, audio, video

Features at a Glance

Feature             What it does
A/B Testing         Statistical significance testing for prompts
Structured Outputs  Type-safe responses with Pydantic validation
Multi-Provider      OpenAI, Anthropic, Google, Ollama via LiteLLM
Optimization        DSPy-style automatic prompt improvement
Caching             Reduce costs 50-90% with built-in caching
Observability       Track costs, tokens, and latency
Streaming           Real-time responses with stream() and astream()
Multimodal          Images, documents, audio, and video
YAML Prompts        Store prompts in version-controlled files

Structured Outputs

Define your expected output as a Pydantic model. FlowPrompt handles parsing and validation automatically.

from flowprompt import Prompt
from pydantic import BaseModel, Field

class SentimentAnalysis(Prompt):
    system: str = "Analyze the sentiment of the given text."
    user: str = "Text: {text}"

    class Output(BaseModel):
        sentiment: str = Field(description="positive, negative, or neutral")
        confidence: float = Field(ge=0.0, le=1.0)
        keywords: list[str]

result = SentimentAnalysis(text="I love this product!").run(model="gpt-4o")
print(result.sentiment)   # "positive"
print(result.confidence)  # 0.95
print(result.keywords)    # ["love", "product"]

Multi-Provider Support

Switch between providers with a single parameter. No code changes needed.

prompt = ExtractUser(text="John is 25")  # any Prompt instance works

# OpenAI
result = prompt.run(model="gpt-4o")

# Anthropic Claude
result = prompt.run(model="anthropic/claude-3-5-sonnet-20241022")

# Google Gemini
result = prompt.run(model="gemini/gemini-2.0-flash-exp")

# Local models via Ollama
result = prompt.run(model="ollama/llama3")

Streaming

Get real-time responses for better user experience.

prompt = ExtractUser(text="John is 25")

# Synchronous
for chunk in prompt.stream(model="gpt-4o"):
    print(chunk.delta, end="", flush=True)

# Asynchronous
async for chunk in prompt.astream(model="gpt-4o"):
    print(chunk.delta, end="", flush=True)
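The async variant has to run inside an event loop; a complete-script sketch, assuming the astream() interface shown above:

import asyncio

async def main() -> None:
    prompt = ExtractUser(text="John is 25")
    # astream() yields chunks as the model produces them
    async for chunk in prompt.astream(model="gpt-4o"):
        print(chunk.delta, end="", flush=True)

asyncio.run(main())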

Caching

Reduce API costs by caching identical requests.

from flowprompt import configure_cache, get_cache

# Enable caching with 1-hour TTL
configure_cache(enabled=True, default_ttl=3600)

# First call hits the API
result1 = MyPrompt(text="hello").run(model="gpt-4o")

# Second identical call uses cache (instant, free)
result2 = MyPrompt(text="hello").run(model="gpt-4o")

# Check performance
print(get_cache().stats)
# {'hits': 1, 'misses': 1, 'hit_rate': 0.5}

Observability

Track costs, tokens, and latency with OpenTelemetry integration.

from flowprompt import get_tracer

result = MyPrompt(text="hello").run(model="gpt-4o")

summary = get_tracer().get_summary()
print(f"Cost: ${summary['total_cost_usd']:.4f}")
print(f"Tokens: {summary['total_tokens']}")
print(f"Latency: {summary['avg_latency_ms']:.0f}ms")

Automatic Optimization

Improve prompts automatically using training data (inspired by DSPy).

from flowprompt.optimize import optimize, ExampleDataset, Example, ExactMatch

# Create training examples
dataset = ExampleDataset([
    Example(input={"text": "John is 25"}, output={"name": "John", "age": 25}),
    Example(input={"text": "Alice is 30"}, output={"name": "Alice", "age": 30}),
])

# Optimize with few-shot examples
result = optimize(
    ExtractUser,
    dataset=dataset,
    metric=ExactMatch(),
    strategy="fewshot",  # or "instruction", "optuna", "bootstrap"
)

print(f"Improved by: {result.best_score:.0%}")
OptimizedPrompt = result.best_prompt_class
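The returned class is meant as a drop-in replacement; a usage sketch, assuming it keeps ExtractUser's input fields and Output schema (an assumption, not stated above):

# Run the optimized prompt like any other Prompt subclass
user = OptimizedPrompt(text="Bob is 40").run(model="gpt-4o")
print(user.name, user.age)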

A/B Testing

Run controlled experiments to compare prompt variants with statistical significance.

from flowprompt.testing import create_simple_experiment

# Setup experiment
config, runner = create_simple_experiment(
    name="prompt_comparison",
    control_prompt=PromptV1,
    treatment_prompts=[("v2", PromptV2)],
    min_samples=100,
)

runner.start_experiment(config.id)

# Get variant for a user (sticky assignment)
variant = runner.get_variant(config.id, user_id="user123")
result = runner.run_prompt(config.id, variant.name, input_data={"text": "..."})

# Check results
summary = runner.get_summary(config.id)
if summary.winner:
    print(f"Winner: {summary.winner.name}")
    print(f"Effect: {summary.statistical_result.effect_size:+.1%}")

Multimodal Support

Work with images, documents, audio, and video.

from flowprompt.multimodal import VisionPrompt, DocumentPrompt

# Analyze images
class ImageAnalyzer(VisionPrompt):
    system: str = "Describe what you see in the image."
    user: str = "What's in this image?"

result = ImageAnalyzer().with_image("photo.jpg").run(model="gpt-4o")

# Summarize documents
class DocSummarizer(DocumentPrompt):
    system: str = "Summarize documents concisely."
    user: str = "Summarize the key points."

result = DocSummarizer().with_document("report.pdf").run(model="gpt-4o")

YAML Prompts

Store prompts in version-controlled files for team collaboration.

# prompts/extract_user.yaml
name: ExtractUser
version: "1.0.0"
system: You are a precise data extractor.
user: "Extract from: {{ text }}"
output_schema:
  type: object
  properties:
    name: { type: string }
    age: { type: integer }
  required: [name, age]
Load them in Python:

from flowprompt import load_prompt, load_prompts

# Load single prompt
ExtractUser = load_prompt("prompts/extract_user.yaml")

# Load all prompts from directory
prompts = load_prompts("prompts/")
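Loaded prompts run like class-based ones; a sketch assuming the YAML output_schema above maps to the same attribute-style results:

# A YAML-loaded prompt behaves like a Prompt subclass
result = ExtractUser(text="John is 25").run(model="gpt-4o")
print(result.name, result.age)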

CLI

Optimize prompts from the command line:

# Optimize a prompt with training examples
flowprompt optimize my_prompt.py examples.json --strategy fewshot

# Output:
# Loading prompt from my_prompt.py...
#   Found: ExtractUser
# Loading examples from examples.json...
#   Loaded 10 examples
# Evaluating baseline...
#   Baseline accuracy: 65.0%
# Optimizing with strategy='fewshot'...
# --------------------------------------------------
# OPTIMIZATION COMPLETE
# --------------------------------------------------
#   Before: 65.0% accuracy
#   After:  89.0% accuracy
#   Change: +24.0%
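The layout of examples.json is not documented here; a plausible shape, mirroring the Example(input=..., output=...) structure from the Python optimization API (an assumption):

[
  {"input": {"text": "John is 25"}, "output": {"name": "John", "age": 25}},
  {"input": {"text": "Alice is 30"}, "output": {"name": "Alice", "age": 30}}
]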

Other commands:

flowprompt init my-project       # Initialize new project
flowprompt run prompt.yaml       # Run a prompt
flowprompt test                  # Validate prompts
flowprompt stats                 # View usage statistics

Comparison

Feature             FlowPrompt   LangChain   Instructor   DSPy
A/B Testing         Yes          No          No           No
Type-safe prompts   Yes          No          Yes          No
Structured outputs  Yes          Partial     Yes          No
Auto-optimization   Yes          No          No           Yes
Multi-provider      Yes          Yes         Yes          Partial
Caching             Yes          Partial     No           No
Cost tracking       Yes          Partial     No           No
Streaming           Yes          Yes         No           No
YAML prompts        Yes          No          No           No
Import time         <100ms       ~2s         <100ms       ~6s

Documentation

Full documentation lives in the GitHub repository: https://github.com/yotambraun/flowprompt

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

git clone https://github.com/yotambraun/flowprompt.git
cd flowprompt
uv venv && uv sync --all-extras
uv run pytest

License

MIT License - see LICENSE for details.


Made with care by Yotam Braun

GitHub | PyPI | Issues
