A Python library to optimize prompt drafts using LLMs
Project description
🧠 leo-prompt-optimizer
leo-prompt-optimizer is a production-grade library and CLI tool that transforms raw prompt drafts into structured, high-performance instructions using a 9-step engineering framework.
Stop "vibes-based" prompting. Use a data-driven approach to optimize, evaluate, and benchmark your prompts across OpenAI, Groq, Anthropic, Gemini, and Mistral.
🌟 Key Features
- ⚡ Lightning Fast: Optimized for high-throughput providers like Groq for near-instant iteration.
- 📊 LLM-as-a-Judge: Built-in G-Eval metrics, hallucination detection, and schema adherence checks.
- 🖥️ Rich CLI: Beautiful terminal reports with side-by-side diffs and performance tables.
- 🧩 XML-Structured Output: Automatically reformats prompts into
<role>,<task>, and<instructions>blocks for better LLM steerability.
📦 Installation
pip install leo-prompt-optimizer
🖥️ CLI: The "Pro" Workflow
Optimize a prompt and immediately benchmark it against test cases to see if it actually performs better.
leo-prompt --prompt-file draft.txt \
--provider-name groq \
--tests tests.json \
--model your-model-id
What happens under the hood?
- Optimization: Your draft is expanded into a structured "System Prompt."
- Execution: Both the Original and Optimized prompts run against your
tests.json. - Evaluation: A "Judge" model compares the outputs and generates a performance report.
🤖 Supported Providers
| Provider | Environment Variable |
|---|---|
| Groq | GROQ_API_KEY |
| OpenAI | OPENAI_API_KEY |
| Anthropic | ANTHROPIC_API_KEY |
| Gemini | GOOGLE_API_KEY |
| Mistral | MISTRAL_API_KEY |
🔧 Python API Usage
Perfect for integrating prompt optimization into your CI/CD pipelines or internal tools.
1. Initialize Provider
from leo_prompt_optimizer import GeminiProvider, AnthropicProvider, OpenAIProvider, MistralProvider, GroqProvider, LeoOptimizer, PromptEvaluator, BatchEvaluator
# Automatically loads API keys from .env (GROQ_API_KEY, OPENAI_API_KEY, etc.)
provider = GroqProvider() # or another provider, you can specify your base_url for OpenAI is you have one as an argument
optimizer = LeoOptimizer(
provider,
default_model="your-optimizer-model-id"
)
2. Optimize a Prompt
draft = "Write a code review for this python function."
# Optimize your prompt draft
optimized = optimizer.optimize(
draft, # Required - the prompt you want to optimize
user_input_example, # Optional - user input example
llm_output_example, # Optional - output wanted example
top_instruction, # Optional - high level instructions if more specific guidance is needed
model, # Optional - specify the model you want to use
)
print(optimized)
3. Evaluate on a Single Test Input
from rich.console import Console
console = Console()
# Evaluate the optimized prompt on one input compared to the draft prompt
evaluator = PromptEvaluator(provider, optimizer.env, judge_model="your-judge-model-id")
result = evaluator.evaluate(
original_prompt=draft,
optimized_prompt=optimized,
test_input="def add(a, b): return a + b"
)
# Print the evaluation results in a clean and clear table
console.print(result.to_rich_table())
4. Evaluate on a Batch of Test Inputs
test_cases = [
"def fib(n): return n if n <= 1 else fib(n-1) + fib(n-2)",
"def fib(n): a, b = 0, 1\n for _ in range(n): a, b = b, a + b\n return a"
]
batch_evaluator = BatchEvaluator(provider, optimizer.env, judge_model="your-judge-model-id")
batch_result = batch_evaluator.run_batch(
original_prompt=draft,
optimized_prompt=optimized,
test_cases=test_cases
)
# Print the batch evaluation results in a clean and clear table
console.print(batch_result.to_rich_table())
Example tests.json format for the CLI:
[
"def fib(n): return n if n <= 1 else fib(n-1) + fib(n-2)",
"def fib(n): a, b = 0, 1\n for _ in range(n): a, b = b, a + b\n return a"
]
🧪 The Evaluation Framework
The library provides objective scores to replace subjective testing:
| Metric | Description |
|---|---|
| G-Eval (1-5) | A multi-dimensional score for coherence and instruction following. |
| Token Efficiency | Percentage of tokens saved (or added) for the structural improvement. |
| Schema Adherence | Pass/Fail check for structured outputs (JSON/Markdown). |
| Hallucination Risk | Detects if the model is fabricating facts not present in the input. |
| Hallucination Accuracy | Percentage of runs that were hallucination-free (e.g. 99.0% for 1 incident in 100 runs). |
| Total Runs | Number of test cases evaluated, giving context to all other metrics. |
📘 Optimized Format Example
Your raw drafts are transformed into high-signal instructions:
<role>You are a Senior Python Security Auditor...</role>
<task>Analyze the provided function for SQL injection vulnerabilities...</task>
<instructions>
1. Identify all string-formatting operations.
2. Check for missing parameterized queries...
</instructions>
<output-format>Return a JSON object with 'severity' and 'fix'.</output-format>
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file leo_prompt_optimizer-1.0.9.tar.gz.
File metadata
- Download URL: leo_prompt_optimizer-1.0.9.tar.gz
- Upload date:
- Size: 18.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9718d4957d8048b16da6a4e844bf36006fcf8e7b37b723bb058b831c94818a1a
|
|
| MD5 |
977aed35e0d27ecd3ee3ff0568b0c79a
|
|
| BLAKE2b-256 |
b204a1bfbc516a517cd8f47440ec49d25702ed7be5c6d21b61c8bdefbfadb9cb
|
File details
Details for the file leo_prompt_optimizer-1.0.9-py3-none-any.whl.
File metadata
- Download URL: leo_prompt_optimizer-1.0.9-py3-none-any.whl
- Upload date:
- Size: 20.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ae9a8bd0d002a026d07a99c4da3b359b4ba9c747004757a71d175e29fce8d376
|
|
| MD5 |
f86d33391dec45a63ecfb7bc059438fa
|
|
| BLAKE2b-256 |
9c06f0764e75e9e36088895888b0c20211ca7b9805ad0ce0194de9170df96c5b
|