# Promptim

Experimental prompt optimization library: a framework for optimizing prompts through multi-task evaluation and iterative improvement.
## Example

Clone the repo, then set up:

```shell
uv venv
source .venv/bin/activate
uv pip install -e .
python examples/tweet_writer/create_dataset.py
```

Then run prompt optimization:

```shell
promptim --task examples/tweet_writer/config.json --version 1
```
## Create a custom task

Currently, promptim runs over individual tasks. A task defines the dataset (with train/dev/test splits), the initial prompt, the evaluators, and other information needed to optimize your prompt:
```python
name: str  # The name of the task
description: str = ""  # A description of the task (optional)
evaluator_descriptions: dict = field(default_factory=dict)  # Descriptions of the evaluation metrics
dataset: str  # The name of the dataset to use for the task
initial_prompt: PromptConfig  # The initial prompt configuration
evaluators: list[Callable[[Run, Example], dict]]  # List of evaluation functions
system: Optional[SystemType] = None  # Optional custom function with signature (current_prompt: ChatPromptTemplate, inputs: dict) -> outputs
```
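The optional `system` field lets you override how the prompt is executed against each example. A minimal sketch of such a function, assuming a tweet-style task; `call_model` is a hypothetical stand-in for your own LLM client, and the exact prompt-rendering call may differ in your setup:

```python
def call_model(messages) -> str:
    """Hypothetical placeholder for a real LLM call; swap in your client."""
    return "First line\nSecond line"


def my_system(current_prompt, inputs: dict) -> dict:
    """Render the current prompt with this example's inputs, call a model,
    and return outputs in the shape your evaluators expect (a 'tweet' key here)."""
    messages = current_prompt.invoke(inputs)  # ChatPromptTemplate rendering
    return {"tweet": call_model(messages)}
```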
Let's walk through the example "tweet writer" task to see what's expected. First, view the config.json file:

```json
{
  "optimizer": {
    "model": {
      "model": "claude-3-5-sonnet-20241022",
      "max_tokens_to_sample": 8192
    }
  },
  "task": "examples/tweet_writer/task.py:tweet_task"
}
```
The first part contains configuration for the optimizer process. For now, this is a simple configuration for the default (and only) metaprompt optimizer. You can control which LLM is used via the model configuration.

The second part is the path to the task file itself. We will review this below.
```python
def multiple_lines(run, example):
    """Evaluate if the tweet contains multiple lines."""
    result = run.outputs.get("tweet", "")
    score = int("\n" in result)
    comment = "Pass" if score == 1 else "Fail"
    return {
        "key": "multiline",
        "score": score,
        "comment": comment,
    }
```
```python
tweet_task = dict(
    name="Tweet Generator",
    dataset="tweet-optim",
    initial_prompt={
        "identifier": "tweet-generator-example:c39837bd",
    },
    # See the starting prompt here:
    # https://smith.langchain.com/hub/langchain-ai/tweet-generator-example/c39837bd
    evaluators=[multiple_lines],
    evaluator_descriptions={
        "under_180_chars": "Checks if the tweet is under 180 characters. 1 if true, 0 if false.",
        "no_hashtags": "Checks if the tweet contains no hashtags. 1 if true, 0 if false.",
        "multiline": "Fails if the tweet is not multiple lines. 1 if true, 0 if false. 0 is bad.",
    },
)
```
We've defined a simple evaluator to check that the output spans multiple lines.
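The other metrics listed in `evaluator_descriptions` above can be written the same way. For example, a sketch of an `under_180_chars` evaluator implied by its description (the `run` and `example` objects are supplied by promptim at evaluation time):

```python
def under_180_chars(run, example):
    """Evaluate if the tweet is under 180 characters."""
    result = run.outputs.get("tweet", "")
    score = int(len(result) < 180)
    return {
        "key": "under_180_chars",
        "score": score,
        "comment": "Pass" if score == 1 else "Fail",
    }
```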
We have also selected an initial prompt to optimize. You can check this out in the hub.
By modifying the above values, you can configure your own task.
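Putting the pieces together, a custom task might look like the following sketch. The dataset name, prompt identifier, and evaluator are placeholders for your own:

```python
def has_greeting(run, example):
    """Toy evaluator: 1 if the output contains a greeting, 0 otherwise."""
    text = run.outputs.get("text", "")
    score = int("hello" in text.lower())
    return {"key": "has_greeting", "score": score}


my_task = dict(
    name="Greeting Writer",                      # display name for the task
    dataset="my-greetings-dataset",              # placeholder LangSmith dataset name
    initial_prompt={"identifier": "my-prompt-repo:latest"},  # placeholder hub prompt
    evaluators=[has_greeting],
    evaluator_descriptions={
        "has_greeting": "1 if the output greets the reader, 0 otherwise.",
    },
)
```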
## CLI Arguments

The CLI is experimental.

```shell
Usage: promptim [OPTIONS]

  Optimize prompts for different tasks.

Options:
  --version [1]                [required]
  --task TEXT                  Task to optimize. You can pick one off the
                               shelf or select a path to a config file.
                               Example: 'examples/tweet_writer/config.json'
  --batch-size INTEGER         Batch size for optimization
  --train-size INTEGER         Training size for optimization
  --epochs INTEGER             Number of epochs for optimization
  --debug                      Enable debug mode
  --use-annotation-queue TEXT  The name of the annotation queue to use. Note:
                               we will delete the queue whenever you resume
                               training (on every batch).
  --no-commit                  Do not commit the optimized prompt to the hub
  --help                       Show this message and exit.
```
We have created a few off-the-shelf tasks:
- tweet: write tweets
- simpleqa: really hard Q&A
- scone: NLI