Evals-first prompt optimization. Label examples, get better prompts.
# coaxer

Label examples. Derive the prompt. Consume it as a string.
## Motivation

Writing prompts by hand is slow, and the prose grows brittle as cases accumulate. Coaxer flips that: label examples of the behavior you want and derive the prompt from those labels. When behavior drifts, add more labels instead of rewriting prose.
Labels are the source of truth. The prompt is a build artifact.
## Install

```bash
uv add coaxer
```
## Label

One directory per record. `record.json` holds scalar fields; large text and binary inputs live as sibling files.

```
labels/repo-classification/
  _schema.json        # optional: field descriptions + types + enums
  0001/
    record.json       # {id, inputs: {readme, stars, ...}, output}
    readme.md         # large text referenced from record.json
  0002/
    ...
```
`_schema.json` is optional. Without it, field names and types are inferred from the records.

```json
{
  "inputs": {
    "readme": {"desc": "Project README markdown"},
    "stars": {"desc": "GitHub star count", "type": "int"}
  },
  "output": {
    "desc": "Curated collection vs organic project",
    "type": "enum",
    "values": ["true", "false"]
  }
}
```
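For orientation, a `record.json` under this schema might look like the sketch below. Only the `{id, inputs, output}` shape is taken from the layout above; the bare-filename reference for `readme` is a guess at the sibling-file convention, not documented behavior.

```json
{
  "id": "0001",
  "inputs": {
    "readme": "readme.md",
    "stars": 1200
  },
  "output": "false"
}
```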
## Distill

```bash
coax labels/repo-classification --out prompts/repo-classification
```

Writes four files to the output folder:

| File | Purpose |
|---|---|
| `prompt.jinja` | Human-readable Jinja template with `{{ field }}` slots. |
| `meta.json` | Compile metadata: `compiled_at`, `example_count`, `label_hash`, `schema`. |
| `dspy.json` | DSPy program state (only when `--optimizer gepa`). |
| `history.jsonl` | Append-only compile log. |
The optimizer is opt-in: `--optimizer gepa` runs DSPy 3's GEPA pass and requires an LLM credential. The default (`--optimizer none`) emits a schema-derived template and is reproducible without network access.
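Since `meta.json` records what a prompt was compiled from, it is easy to inspect before recompiling; a minimal sketch, assuming only the keys listed in the table above:

```python
import json
from pathlib import Path

# Keys per the meta.json row in the table above.
meta = json.loads(Path("prompts/repo-classification/meta.json").read_text())
print(meta["compiled_at"], meta["example_count"], meta["label_hash"])
```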
## Consume

```python
from coaxer import CoaxedPrompt

p = CoaxedPrompt("prompts/repo-classification", role="classifier")  # bind defaults
filled = p(readme=new_readme, stars=1200)  # render at call time
```
- `CoaxedPrompt(path, **bound)`: a `str` subclass; `__new__` reads `prompt.jinja`.
- `str(p)`: the raw template.
- `p(**vars)`: a Jinja2 `StrictUndefined` render; missing variables raise.
- Call-time variables override bound defaults (see the sketch below).
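A short sketch of the last two rules, continuing the snippet above (treating the bound `role` as an ordinary template variable, which the binding example implies):

```python
# Call-time value wins over the bound default ("classifier" above).
critic = p(readme=new_readme, stars=1200, role="critic")

# StrictUndefined means a missing template variable raises instead of
# rendering as empty text (jinja2.UndefinedError in stock Jinja2; how
# coaxer surfaces it is assumed, not documented here).
try:
    p(stars=1200)
except Exception as exc:
    print(exc)
```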
Because `CoaxedPrompt` is a `str`, it drops in anywhere a string is accepted: logging, OpenAI SDK messages, the Anthropic SDK, externally built DSPy signatures, and so on.
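For instance, the rendered prompt can go straight into an SDK payload; a sketch assuming the `openai` package, a configured API key, and an arbitrary model name:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # arbitrary choice for the example
    messages=[{"role": "system", "content": filled}],  # `filled` is a plain str
)
```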
## Compile LLMs

`AgentLM` routes compile calls through the Anthropic Agent SDK (Claude Code). `OpenAILM` hits any OpenAI-compatible endpoint (Ollama, vLLM, OpenAI).

```python
from coaxer import AgentLM, OpenAILM

lm = AgentLM()                 # Claude via the Agent SDK
lm = OpenAILM(model="llama3")  # Ollama
lm = OpenAILM(model="gpt-4o", base_url="https://api.openai.com/v1", api_key="sk-...")
```
Both pass keyword arguments through to their underlying client.
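If so, standard constructor options of the underlying client should survive the trip; a sketch assuming the OpenAI client's `timeout` option (and Ollama's default OpenAI-compatible endpoint) are forwarded:

```python
# `timeout` is a standard OpenAI client constructor option; that coaxer
# forwards it is inferred from the passthrough note above rather than
# separately documented.
lm = OpenAILM(model="llama3", base_url="http://localhost:11434/v1", timeout=30)
```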
## Caching

Pass a cachetta instance to file-back LM responses:

```python
from cachetta import Cachetta
from coaxer import AgentLM

cache = Cachetta(path=lambda prompt, **kw: f"cache/{prompt}.pkl", duration="7d")
lm = AgentLM(cache=cache)
```
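Raw prompt text makes an unwieldy filename, so a hash-based key may be safer in practice; the same setup as above, with a hypothetical `cache_path` helper:

```python
import hashlib

def cache_path(prompt: str, **kw) -> str:
    # Stable, filesystem-safe key derived from the prompt text.
    return f"cache/{hashlib.sha256(prompt.encode()).hexdigest()}.pkl"

lm = AgentLM(cache=Cachetta(path=cache_path, duration="7d"))
```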
Install with the cache extra: `uv add "coaxer[cache]"`.
## Development

```bash
uv sync --extra dev
uv run just test-unit  # Unit tests
uv run just ci         # Full CI (lint + format + typecheck + tests)
```
## Download files

### Source Distribution
## File details

Details for the file `coaxer-0.2.16.tar.gz`.

### File metadata

- Download URL: coaxer-0.2.16.tar.gz
- Upload date:
- Size: 136.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.2.0 CPython/3.12.13
### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `4049cc42efcd65db6c6b08e98d7c2f6750704279da92462c7c598216fe0d026f` |
| MD5 | `2d1e0503d01d8ac9b6f6735a8ef4fcb2` |
| BLAKE2b-256 | `c961dba72af339945710376811cba2e838dce8316597dcb8221c297702cfad18` |