
Evals-first prompt optimization. Label examples, get better prompts.


coaxer

Label examples. Derive the prompt. Consume it as a string.


Motivation

Writing prompts by hand is slow, and the prose grows brittle as cases accumulate. Coaxer flips this: label examples of the behavior you want, then derive the prompt from those labels. When the prompt drifts, add more labels instead of rewriting it.

Labels are the source of truth. The prompt is a build artifact.

Install

uv add coaxer

Label

One directory per record. record.json holds scalar fields; large text and binary inputs live as sibling files.

labels/repo-classification/
  _schema.json              # optional: field descriptions + types + enums
  0001/
    record.json             # {id, inputs: {readme, stars, ...}, output}
    readme.md               # large text referenced from record.json
  0002/
    ...

_schema.json is optional. Without it, field names and types are inferred from the records.

{
  "inputs": {
    "readme": {"desc": "Project README markdown"},
    "stars": {"desc": "GitHub star count", "type": "int"}
  },
  "output": {
    "desc": "Curated collection vs organic project",
    "type": "enum",
    "values": ["true", "false"]
  }
}
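A schema of this shape can also be used to sanity-check records before compiling. The following is a minimal sketch under stated assumptions, not part of coaxer: `check_record` is a hypothetical helper, and it assumes a record is a dict with the `{id, inputs, output}` layout shown above. It only checks that inputs are present, that `int` inputs are integers, and that the output is one of the enum values.

```python
import json

# Schema shaped like the _schema.json example above.
SCHEMA = json.loads("""
{
  "inputs": {
    "readme": {"desc": "Project README markdown"},
    "stars": {"desc": "GitHub star count", "type": "int"}
  },
  "output": {"desc": "Curated collection vs organic project",
             "type": "enum", "values": ["true", "false"]}
}
""")


def check_record(record: dict, schema: dict) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means consistent)."""
    problems = []
    for name, spec in schema["inputs"].items():
        if name not in record["inputs"]:
            problems.append(f"missing input: {name}")
        elif spec.get("type") == "int" and not isinstance(record["inputs"][name], int):
            problems.append(f"{name} should be an int")
    out = schema["output"]
    if out.get("type") == "enum" and record["output"] not in out["values"]:
        problems.append(f"output {record['output']!r} not in {out['values']}")
    return problems


good = {"id": "0001", "inputs": {"readme": "# Awesome list", "stars": 1200}, "output": "true"}
bad = {"id": "0002", "inputs": {"readme": "# Tool", "stars": "many"}, "output": "maybe"}
print(check_record(good, SCHEMA))
print(check_record(bad, SCHEMA))
```

Running a check like this over every record directory catches mislabeled outputs early, before they influence the derived prompt.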

Distill

coax labels/repo-classification --out prompts/repo-classification

Writes up to four files to the output folder (dspy.json appears only with --optimizer gepa):

File           Purpose
prompt.jinja   Human-readable Jinja template with {{ field }} slots.
meta.json      Compile metadata: compiled_at, example_count, label_hash, schema.
dspy.json      DSPy program state (only when --optimizer gepa).
history.jsonl  Append-only compile log.

The optimizer is opt-in. --optimizer gepa runs DSPy 3's GEPA pass and requires an LLM credential. The default (--optimizer none) emits a schema-derived template and is reproducible without network access.
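Because the prompt is a build artifact, a build script may want to know whether it is out of date. The sketch below is one rough way to check, under stated assumptions: `count_records` and `is_stale` are hypothetical helpers, not coaxer functions, and the sketch assumes that `example_count` in meta.json equals the number of record directories. It does not use `label_hash`, since the source does not describe how that hash is computed.

```python
import json
import tempfile
from pathlib import Path


def count_records(labels_dir: Path) -> int:
    """Count record directories, i.e. entries that contain a record.json."""
    return sum(1 for p in labels_dir.iterdir() if (p / "record.json").is_file())


def is_stale(labels_dir: Path, out_dir: Path) -> bool:
    """True when meta.json is missing or its example_count differs from the labels on disk."""
    meta_path = out_dir / "meta.json"
    if not meta_path.is_file():
        return True
    meta = json.loads(meta_path.read_text())
    return meta.get("example_count") != count_records(labels_dir)


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    labels, out = Path(tmp, "labels"), Path(tmp, "prompts")
    out.mkdir()
    for rid in ("0001", "0002"):
        (labels / rid).mkdir(parents=True)
        (labels / rid / "record.json").write_text(json.dumps({"id": rid}))
    (out / "meta.json").write_text(json.dumps({"example_count": 1}))
    stale_before = is_stale(labels, out)   # two records on disk, one compiled
    (out / "meta.json").write_text(json.dumps({"example_count": 2}))
    stale_after = is_stale(labels, out)    # counts now agree

print(stale_before, stale_after)
```

A count comparison misses edits to existing records, so it is only a cheap first check.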

Consume

from coaxer import CoaxedPrompt

p = CoaxedPrompt("prompts/repo-classification", role="classifier")  # bind defaults
filled = p(readme=new_readme, stars=1200)                         # render at call time

  • CoaxedPrompt(path, **bound): a str subclass; __new__ reads prompt.jinja.
  • str(p): the raw template.
  • p(**vars): renders with Jinja2 StrictUndefined; missing variables raise.
  • Call-time variables override bound defaults.

Because CoaxedPrompt is a str, it drops in anywhere a string is accepted (logging, OpenAI SDK messages, Anthropic SDK, DSPy signatures built externally, etc.).
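The pattern described above (a str subclass that binds defaults in `__new__` and renders when called) can be illustrated with a small standard-library stand-in. This is a sketch under stated assumptions, not coaxer's implementation: `BoundTemplate` is a hypothetical class, it takes the template text directly instead of reading prompt.jinja, and it swaps Jinja2 for a regex that fills `{{ name }}` slots and raises on missing names, which mimics StrictUndefined only for that simple case.

```python
import re


class BoundTemplate(str):
    """Hypothetical stand-in for the str-subclass pattern; not coaxer's code."""

    _SLOT = re.compile(r"\{\{\s*(\w+)\s*\}\}")

    def __new__(cls, template: str, **bound):
        obj = super().__new__(cls, template)  # the str value is the raw template
        obj._bound = dict(bound)              # defaults bound at construction
        return obj

    def __call__(self, **variables) -> str:
        values = {**self._bound, **variables}  # call-time values override defaults

        def fill(match: re.Match) -> str:
            name = match.group(1)
            if name not in values:
                # Fail loudly on a missing variable instead of rendering an empty string.
                raise KeyError(f"missing template variable: {name}")
            return str(values[name])

        return self._SLOT.sub(fill, str(self))


p = BoundTemplate("As a {{ role }}, rate {{ stars }} stars.", role="classifier")
print(p(stars=1200))
print(p(stars=5, role="reviewer"))
print(isinstance(p, str))
```

Subclassing str means the object can be passed unchanged to any API that expects a string, while the call syntax keeps rendering explicit.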

Compile LLMs

AgentLM routes compile calls through the Anthropic Agent SDK (Claude Code). OpenAILM hits any OpenAI-compatible endpoint (Ollama, vLLM, OpenAI).

from coaxer import AgentLM, OpenAILM

lm = AgentLM()                                # Claude via Agent SDK
lm = OpenAILM(model="llama3")                 # Ollama
lm = OpenAILM(model="gpt-4o", base_url="https://api.openai.com/v1", api_key="sk-...")

Both pass keyword arguments through to their underlying client.

Caching

Pass a Cachetta instance to cache LM responses on disk:

from cachetta import Cachetta
from coaxer import AgentLM

cache = Cachetta(path=lambda prompt, **kw: f"cache/{prompt}.pkl", duration="7d")
lm = AgentLM(cache=cache)

Install with the cache extra: uv add "coaxer[cache]".
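Raw prompt text can contain path separators and can exceed filename length limits, so a hashed cache key may be safer than embedding the prompt in the path. The sketch below is a suggestion under stated assumptions: `cache_path` is a hypothetical helper with the same call shape as the lambda above, and whether cachetta already sanitizes paths is not stated here.

```python
import hashlib


def cache_path(prompt: str, **kw) -> str:
    """Hypothetical path function: map a prompt to a fixed-length, filesystem-safe name."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return f"cache/{digest}.pkl"


print(cache_path("Classify this repo: owner/name"))
```

Assuming the `path` argument accepts any callable with this signature, the function could be passed as `path=cache_path` in place of the lambda.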

Development

uv sync --extra dev
uv run just test-unit   # Unit tests
uv run just ci          # Full CI (lint + format + typecheck + tests)
