A toolkit for batch LLM API calls driven by YAML configuration.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

These details have not been verified by PyPI

Project description

promptloom

promptloom is a Python toolkit that turns a prompt template and a YAML config file into a fully managed batch of LLM API calls. You write a Markdown prompt with {{PLACEHOLDER}} slots, declare your tasks and models in YAML, and the toolkit takes care of the rest: prompt assembly, concurrent async execution via LiteLLM, structured JSON extraction, schema & custom validation, a multi-turn correction loop for invalid responses, and detailed YAML reporting — including automatic re-generation configs for any failed runs.

Features

YAML-driven configuration — define experiments, tasks, models, parameters, and settings in a single YAML file.
Arbitrary prompt parameters — use {{PLACEHOLDER}} syntax in prompt templates; parameter values come from the YAML task definition.
File references — prefix parameter values with file: to read content from disk (e.g. file:data/law.txt).
System prompts — optional system-level messages, supporting both literal strings and file: references with placeholder substitution.
Multi-provider support — any model supported by LiteLLM (OpenAI, Anthropic, Google Gemini, Ollama, etc.).
Fully asynchronous — all API calls run concurrently via asyncio with configurable concurrency limits.
Response processing — built-in JSON extraction from raw LLM responses (handles code fences, raw JSON, and heuristic extraction).
Validation pipeline — chain validators (JSON Schema, custom Python functions) to check structured output for correctness.
Multi-turn correction loop — when validation fails, automatically send the error report back to the LLM and retry (configurable number of correction turns).
Three-phase pre-flight checks:
1. Model validation — two-tier approach: first checks litellm's built-in registry (instant, local); if unknown, queries the provider's model-list API (e.g. OpenRouter, Ollama) to confirm availability. Also verifies API key availability.
2. Placeholder validation — ensures every {{PLACEHOLDER}} in each template has a matching parameter; warns about unused parameters.
3. Validation config check — verifies response format, correction prompt existence, schema files, and validator specs.
Structured YAML reports — timestamped report with per-task, per-model status, error details, timing, token usage, and correction attempt counts.
Failed-run config — automatically generates a YAML config containing only the failed (task, model) pairs for convenient re-runs.
Programmatic API — use from Python scripts and Jupyter notebooks with both sync (run_experiment) and async (run_experiment_async) entry points.

Installation

pip install promptloom

For development (clone the repo, then install in editable mode):

git clone https://github.com/Nobulax/promptloom.git
cd promptloom
pip install -e ".[dev]"

Quick start

1. Create a prompt template

Write a Markdown file with {{PLACEHOLDER}} syntax for variable parts:

# Task

{{INSTRUCTION}}

# Document

{{DOCUMENT}}

2. Create a YAML config

experiment:
  name: "My Experiment"

defaults:
  models:
    - "openai/gpt-4o"
  prompt_template: "prompt.md"
  system_prompt: "You are a helpful assistant."
  output_dir: "output/"
  max_completion_tokens: 4000
  timeout: 120
  max_concurrency: 5

tasks:
  - id: "task-1"
    params:
      document: "file:data/input.txt"
      instruction: "Summarize this document."
  - id: "task-2"
    params:
      document: "Inline text content."
      instruction: "Translate this to German."
    models:
      - "openai/gpt-4o"
      - "anthropic/claude-sonnet-4-20250514"

3. Set up API keys

Copy .env.example to .env and fill in your keys:

OPENAI_API_KEY="sk-..."
ANTHROPIC_API_KEY="sk-ant-..."
GEMINI_API_KEY="..."

4. Run

From the command line:

promptloom run config.yaml
promptloom run config.yaml --dry-run       # validate only, no API calls
promptloom run config.yaml --skip-preflight # skip pre-flight checks

Or from Python:

from promptloom import run_experiment

results = run_experiment("config.yaml")

Structured output with validation

For tasks that require structured JSON output, the toolkit provides a complete processing pipeline: extract → validate → correct.

Example config

defaults:
  models:
    - "openai/gpt-4o"
  prompt_template: "prompt.md"
  system_prompt: "You are a helpful assistant that responds with valid JSON."
  response_format: "json"
  validators:
    - type: json_schema
      schema: "schemas/output_schema.json"
  correction_prompt: "correction.md"
  max_corrections: 3

tasks:
  - id: "generate-data"
    params:
      instruction: "Generate a structured summary."
      document: "file:data/input.txt"

Correction prompt template

The correction prompt is a Markdown file with a {{ERROR}} placeholder:

Your previous response was invalid. Here is the error report:

{{ERROR}}

Please correct your output. Return only the valid JSON object.

Custom validators

For domain-specific checks (e.g., data integrity, allowed labels, no duplicate IDs), write a Python function and reference it by its import path:

validators:
  - type: json_schema
    schema: "schemas/output.json"
  - type: custom
    callable: "mypackage.validators.check_integrity"

The function must have this signature:

from promptloom.validation import ValidationResult

def check_integrity(data, context):
    """
    data:    the parsed response (e.g., dict from JSON)
    context: {"task_id": ..., "params": {...}, "model": ..., "attempt": ...}
    """
    errors = []
    ids = [item["id"] for item in data["items"]]
    if len(ids) != len(set(ids)):
        errors.append("Duplicate IDs found")
    
    if errors:
        return ValidationResult.fail("\n".join(errors))
    return ValidationResult.ok(data)

The error string from ValidationResult.fail() is substituted into the {{ERROR}} placeholder of the correction prompt and sent back to the LLM.

How the correction loop works

LLM responds → response processor runs (e.g., JSON extraction).
Validators run in order. First failure stops the chain.
If validation fails and corrections remain:
- The assistant's response is appended to the conversation.
- A correction prompt with {{ERROR}} filled in is appended.
- The LLM is called again with the full conversation history.
Repeat up to max_corrections times.
If all corrections are exhausted, the task is marked as failed (but the last output is still saved for inspection).

YAML config reference

A fully-commented reference config showing every available field is at examples/config_full.yaml.

`experiment` (optional)

Field	Type	Description
`name`	string	Human-readable experiment name.
`description`	string	Longer description.

`defaults` (optional)

Global defaults applied to all tasks unless overridden per-task.

Field	Type	Default	Description
`models`	list	—	LiteLLM model identifiers.
`prompt_template`	string	—	Path to the prompt template file.
`system_prompt`	string	`null`	System message (literal or `file:` reference).
`output_dir`	string	`output`	Base output directory.
`max_completion_tokens`	int	`64000`	Max tokens in LLM response.
`timeout`	int	`null`	Timeout in seconds per API call.
`max_concurrency`	int	`10`	Max parallel API calls.
`ignore_unused_params`	bool	`false`	Auto-continue on unused-param warnings.
`response_format`	string	`text`	Response processor: `"text"` or `"json"`.
`validators`	list	`[]`	Ordered list of validator specs.
`correction_prompt`	string	`null`	Path to correction prompt template (needs `{{ERROR}}`).
`max_corrections`	int	`0`	Max correction turns on validation failure.

`tasks` (required)

List of task objects. Each task defines one prompt sent to one or more models. All defaults fields can be overridden per-task.

Field	Type	Required	Description
`id`	string	yes	Unique task identifier.
`params`	dict	no	Key-value pairs substituted into the prompt template.
`models`	list	no	Override default models for this task.
`prompt_template`	string	no	Override default prompt template.
`system_prompt`	string	no	Override default system prompt.
`output_dir`	string	no	Override output directory.
`max_completion_tokens`	int	no	Override max tokens.
`timeout`	int	no	Override timeout.
`response_format`	string	no	Override response processor.
`validators`	list	no	Override validators.
`correction_prompt`	string	no	Override correction prompt template.
`max_corrections`	int	no	Override max corrections.

Parameter values

Parameter values in params are strings by default. To include file contents, prefix the value with file::

params:
  document: "file:data/input.txt"     # reads file content
  instruction: "Summarize this."       # literal string

File paths are resolved relative to the YAML config file's directory.

Output structure

output/
  task-1/
    task-1_openai_gpt-4o.txt           # text format
  task-2/
    task-2_openai_gpt-4o.json          # json format (pretty-printed)
    task-2_anthropic_claude-sonnet-4-20250514.json
  config_report_20260319_143000.yaml    # timestamped report
  config_failed.yaml                    # only if there were failures

Pre-flight checks

Before dispatching API calls, three checks run:

Model validation — uses a two-tier approach. First, litellm's built-in model registry is checked (instant, local). If the model is not in litellm's static registry (common for aggregator providers like OpenRouter), a lightweight GET /v1/models call is made to the provider to confirm the model actually exists. This remote check is free (no tokens consumed), fast, and cached per provider. Also verifies that the required API keys / environment variables are set.
Placeholder validation — for each task, verifies that every {{PLACEHOLDER}} in the prompt template has a matching key in the task's params. Missing params are fatal errors. Unused params are warnings (auto-continued if ignore_unused_params: true).
Validation config — checks that response_format is valid, correction prompt files exist and contain {{ERROR}}, schema files exist, and validator specs are well-formed.

All checks run to completion before any abort decision, so you see all problems at once.

Programmatic API

from promptloom import load_config, run_experiment, run_experiment_async
from promptloom.validation import ValidationResult

# Synchronous (from scripts)
results = run_experiment("config.yaml")

# Async (from notebooks or async code)
config = load_config("config.yaml")
results = await run_experiment_async(config, skip_preflight=True)

License

MIT

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

Nobulax

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.3.0

Apr 15, 2026

0.2.2

Apr 15, 2026

0.2.1

Apr 13, 2026

0.2.0

Apr 13, 2026

0.1.0

Mar 30, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

promptloom-0.3.0.tar.gz (41.3 kB view details)

Uploaded Apr 15, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

promptloom-0.3.0-py3-none-any.whl (32.7 kB view details)

Uploaded Apr 15, 2026 Python 3

File details

Details for the file promptloom-0.3.0.tar.gz.

File metadata

Download URL: promptloom-0.3.0.tar.gz
Upload date: Apr 15, 2026
Size: 41.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for promptloom-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`770d40d7493881f7475d5f07f1a51ecad6add5f237e33355809d664b6f05c1f2`
MD5	`0b79f03400e34d560aeb44e9b313eda4`
BLAKE2b-256	`dba6ca670f28370166dc2018427834eeb26dea6319eb51a03973d91b35d3609f`

See more details on using hashes here.

Provenance

The following attestation bundles were made for promptloom-0.3.0.tar.gz:

Publisher: publish.yml on Nobulax/promptloom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: promptloom-0.3.0.tar.gz
- Subject digest: 770d40d7493881f7475d5f07f1a51ecad6add5f237e33355809d664b6f05c1f2
- Sigstore transparency entry: 1309391952
- Sigstore integration time: Apr 15, 2026
Source repository:
- Permalink: Nobulax/promptloom@7edbaa29ce5c7527c2d8e1933e5427b4b055774a
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/Nobulax
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7edbaa29ce5c7527c2d8e1933e5427b4b055774a
- Trigger Event: push

File details

Details for the file promptloom-0.3.0-py3-none-any.whl.

File metadata

Download URL: promptloom-0.3.0-py3-none-any.whl
Upload date: Apr 15, 2026
Size: 32.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for promptloom-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`59e2dc52d17dcf1943942b63e05635b5d2ce33f4fbb249066ae01f450551a484`
MD5	`e104264be7bde3d298bc1de61e1449d7`
BLAKE2b-256	`ff4d48bf4a0afd472bf296c8392a1e194fac5d17906b4fd0f3fe23139345e9c1`

See more details on using hashes here.

Provenance

The following attestation bundles were made for promptloom-0.3.0-py3-none-any.whl:

Publisher: publish.yml on Nobulax/promptloom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: promptloom-0.3.0-py3-none-any.whl
- Subject digest: 59e2dc52d17dcf1943942b63e05635b5d2ce33f4fbb249066ae01f450551a484
- Sigstore transparency entry: 1309392013
- Sigstore integration time: Apr 15, 2026
Source repository:
- Permalink: Nobulax/promptloom@7edbaa29ce5c7527c2d8e1933e5427b4b055774a
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/Nobulax
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7edbaa29ce5c7527c2d8e1933e5427b4b055774a
- Trigger Event: push

promptloom 0.3.0

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

promptloom

Features

Installation

Quick start

1. Create a prompt template

2. Create a YAML config

3. Set up API keys

4. Run

Structured output with validation

Example config

Correction prompt template

Custom validators

How the correction loop works

YAML config reference

experiment (optional)

defaults (optional)

tasks (required)

Parameter values

Output structure

Pre-flight checks

Programmatic API

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance

`experiment` (optional)

`defaults` (optional)

`tasks` (required)