prompttest
Test framework for LLM prompt files.
What It Does
prompttest is a test framework for LLM prompt files, similar to what Jest or pytest does for application code. You write test suites in YAML that assert properties of your prompt files -- content, structure, token counts, cost limits, and more. Tests run in CI to catch prompt regressions.
Installation
pip install prompttest
Dependencies: prompttools-core >= 1.0, promptcost >= 1.0, typer >= 0.12, pyyaml >= 6.0, rich >= 13.0
CLI Commands
prompttest run
Run prompt tests from a file or directory.
# Run a single test file
prompttest run tests/test_greeting.yaml
# Run all test files in a directory
prompttest run tests/
# Run with custom glob pattern
prompttest run tests/ --pattern "check_*.yaml"
# Stop on first failure
prompttest run tests/ --fail-fast
# JSON output
prompttest run tests/ --format json
# JUnit XML output (for CI)
prompttest run tests/ --format junit
# Verbose output
prompttest run tests/ -v
Options:

| Option | Default | Description |
|---|---|---|
| --format, -f | text | Output format: text, json, junit |
| --model, -m | none | Override model for cost/token assertions |
| --fail-fast | false | Stop after first failure |
| --verbose, -v | false | Show detailed output for all tests |
| --pattern, -p | test_*.yaml | Glob pattern for test file discovery |
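Discovery walks the target directory and collects files matching the `--pattern` glob. A minimal sketch of that behavior with `pathlib` (a hypothetical helper, not the library's internals):

```python
from pathlib import Path

def discover(root: str, pattern: str = "test_*.yaml") -> list[Path]:
    """Collect test files under root matching the glob pattern.

    A single file is returned as-is; directories are searched
    recursively, sorted for stable ordering.
    """
    p = Path(root)
    if p.is_file():
        return [p]
    return sorted(p.rglob(pattern))
```

The sort keeps run order deterministic across filesystems, which matters for reproducible CI logs.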
prompttest init
Create an example test file in the current directory.
prompttest init
This creates test_example.yaml with sample test cases you can adapt to your project.
Test File Format
Test files are YAML with this structure:
suite: my-test-suite            # Suite name (optional, defaults to filename)
prompt: prompts/greeting.yaml   # Path to the prompt file (relative to test file)
model: gpt-4o                   # Default model for cost/token assertions (optional)
tests:
  - name: test-name             # Unique test name
    assert: assertion_type      # One of the 15 assertion types below
    # ... assertion-specific parameters
The prompt path is resolved relative to the test file's directory.
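That resolution rule can be sketched in a few lines (an illustration of the rule, not the library's code):

```python
from pathlib import Path

def resolve_prompt_path(test_file: str, prompt: str) -> Path:
    """Resolve the suite's prompt path relative to the test file's directory."""
    return (Path(test_file).parent / prompt).resolve()
```

So a suite at `tests/test_greeting.yaml` declaring `prompt: prompts/greeting.yaml` points at `tests/prompts/greeting.yaml`, regardless of the directory `prompttest` is invoked from.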
Assertion Types
prompttest supports 15 assertion types:
Content Assertions
contains
Assert that prompt content contains specific text.
- name: has-greeting-instruction
  assert: contains
  text: "greet the user"
  case_sensitive: false   # optional, default: false
not_contains
Assert that prompt content does NOT contain specific text.
- name: no-injection-risk
  assert: not_contains
  text: "ignore previous instructions"
matches_regex
Assert that prompt content matches a regular expression.
- name: has-version-tag
  assert: matches_regex
  pattern: "v\\d+\\.\\d+"
  case_sensitive: false
not_matches_regex
Assert that prompt content does NOT match a regular expression.
- name: no-hardcoded-urls
  assert: not_matches_regex
  pattern: "https?://api\\.example\\.com"
Structure Assertions
has_role
Assert that the prompt has a message with a given role.
- name: has-system-message
  assert: has_role
  role: system
has_variables
Assert that the prompt uses specific template variables.
- name: required-variables
  assert: has_variables
  variables:
    - user_name
    - context
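Variable detection depends on the template syntax, which is not specified here; assuming `{{ name }}`-style placeholders, the check can be sketched as:

```python
import re

def extract_variables(content: str) -> set[str]:
    """Collect placeholder names, assuming {{ name }}-style template syntax."""
    return set(re.findall(r"\{\{\s*(\w+)\s*\}\}", content))

def has_variables(content: str, required: list[str]) -> bool:
    """True if every required variable appears somewhere in the content."""
    return set(required) <= extract_variables(content)
```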
has_metadata
Assert that the prompt has specific metadata keys.
- name: has-required-metadata
  assert: has_metadata
  keys:
    - model
    - description
valid_format
Assert that the prompt file parsed without errors and contains at least one message.
- name: parseable-prompt
  assert: valid_format
Token/Size Assertions
max_tokens
Assert that total token count is under a maximum.
- name: within-context-window
  assert: max_tokens
  max: 4096
min_tokens
Assert that total token count is above a minimum.
- name: not-too-short
  assert: min_tokens
  min: 50
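Real token counts come from the model's tokenizer; as a rough illustration of how `max_tokens`/`min_tokens` bounds are applied (the chars/4 heuristic below is explicitly not the library's counting method):

```python
def approx_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text.
    Real assertions should use the model's own tokenizer."""
    return max(1, len(text) // 4)

def check_max_tokens(messages: list[str], limit: int) -> bool:
    """Sum the per-message estimates and compare against the bound."""
    return sum(approx_tokens(m) for m in messages) <= limit
```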
max_messages
Assert that message count is under a maximum.
- name: reasonable-conversation
  assert: max_messages
  max: 10
min_messages
Assert that message count is above a minimum.
- name: has-enough-context
  assert: min_messages
  min: 2
token_ratio
Assert that the system/user token ratio is within bounds.
- name: balanced-prompt
  assert: token_ratio
  ratio_max: 5.0
The ratio is computed as system_tokens / user_tokens.
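The stated formula is straightforward; a sketch that also guards the divide-by-zero case (how the library handles zero user tokens is an assumption):

```python
def token_ratio(system_tokens: int, user_tokens: int) -> float:
    """system_tokens / user_tokens, per the definition above;
    treat an empty user side as an infinite ratio."""
    if user_tokens == 0:
        return float("inf")
    return system_tokens / user_tokens

def check_token_ratio(system_tokens: int, user_tokens: int, ratio_max: float) -> bool:
    return token_ratio(system_tokens, user_tokens) <= ratio_max
```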
Cost Assertions
max_cost
Assert that the estimated cost per invocation is under a budget ceiling. Requires a model (set on the test or the suite).
- name: cost-under-budget
  assert: max_cost
  max: 0.05
  model: gpt-4o   # optional if set on suite
Regression Assertions
content_hash
Assert that the prompt content SHA256 hash matches an expected value. Detects unexpected prompt changes.
- name: prompt-unchanged
  assert: content_hash
  hash: "a1b2c3d4..."   # omit to record current hash (always passes)
If hash is omitted, the test passes and reports the current hash so you can record it.
Test Options
Each test case supports these common options:
- name: example-test
  assert: contains
  text: "hello"
  skip: true                # Skip this test
  skip_reason: "not ready"  # Reason for skipping
  case_sensitive: false     # For text/regex assertions (default: false)
  model: gpt-4o             # Override suite model for this test
Output Formats
Text (default)
Rich-formatted terminal output with colored pass/fail indicators.
Suite: greeting-tests
Prompt: prompts/greeting.yaml

  PASS  has-system-message
  PASS  token-count-reasonable
  FAIL  no-injection-risk
        Content unexpectedly contains 'ignore previous instructions'
  PASS  cost-under-budget

Results: 3 passed, 1 failed (4 total)
Duration: 12ms
JSON
prompttest run tests/ --format json
Returns a JSON object with total, passed, failed, errors, skipped, duration_ms, and detailed suites array.
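An illustrative report with those top-level fields (the exact shape of each `suites` entry is an assumption, not the documented schema):

```python
import json

# Top-level keys match the fields named above; the per-suite
# structure here is a plausible sketch only.
report = {
    "total": 4,
    "passed": 3,
    "failed": 1,
    "errors": 0,
    "skipped": 0,
    "duration_ms": 12,
    "suites": [
        {
            "suite": "greeting-tests",
            "prompt": "prompts/greeting.yaml",
            "tests": [
                {"name": "no-injection-risk", "status": "failed",
                 "message": "Content unexpectedly contains 'ignore previous instructions'"},
            ],
        }
    ],
}
print(json.dumps(report, indent=2))
```

Machine-readable output like this is what you would feed into dashboards or custom CI gates.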
JUnit XML
prompttest run tests/ --format junit
Standard JUnit XML format compatible with CI systems (GitHub Actions, Jenkins, GitLab CI, CircleCI).
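For reference, a minimal JUnit document has a `<testsuite>` element with `<testcase>` children, failed cases carrying a `<failure>` child. A sketch of that structure with the standard library (an illustration of the format, not the library's renderer):

```python
import xml.etree.ElementTree as ET

def to_junit(suite_name: str, results: list[dict]) -> str:
    """Render results as minimal JUnit XML."""
    failures = sum(1 for r in results if r["status"] == "failed")
    ts = ET.Element("testsuite", name=suite_name,
                    tests=str(len(results)), failures=str(failures))
    for r in results:
        tc = ET.SubElement(ts, "testcase", name=r["name"])
        if r["status"] == "failed":
            ET.SubElement(tc, "failure", message=r.get("message", ""))
    return ET.tostring(ts, encoding="unicode")
```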
Programmatic Usage
from prompttest import (
    load_test_suite,
    run_test_suite,
    run_test_file,
    run_test_directory,
    discover_test_files,
    format_text,
    format_json,
    format_junit,
)

# Run a single test file
report = run_test_file("tests/test_greeting.yaml")
print(f"Passed: {report.passed}/{report.total}")

# Run all tests in a directory
report = run_test_directory("tests/", fail_fast=True, pattern="test_*.yaml")

# Format output
print(format_text(report))
print(format_json(report))
print(format_junit(report))

# Load and run a suite manually
suite = load_test_suite("tests/test_greeting.yaml")
results = run_test_suite(suite, fail_fast=False)
for r in results:
    print(f"{r.test_name}: {r.status.value} - {r.message}")
CI Integration
GitHub Actions
- name: Run prompt tests
  run: prompttest run tests/ --format junit > test-results.xml
- name: Upload test results
  uses: actions/upload-artifact@v4
  with:
    name: prompt-test-results
    path: test-results.xml
GitLab CI
prompt-tests:
  script:
    - pip install prompttest
    - prompttest run tests/ --format junit > report.xml
  artifacts:
    reports:
      junit: report.xml
Exit codes:
| Code | Meaning |
|---|---|
| 0 | All tests passed (or no tests found) |
| 1 | One or more tests failed or errored |
| 2 | Path not found |
License
MIT License. Author: Scott Converse.
File details
Details for the file prompttest_ai-1.0.0.tar.gz.
File metadata
- Download URL: prompttest_ai-1.0.0.tar.gz
- Upload date:
- Size: 21.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | da4b70b9262d5c226b4692e2033ce308fe927300cd5db185a94cb21ceaab6af2 |
| MD5 | 2bb817188a43b0bbefa4de04fd1a2623 |
| BLAKE2b-256 | 343f509bf655309059ddb224ca0450abbfc5cf60ac8143685dee4298c419cf9e |
File details
Details for the file prompttest_ai-1.0.0-py3-none-any.whl.
File metadata
- Download URL: prompttest_ai-1.0.0-py3-none-any.whl
- Upload date:
- Size: 15.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a8fe9b7044fbb7b559c7423ed32bb9ed48ce8a1bf42b1390641564dd286f91ce |
| MD5 | 972b6bdb26c5ee1c710ab177bfb1c096 |
| BLAKE2b-256 | e88ab1d52236e01e5602883888ce0b11c62d409f39c77acc2a101e71fd4b57c4 |