Skip to main content

Test framework for LLM prompt files

Project description

prompttest

Test framework for LLM prompt files.

PyPI License: MIT Python 3.9+

What It Does

prompttest is a test framework for LLM prompt files, similar to what Jest or pytest does for application code. You write test suites in YAML that assert properties of your prompt files -- content, structure, token counts, cost limits, and more. Tests run in CI to catch prompt regressions.

Installation

pip install prompttest

Dependencies: prompttools-core >= 1.0, promptcost >= 1.0, typer >= 0.12, pyyaml >= 6.0, rich >= 13.0

CLI Commands

prompttest run

Run prompt tests from a file or directory.

# Run a single test file
prompttest run tests/test_greeting.yaml

# Run all test files in a directory
prompttest run tests/

# Run with custom glob pattern
prompttest run tests/ --pattern "check_*.yaml"

# Stop on first failure
prompttest run tests/ --fail-fast

# JSON output
prompttest run tests/ --format json

# JUnit XML output (for CI)
prompttest run tests/ --format junit

# Verbose output
prompttest run tests/ -v

Options:

Option Default Description
--format, -f text Output format: text, json, junit
--model, -m none Override model for cost/token assertions
--fail-fast false Stop after first failure
--verbose, -v false Show detailed output for all tests
--pattern, -p test_*.yaml Glob pattern for test file discovery

prompttest init

Create an example test file in the current directory.

prompttest init

This creates test_example.yaml with sample test cases you can adapt to your project.

Test File Format

Test files are YAML with this structure:

suite: my-test-suite          # Suite name (optional, defaults to filename)
prompt: prompts/greeting.yaml  # Path to the prompt file (relative to test file)
model: gpt-4o                 # Default model for cost/token assertions (optional)

tests:
  - name: test-name           # Unique test name
    assert: assertion_type    # One of the 15 assertion types below
    # ... assertion-specific parameters

The prompt path is resolved relative to the test file's directory.

Assertion Types

prompttest supports 15 assertion types:

Content Assertions

contains

Assert that prompt content contains specific text.

- name: has-greeting-instruction
  assert: contains
  text: "greet the user"
  case_sensitive: false    # optional, default: false

not_contains

Assert that prompt content does NOT contain specific text.

- name: no-injection-risk
  assert: not_contains
  text: "ignore previous instructions"

matches_regex

Assert that prompt content matches a regular expression.

- name: has-version-tag
  assert: matches_regex
  pattern: "v\\d+\\.\\d+"
  case_sensitive: false

not_matches_regex

Assert that prompt content does NOT match a regular expression.

- name: no-hardcoded-urls
  assert: not_matches_regex
  pattern: "https?://api\\.example\\.com"

Structure Assertions

has_role

Assert that the prompt has a message with a given role.

- name: has-system-message
  assert: has_role
  role: system

has_variables

Assert that the prompt uses specific template variables.

- name: required-variables
  assert: has_variables
  variables:
    - user_name
    - context

has_metadata

Assert that the prompt has specific metadata keys.

- name: has-required-metadata
  assert: has_metadata
  keys:
    - model
    - description

valid_format

Assert that the prompt file parsed without errors and contains at least one message.

- name: parseable-prompt
  assert: valid_format

Token/Size Assertions

max_tokens

Assert that total token count is under a maximum.

- name: within-context-window
  assert: max_tokens
  max: 4096

min_tokens

Assert that total token count is above a minimum.

- name: not-too-short
  assert: min_tokens
  min: 50

max_messages

Assert that message count is under a maximum.

- name: reasonable-conversation
  assert: max_messages
  max: 10

min_messages

Assert that message count is above a minimum.

- name: has-enough-context
  assert: min_messages
  min: 2

token_ratio

Assert that the system/user token ratio is within bounds.

- name: balanced-prompt
  assert: token_ratio
  ratio_max: 5.0

The ratio is computed as system_tokens / user_tokens.

Cost Assertions

max_cost

Assert that the estimated cost per invocation is under a budget ceiling. Requires a model (set on the test or the suite).

- name: cost-under-budget
  assert: max_cost
  max: 0.05
  model: gpt-4o      # optional if set on suite

Regression Assertions

content_hash

Assert that the prompt content SHA256 hash matches an expected value. Detects unexpected prompt changes.

- name: prompt-unchanged
  assert: content_hash
  hash: "a1b2c3d4..."   # omit to record current hash (always passes)

If hash is omitted, the test passes and reports the current hash so you can record it.

Test Options

Each test case supports these common options:

- name: example-test
  assert: contains
  text: "hello"
  skip: true               # Skip this test
  skip_reason: "not ready" # Reason for skipping
  case_sensitive: false     # For text/regex assertions (default: false)
  model: gpt-4o            # Override suite model for this test

Output Formats

Text (default)

Rich-formatted terminal output with colored pass/fail indicators.

Suite: greeting-tests
Prompt: prompts/greeting.yaml

  PASS  has-system-message
  PASS  token-count-reasonable
  FAIL  no-injection-risk
         Content unexpectedly contains 'ignore previous instructions'
  PASS  cost-under-budget

Results:
  3 passed, 1 failed (4 total)
  Duration: 12ms

JSON

prompttest run tests/ --format json

Returns a JSON object with total, passed, failed, errors, skipped, duration_ms, and detailed suites array.

JUnit XML

prompttest run tests/ --format junit

Standard JUnit XML format compatible with CI systems (GitHub Actions, Jenkins, GitLab CI, CircleCI).

Programmatic Usage

from prompttest import (
    load_test_suite,
    run_test_suite,
    run_test_file,
    run_test_directory,
    discover_test_files,
    format_text,
    format_json,
    format_junit,
)

# Run a single test file
report = run_test_file("tests/test_greeting.yaml")
print(f"Passed: {report.passed}/{report.total}")

# Run all tests in a directory
report = run_test_directory("tests/", fail_fast=True, pattern="test_*.yaml")

# Format output
print(format_text(report))
print(format_json(report))
print(format_junit(report))

# Load and run a suite manually
suite = load_test_suite("tests/test_greeting.yaml")
results = run_test_suite(suite, fail_fast=False)
for r in results:
    print(f"{r.test_name}: {r.status.value} - {r.message}")

CI Integration

GitHub Actions

- name: Run prompt tests
  run: prompttest run tests/ --format junit > test-results.xml

- name: Upload test results
  uses: actions/upload-artifact@v4
  with:
    name: prompt-test-results
    path: test-results.xml

GitLab CI

prompt-tests:
  script:
    - pip install prompttest
    - prompttest run tests/ --format junit > report.xml
  artifacts:
    reports:
      junit: report.xml

Exit codes:

Code Meaning
0 All tests passed (or no tests found)
1 One or more tests failed or errored
2 Path not found

License

MIT License. Author: Scott Converse.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

prompttest_ai-1.0.0.tar.gz (21.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

prompttest_ai-1.0.0-py3-none-any.whl (15.6 kB view details)

Uploaded Python 3

File details

Details for the file prompttest_ai-1.0.0.tar.gz.

File metadata

  • Download URL: prompttest_ai-1.0.0.tar.gz
  • Upload date:
  • Size: 21.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for prompttest_ai-1.0.0.tar.gz
Algorithm Hash digest
SHA256 da4b70b9262d5c226b4692e2033ce308fe927300cd5db185a94cb21ceaab6af2
MD5 2bb817188a43b0bbefa4de04fd1a2623
BLAKE2b-256 343f509bf655309059ddb224ca0450abbfc5cf60ac8143685dee4298c419cf9e

See more details on using hashes here.

File details

Details for the file prompttest_ai-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: prompttest_ai-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 15.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for prompttest_ai-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a8fe9b7044fbb7b559c7423ed32bb9ed48ce8a1bf42b1390641564dd286f91ce
MD5 972b6bdb26c5ee1c710ab177bfb1c096
BLAKE2b-256 e88ab1d52236e01e5602883888ce0b11c62d409f39c77acc2a101e71fd4b57c4

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page