
A modular evaluation framework for testing functions with YAML-based specifications

Project description

VOWEL

A YAML-based evaluation framework for testing Python functions, with AI-powered test generation, function healing, and a TDD workflow.

vowel makes it easy to define test cases in YAML and run them against your Python functions. It also provides AI-powered generators that can automatically create test specs, generate implementations, and fix buggy functions.


Installation

pip install vowel

# Or with uv
uv add vowel

Optional Dependencies

Vowel supports several optional dependency groups for enhanced functionality:

Group          Install Command                Purpose / Extras
all            pip install vowel[all]         All optional features
dev            pip install vowel[dev]         Development & testing tools
mcp            pip install vowel[mcp]         MCP server
optimization   pip install vowel[optimize]    Performance optimizations
monty          pip install vowel[monty]       Monty runtime support
logfire        pip install vowel[logfire]     Logfire integration

Tip:
You can install multiple extras at once, e.g. pip install vowel[dev,mcp].
Recommended: pip install vowel[all]

Development

git clone https://github.com/fswair/vowel.git
cd vowel
pip install -e ".[all]"

Quick Start

Note:
For a deeper understanding of how vowel handles fixtures, see the examples in examples/db_fixtures. These examples demonstrate the underlying mechanics of fixture setup and usage.

Tip:
To enable YAML schema validation in your editor, place vowel-schema.json in your project directory.
Then, add the following directive at the top of your YAML file to activate schema support and instructions:

# yaml-language-server: $schema=<path/to/vowel-schema.json>

Replace <path/to/vowel-schema.json> with the actual path to your schema file.

1. Create a YAML spec

# evals.yml
add:
  dataset:
    - case:
        inputs: { x: 2, y: 2 }
        expected: 4
    - case:
        inputs: { x: -5, y: 5 }
        expected: 0

divide:
  evals:
    Type:
      type: "float"
  dataset:
    - case:
        inputs: { a: 10, b: 2 }
        expected: 5.0
    - case:
        inputs: { a: 1, b: 0 }
        raises: ZeroDivisionError

2. Run from CLI

vowel evals.yml

3. Or programmatically

from vowel import run_evals

def add(x: int, y: int) -> int:
    return x + y

def divide(a: float, b: float) -> float:
    return a / b

summary = run_evals("evals.yml", functions={"add": add, "divide": divide})
print(f"All passed: {summary.all_passed}")
print(f"Coverage: {summary.coverage * 100:.1f}%")

4. Or use the fluent API

from vowel import RunEvals

summary = (
    RunEvals.from_file("evals.yml")
    .with_functions({"add": add, "divide": divide})
    .filter(["add"])
    .debug()
    .run()
)

summary.print()

Name matching note: If your YAML uses module.function, programmatic mappings can use either the exact key (module.function) or the short function name (function) in .with_functions(...).
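The matching rule can be illustrated with a small sketch: try the exact YAML key first, then fall back to the short function name. This is an assumed illustration of the behavior described above, not vowel's actual implementation.

```python
# Assumed sketch of the name-matching rule: exact key first, then the
# short function name after the last dot.
def resolve(yaml_key: str, mapping: dict):
    if yaml_key in mapping:
        return mapping[yaml_key]
    short = yaml_key.rsplit(".", 1)[-1]  # "module.function" -> "function"
    return mapping.get(short)

functions = {"add": lambda x, y: x + y}
assert resolve("module.add", functions)(2, 3) == 5  # short-name fallback
assert resolve("add", functions)(2, 3) == 5         # exact key
```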


Features

Evaluators

8 built-in evaluators for flexible testing:

Evaluator      Purpose
Expected       Exact value matching
Type           Return type checking (strict/lenient)
Assertion      Custom Python expressions (output > 0, output == input * 2)
Duration       Performance constraints (function-level & case-level)
Pattern        Regex validation on output
ContainsInput  Verify output contains the input
Raises         Exception class + optional message matching
LLMJudge       AI-powered rubric evaluation

factorial:
  evals:
    Assertion:
      assertion: "output > 0"
    Type:
      type: "int"
    Duration:
      duration: 1.0
  dataset:
    - case: { input: 0, expected: 1 }
    - case: { input: 5, expected: 120 }
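Conceptually, an Assertion evaluator runs the expression string with the case's output (and inputs) in scope. The sketch below is a hedged illustration of that idea; vowel's real evaluation context may differ.

```python
# Assumed sketch: evaluate an assertion expression with `output` and the
# case inputs bound as names. Not vowel's actual internals.
def check_assertion(expression: str, output, inputs: dict) -> bool:
    context = {"output": output, **inputs}
    return bool(eval(expression, {"__builtins__": {}}, context))

assert check_assertion("output > 0", 120, {"input": 5})
assert check_assertion("output == input * 2", 10, {"input": 5})
```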

Full reference: docs/EVALUATORS.md

Fixtures (Dependency Injection)

Inject databases, temp files, caches into functions under test. Three patterns: generator (yield), tuple (setup/teardown), simple (setup only).

fixtures:
  db:
    setup: myapp.setup_db
    teardown: myapp.close_db
    scope: module

query_user:
  fixture: [db]
  dataset:
    - case:
        inputs: { user_id: 1 }
        expected: { name: "Alice" }

def query_user(user_id: int, *, db: dict) -> dict | None:
    return db["users"].get(user_id)
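The three fixture shapes named above can be sketched as plain Python functions. This is a hedged illustration of the patterns, with a hypothetical in-memory db; see docs/FIXTURES.md for the real contract.

```python
# Generator pattern: setup before yield, teardown after.
def db_generator():
    db = {"users": {1: {"name": "Alice"}}}  # setup
    yield db                                # value injected into the function
    db.clear()                              # teardown when the scope ends

# Tuple pattern: return (value, teardown callable).
def db_tuple():
    db = {"users": {1: {"name": "Alice"}}}
    return db, db.clear

# Simple pattern: setup only, no teardown.
def db_simple():
    return {"users": {1: {"name": "Alice"}}}

gen = db_generator()
db = next(gen)  # what the fixture hands to query_user
assert db["users"][1]["name"] == "Alice"
```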

Fixture scope aliases:

  • Preferred scope names: case, eval, file
  • Backward-compatible aliases: function, module, session
  • Normalization mapping: case -> function, eval -> module, file -> session

Example:

fixtures:
  temp_data:
    setup: myapp.make_temp_data
    scope: case

  db:
    setup: myapp.setup_db
    teardown: myapp.close_db
    scope: eval

  cache:
    setup: myapp.setup_cache
    scope: file
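The normalization mapping above amounts to a small alias table; this sketch names a hypothetical helper, not vowel's internal one.

```python
# Preferred scope names map onto the backward-compatible aliases.
SCOPE_ALIASES = {"case": "function", "eval": "module", "file": "session"}

def normalize_scope(scope: str) -> str:
    # Aliases pass through unchanged; preferred names are translated.
    return SCOPE_ALIASES.get(scope, scope)

assert normalize_scope("case") == "function"
assert normalize_scope("eval") == "module"
assert normalize_scope("session") == "session"  # already an alias
```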

Full reference: docs/FIXTURES.md

Input Serializers

Transform YAML inputs into Pydantic models, dates, or custom types:

summary = (
    RunEvals.from_file("evals.yml")
    .with_functions({"get_user": get_user})
    .with_serializer({"get_user": User})      # Schema mode
    .run()
)

Serializer key matching: Serializer mappings follow the same rule as .with_functions(...) — both module.function and short function keys are accepted.

Assertion context and serializers: When a serializer is configured, assertion evaluators use the serialized value for input (not raw YAML). This applies to schema mode, serial_fn, and nested/dict schemas.
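The effect of schema mode can be illustrated in isolation: a raw YAML mapping is coerced into a typed object before the function (and assertion expressions) see it. `User` below is a stand-in dataclass for illustration; vowel's serializers target Pydantic models.

```python
from dataclasses import dataclass

# Stand-in for a Pydantic model (assumption for this sketch).
@dataclass
class User:
    user_id: int
    name: str = ""

raw_inputs = {"user_id": 1, "name": "Alice"}  # as parsed from YAML
serialized = User(**raw_inputs)               # what assertions see as `input`

assert serialized.user_id == 1
assert serialized.name == "Alice"
```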

Runnable example (YAML-native serializers + fixtures):

vowel examples/serializers/db_query_evals.yml

This example demonstrates:

  • top-level serializers: registry with both schema and serializer entries,
  • per-eval serializer: references,
  • fixture class lifecycle wiring with cls + teardown,
  • assertion checks that read serialized input values.

See:

  • examples/serializers/db_query_evals.yml
  • examples/serializers/util.py

Full reference: docs/SERIALIZERS.md

AI-Powered Generation

EvalGenerator — test existing functions

from vowel import EvalGenerator, Function

generator = EvalGenerator(model="openai:gpt-4o", load_env=True)
func = Function.from_callable(my_function)

result = generator.generate_and_run(func, auto_retry=True, heal_function=True)
print(f"Coverage: {result.summary.coverage * 100:.1f}%")

TDDGenerator — generate everything from a description

from vowel.tdd import TDDGenerator

generator = TDDGenerator(model="gemini-3-flash-preview", load_env=True)

result = generator.generate_all(
    description="Binary search for target in sorted list. Returns index or -1.",
    name="binary_search"
)

result.print()  # Shows: signature → tests → code → results

Step-by-step control:

description = "Calculate factorial of a non-negative integer"
signature = generator.generate_signature(description=description, name="factorial")
runner, yaml_spec = generator.generate_evals_from_signature(signature, description=description)
func = generator.generate_implementation(signature, yaml_spec, description=description)
summary = runner.with_functions({"factorial": func.impl}).run()

Full reference: docs/AI_GENERATION.md

MCP Server

Expose vowel's capabilities to AI assistants like Claude Desktop via Model Context Protocol.

Setup guide: docs/MCP.md


CLI

vowel evals.yml                          # Run single file
vowel -d ./tests                         # Run directory
vowel evals.yml -f add,divide            # Filter functions
vowel evals.yml --ci --cov 90            # CI mode
vowel evals.yml --watch                  # Watch mode
vowel evals.yml --dry-run                # Show plan without running
vowel evals.yml --export-json out.json   # Export results
vowel evals.yml -v                       # Verbose summary
vowel evals.yml -v --hide-report         # Verbose, hide pydantic_evals report
vowel schema examples/serializers/db_query_evals.yml   # Validate + update schema header
vowel schema --create                                   # Generate vowel-schema.json
vowel costs --list                                      # List tracked generation/run costs

Full reference: docs/CLI.md


EvalSummary

summary = run_evals("evals.yml", functions={...})

summary.all_passed       # bool
summary.success_count    # int
summary.failed_count     # int
summary.total_count      # int
summary.coverage         # float (0.0-1.0)
summary.failed_results   # list[EvalResult]

summary.meets_coverage(0.9)    # Check threshold
summary.print()                # Rich formatted output
summary.to_json()              # Export as dict
summary.xml()                  # Export as XML
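Assuming coverage is the passing fraction of cases (which is what --cov gates against in CI mode), the fields relate roughly as follows; this is an illustrative sketch, not vowel's code.

```python
# Assumed relationship between the summary fields listed above.
success_count, total_count = 9, 10
coverage = success_count / total_count  # 0.0-1.0

def meets_coverage(threshold: float) -> bool:
    return coverage >= threshold

assert coverage == 0.9
assert meets_coverage(0.9) and not meets_coverage(0.95)
```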

Documentation

Document         Description
YAML Spec        Complete YAML format reference
Evaluators       All 8 evaluator types
Fixtures         Dependency injection guide
Serializers      Input serializer patterns
AI Generation    EvalGenerator & TDDGenerator
CLI              Command-line reference
MCP Server       AI assistant integration
Troubleshooting  Common errors & solutions

License

Apache License 2.0
