VOWEL
A modular, YAML-based evaluation framework for testing Python functions, with AI-powered test generation, function healing, and a TDD workflow.
vowel makes it easy to define test cases in YAML and run them against your Python functions. It also provides AI-powered generators that can automatically create test specs, generate implementations, and fix buggy functions.
Installation
pip install vowel
# Or with uv
uv add vowel
Development
git clone https://github.com/fswair/vowel.git
cd vowel
pip install -e ".[all]"
Quick Start
Note:
For a deeper understanding of how vowel handles fixtures, see the examples in db_fixture.yml and db.py. These files demonstrate the underlying mechanics of fixture setup and usage.
Tip:
To enable YAML schema validation in your editor, place vowel-schema.json in your project directory. Then add the following directive at the top of your YAML file to activate schema support:
# yaml-language-server: $schema=<path/to/vowel-schema.json>
Replace <path/to/vowel-schema.json> with the actual path to your schema file.
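For example, a spec file with the directive on its first line (the relative path here is an assumption; use wherever you placed the schema file):

```yaml
# yaml-language-server: $schema=./vowel-schema.json
add:
  dataset:
    - case:
        inputs: { x: 1, y: 2 }
        expected: 3
```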
1. Create a YAML spec
# evals.yml
add:
dataset:
- case:
inputs: { x: 1, y: 2 }
expected: 3
- case:
inputs: { x: -5, y: 5 }
expected: 0
divide:
evals:
Type:
type: "float"
dataset:
- case:
inputs: { a: 10, b: 2 }
expected: 5.0
- case:
inputs: { a: 1, b: 0 }
raises: ZeroDivisionError
2. Run from CLI
vowel evals.yml
3. Or programmatically
from vowel import run_evals
def add(x: int, y: int) -> int:
return x + y
def divide(a: float, b: float) -> float:
return a / b
summary = run_evals("evals.yml", functions={"add": add, "divide": divide})
print(f"All passed: {summary.all_passed}")
print(f"Coverage: {summary.coverage * 100:.1f}%")
4. Or use the fluent API
from vowel import RunEvals
summary = (
RunEvals.from_file("evals.yml")
.with_functions({"add": add, "divide": divide})
.filter(["add"])
.debug()
.run()
)
summary.print()
Features
Evaluators
8 built-in evaluators for flexible testing:
| Evaluator | Purpose |
|---|---|
| Expected | Exact value matching |
| Type | Return type checking (strict/lenient) |
| Assertion | Custom Python expressions (output > 0, output == input * 2) |
| Duration | Performance constraints (function-level & case-level) |
| Pattern | Regex validation on output |
| ContainsInput | Verify output contains the input |
| Raises | Exception class + optional message matching |
| LLMJudge | AI-powered rubric evaluation |
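The Assertion evaluator accepts Python expressions such as `output > 0` or `output == input * 2`. A rough sketch of the assumed semantics (not vowel's actual implementation): the expression is evaluated with the function's output, and optionally its input, bound in scope.

```python
# Illustrative only: evaluate an assertion string with `output`
# and `input` available as names, as the table above suggests.
def passes(assertion: str, output, input=None) -> bool:
    return bool(eval(assertion, {"output": output, "input": input}))

print(passes("output > 0", 120))            # True
print(passes("output == input * 2", 4, 2))  # True
```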
factorial:
evals:
Assertion:
assertion: "output > 0"
Type:
type: "int"
Duration:
duration: 1.0
dataset:
- case: { input: 0, expected: 1 }
- case: { input: 5, expected: 120 }
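An implementation that would satisfy the spec above (a plain-Python sketch, not part of vowel itself): it returns a positive int, so the Assertion and Type evaluators pass, and both dataset cases match.

```python
def factorial(n: int) -> int:
    # Iterative factorial: factorial(0) == 1, factorial(5) == 120
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result
```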
Full reference: docs/EVALUATORS.md
Fixtures (Dependency Injection)
Inject databases, temp files, and caches into functions under test. Three patterns are supported: generator (yield), tuple (setup/teardown), and simple (setup only).
fixtures:
db:
setup: myapp.setup_db
teardown: myapp.close_db
scope: module
query_user:
fixture: [db]
dataset:
- case:
inputs: { user_id: 1 }
expected: { name: "Alice" }
def query_user(user_id: int, *, db: dict) -> dict | None:
return db["users"].get(user_id)
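The generator (yield) pattern from the list above can be sketched in plain Python: code before the `yield` is setup, code after it is teardown. The in-memory dict standing in for a database here is an assumption for illustration.

```python
def db():
    conn = {"users": {1: {"name": "Alice"}}}  # setup
    try:
        yield conn                            # fixture value injected here
    finally:
        conn.clear()                          # teardown

gen = db()
database = next(gen)          # setup runs
print(database["users"][1])   # {'name': 'Alice'}
gen.close()                   # teardown runs
```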
Full reference: docs/FIXTURES.md
Input Serializers
Transform YAML inputs into Pydantic models, dates, or custom types:
summary = (
RunEvals.from_file("evals.yml")
.with_functions({"get_user": get_user})
.with_serializer({"get_user": User}) # Schema mode
.run()
)
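In schema mode, the YAML `inputs` mapping is converted into a typed object before being passed to the function. A minimal sketch of that idea, using a stand-in `User` dataclass in place of the Pydantic model the snippet assumes:

```python
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str

def get_user(user: User) -> str:
    return user.name

# What a serializer conceptually does: raw YAML mapping -> typed object
payload = {"id": 1, "name": "Alice"}
print(get_user(User(**payload)))  # Alice
```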
Full reference: docs/SERIALIZERS.md
AI-Powered Generation
EvalGenerator — test existing functions
from vowel import EvalGenerator, Function
generator = EvalGenerator(model="openai:gpt-4o", load_env=True)
func = Function.from_callable(my_function)
result = generator.generate_and_run(func, auto_retry=True, heal_function=True)
print(f"Coverage: {result.summary.coverage * 100:.1f}%")
TDDGenerator — generate everything from a description
from vowel.tdd import TDDGenerator
generator = TDDGenerator(model="gemini-3-flash-preview", load_env=True)
result = generator.generate_all(
description="Binary search for target in sorted list. Returns index or -1.",
name="binary_search"
)
result.print() # Shows: signature → tests → code → results
Step-by-step control:
description = "Calculate factorial of a non-negative integer"
signature = generator.generate_signature(description=description, name="factorial")
runner, yaml_spec = generator.generate_evals_from_signature(signature, description=description)
func = generator.generate_implementation(signature, yaml_spec, description=description)
summary = runner.with_functions({"factorial": func.impl}).run()
Full reference: docs/AI_GENERATION.md
MCP Server
Expose vowel's capabilities to AI assistants like Claude Desktop via Model Context Protocol.
Setup guide: docs/MCP.md
CLI
vowel evals.yml # Run single file
vowel -d ./tests # Run directory
vowel evals.yml -f add,divide # Filter functions
vowel evals.yml --ci --cov 90 # CI mode
vowel evals.yml --watch # Watch mode
vowel evals.yml --dry-run # Show plan without running
vowel evals.yml --export-json out.json # Export results
vowel evals.yml -v # Verbose summary
vowel evals.yml -v --hide-report # Verbose, hide pydantic_evals report
Full reference: docs/CLI.md
EvalSummary
summary = run_evals("evals.yml", functions={...})
summary.all_passed # bool
summary.success_count # int
summary.failed_count # int
summary.total_count # int
summary.coverage # float (0.0-1.0)
summary.failed_results # list[EvalResult]
summary.meets_coverage(0.9) # Check threshold
summary.print() # Rich formatted output
summary.to_json() # Export as dict
summary.xml() # Export as XML
Documentation
| Document | Description |
|---|---|
| YAML Spec | Complete YAML format reference |
| Evaluators | All 8 evaluator types |
| Fixtures | Dependency injection guide |
| Serializers | Input serializer patterns |
| AI Generation | EvalGenerator & TDDGenerator |
| CLI | Command-line reference |
| MCP Server | AI assistant integration |
| Troubleshooting | Common errors & solutions |
License
Apache License 2.0