evaluateur

synthetic evals for agents

Project description

Evaluateur

Synthetic evaluation helper for LLM applications, built around the dimensions → tuples → queries flow described in Hamel Husain's FAQ.
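To make the dimensions → tuples → queries flow concrete, here is a minimal sketch in plain Python that does not call evaluateur or any LLM. The dimension values, the `build_tuples` and `render_query` helpers, and the query template are all hypothetical. In the library, options and queries come from a model rather than from hard-coded lists and templates.

```python
import itertools
import random

# Hypothetical dimensions and options (illustrative values only).
dimensions = {
    "payer": ["Cigna", "Aetna"],
    "age": ["adult", "pediatric"],
    "complexity": ["off-label", "comorbidities"],
}


def build_tuples(dims, count, seed=0):
    """Take the cross product of all options, then sample up to `count` tuples."""
    names = list(dims)
    combos = [dict(zip(names, values)) for values in itertools.product(*dims.values())]
    rng = random.Random(seed)
    return rng.sample(combos, min(count, len(combos)))


def render_query(tup):
    """Turn one tuple into a natural-language query using a fixed template."""
    return (
        f"Does {tup['payer']} require prior authorization for a patient in the "
        f"'{tup['age']}' age category when the case involves {tup['complexity']}?"
    )


tuples = build_tuples(dimensions, count=5)
queries = [render_query(t) for t in tuples]
for t, q in zip(tuples, queries):
    print(t, "->", q)
```

Sampling from the full cross product keeps the number of queries bounded even when the dimensions multiply out to many combinations.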

Installation

The project is packaged as a normal Python library. With uv:

uv add evaluateur

Basic usage

Define a Pydantic model that represents the dimensions of your evaluation space, then use the Evaluator to generate options and queries:

from pydantic import BaseModel, Field

from evaluateur import Evaluator, QueryMode, TupleStrategy


class Query(BaseModel):
    payer: str = Field(..., description="insurance payer, like Cigna")
    age: str = Field(..., description="patient age category, like 'adult' or 'pediatric'")
    complexity: str = Field(
        ...,
        description="complexity of the query to account for the edge cases, like 'off-label', 'comorbidities', etc",
    )
    geography: str = Field(..., description="geography indicator, like a zip code, specific state or county")


evaluator = Evaluator(Query, context="Healthcare prior authorization")

# Step 1: generate options for each dimension using Instructor
options = evaluator.generate_options(
    instructions="Focus on common US payers and edge-case clinical scenarios.",
)

# Step 2: turn options into tuples and natural language queries
output = evaluator.generate_queries(
    options=options,
    mode=QueryMode.HYBRID,
    tuple_strategy=TupleStrategy.CROSS_PRODUCT,
    tuple_count=50,
)

for q in output.queries:
    print(q.source_tuple.values, "->", q.query)

The evaluator uses environment variables (for example OPENAI_API_KEY) and supports any provider that instructor supports. You can customise the provider and model via the LLMClient helper if needed.
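Because credentials are read from the environment, a small pre-flight check can catch a missing key before any generation call is made. The sketch below uses only the standard library; `missing_env_vars` is a hypothetical helper, not part of evaluateur, and the variable name is the example given above.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# OPENAI_API_KEY is the example named above; other providers use other variables.
missing = missing_env_vars(["OPENAI_API_KEY"])
if missing:
    print("Set these variables before running the evaluator:", ", ".join(missing))
```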

If your input model already uses list-typed fields (for example payer: list[str] = ["Cigna", "Aetna"]), those lists are treated as fixed options and are not modified by generate_options(). Scalar fields of any basic type (str, int, float, and so on) are turned into lists of options automatically.
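The distinction between fixed and generated fields can be illustrated with a short sketch. This is not the library's implementation: it uses a plain annotated class rather than a Pydantic model so that it runs with the standard library alone, and `split_fields` is a hypothetical helper.

```python
from typing import get_origin, get_type_hints


class Query:
    # Hypothetical model: one list field with fixed options, two scalar fields.
    payer: list[str] = ["Cigna", "Aetna"]
    age: str
    visits: int


def split_fields(model):
    """Separate list-typed fields (kept as-is) from scalar fields (to be expanded)."""
    fixed, to_generate = {}, []
    for name, annotation in get_type_hints(model).items():
        if get_origin(annotation) is list:
            fixed[name] = getattr(model, name, [])
        else:
            to_generate.append(name)
    return fixed, to_generate


fixed, to_generate = split_fields(Query)
print("fixed options:", fixed)
print("fields to generate options for:", to_generate)
```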

Project details

These details have been verified by PyPI

Maintainers

aptlin

Release history

0.3.0 (Feb 16, 2026)
0.2.0 (Jan 9, 2026)
0.1.0 (Dec 10, 2025, this version)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

evaluateur-0.1.0.tar.gz (11.6 kB)

Uploaded Dec 10, 2025 Source

Built Distribution

evaluateur-0.1.0-py3-none-any.whl (16.5 kB)

Uploaded Dec 10, 2025 Python 3

File details

Details for the file evaluateur-0.1.0.tar.gz.

File metadata

Download URL: evaluateur-0.1.0.tar.gz
Upload date: Dec 10, 2025
Size: 11.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.9.17 (uv publish, Ubuntu 24.04 noble, CI)

File hashes

Hashes for evaluateur-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`6bbbf4cf94fa322376379713d48e43708c6bf7af64ebb538b25f9cb694f4fd9b`
MD5	`5029b4719d49a2eb1ab1efdaa283e706`
BLAKE2b-256	`67bc6ab7ecf06f62fce57b19913d24fdf7f7d5546613de74418b028439d85319`
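A downloaded archive can be checked against the SHA256 digest listed above using Python's standard hashlib module. This is a minimal sketch; the local file path is an assumption about where the download was saved, and `sha256_of` is a helper written for this example.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for evaluateur-0.1.0.tar.gz.
EXPECTED_SHA256 = "6bbbf4cf94fa322376379713d48e43708c6bf7af64ebb538b25f9cb694f4fd9b"


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


archive = Path("evaluateur-0.1.0.tar.gz")  # assumed local download location
if archive.exists():
    print("match" if sha256_of(archive) == EXPECTED_SHA256 else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```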

File details

Details for the file evaluateur-0.1.0-py3-none-any.whl.

File metadata

Download URL: evaluateur-0.1.0-py3-none-any.whl
Upload date: Dec 10, 2025
Size: 16.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.9.17 (uv publish, Ubuntu 24.04 noble, CI)

File hashes

Hashes for evaluateur-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`19257563265c137c4a55026af4dc516c1c9eaf6b8f7fd40e98d3aaf941c5de20`
MD5	`e128132fdc38c41a07de0c850bb93f7d`
BLAKE2b-256	`968448c67e3cb3a1e211f47a3170fc5087547ea622b16f0a089a37229ffae649`

