
Python SDK for authoring Mira eval studies (protocol over stdio, no Rust dependency).

mira-eval — Python SDK

Author a Mira eval study in Python and run it with the mira host CLI.

This is not a binding to the Rust core — it's a native, pure-stdlib library that speaks the Mira eval protocol (newline-delimited JSON over stdio). The host owns selection, the model matrix, concurrency, saved runs, and reporting; the study owns subjects and scoring. Any language that speaks the protocol is a first-class study — this SDK just makes the Python side ergonomic.
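To make "newline-delimited JSON over stdio" concrete, here is a minimal sketch of a study-side loop that reads one JSON message per line and writes one reply per line. The envelope fields (id, method, params, result, error) and the handler table are assumptions for illustration only; the real message shapes are defined by the artifacts under schema/v1/, and this is not the SDK's implementation.

```python
import io
import json


def serve(handlers, stdin, stdout):
    """Read one JSON request per line, dispatch by method, write one JSON reply per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between messages
        request = json.loads(line)
        handler = handlers.get(request.get("method"))
        if handler is None:
            reply = {"id": request.get("id"), "error": "unknown method"}
        else:
            reply = {"id": request.get("id"), "result": handler(request.get("params", {}))}
        stdout.write(json.dumps(reply) + "\n")
        stdout.flush()  # the peer is blocked waiting for this line


# Hypothetical handler table; the envelope above is an assumed shape, not the Mira wire format.
handlers = {
    "list": lambda params: {"evals": ["greet"]},
}

# In-memory streams stand in for sys.stdin / sys.stdout so the sketch runs anywhere.
requests = io.StringIO('{"id": 1, "method": "list"}\n{"id": 2, "method": "nope"}\n')
replies = io.StringIO()
serve(handlers, requests, replies)
lines = [json.loads(text) for text in replies.getvalue().splitlines()]
```

Because each message is a single line, either side can be written in any language that can read lines and parse JSON, which is what lets a non-Python study talk to the same host.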

The protocol layer is generated from the canonical artifacts under schema/v1/ — the same language-neutral contract the Rust host is generated from — so it never drifts from the wire format: mira/_wire.py (wire types, from schema.json) and mira/_meta.py (protocol version, methods, capability tokens, from meta.json).

Use

import mira

study = mira.Study("my-evals", version="0.1.0")

@study.eval(
    samples=[mira.Sample("hi", prompt="Say hi and the answer to life.", tags=["smoke"])],
    targets=[mira.target("sim")],
    scorers=[mira.succeeded(), mira.contains("42")],
)
def greet(sample, cx):
    # A real subject calls a model; route on cx.target / cx.provider.
    return mira.transcript(
        f"Hi! The answer is 42. ({sample.text})",
        usage=mira.Usage(input_tokens=40, output_tokens=8),
    )

if __name__ == "__main__":
    study.serve()

Drive it with the host:

mira --python3 study.py list
mira --python3 study.py run
# run-now, score-later (split execute/score path):
mira --python3 study.py run --execute-only --artifacts art/
mira --python3 study.py score --artifacts art/

--python3 is a convenience launcher; --python and --uv (uv run …) are the equivalent launchers for those commands, and --cmd "python3 study.py" still works for an arbitrary command line.
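The run-now, score-later split in the commands above can be pictured as two passes over a directory of saved outputs. The sketch below is a stdlib-only illustration of that idea; the artifact layout (one JSON file per sample with sample and final_response keys) and the function names are hypothetical, not the format the mira host actually writes.

```python
import json
import tempfile
from pathlib import Path


def execute(samples, subject, artifacts_dir):
    """Run the subject once per sample and persist each response as a JSON file."""
    out = Path(artifacts_dir)
    out.mkdir(parents=True, exist_ok=True)
    for sample_id, prompt in samples.items():
        record = {"sample": sample_id, "final_response": subject(prompt)}
        (out / f"{sample_id}.json").write_text(json.dumps(record))


def score(artifacts_dir, predicate):
    """Re-read saved responses and score them without calling the subject again."""
    results = {}
    for path in sorted(Path(artifacts_dir).glob("*.json")):
        record = json.loads(path.read_text())
        results[record["sample"]] = predicate(record["final_response"])
    return results


calls = []


def fake_subject(prompt):
    calls.append(prompt)  # stands in for an expensive model call
    return "Hi! The answer is 42." if "life" in prompt else "Hi!"


with tempfile.TemporaryDirectory() as art:
    execute({"hi": "Say hi and the answer to life.", "bare": "Say hi."}, fake_subject, art)
    scores = score(art, lambda text: "42" in text)             # first scoring pass
    rescored = score(art, lambda text: text.startswith("Hi"))  # new scorer, no new model calls
```

The point of the split is that scoring can be repeated or changed without paying for the subject again: here the subject runs twice in total, however many times score is called.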

A complete, runnable example lives in examples/greet-python.

API

  • Study(name, version=None, page_size=500) — the registry; @study.eval(...) registers a subject fn(sample, cx) -> Transcript; study.serve() runs the stdio loop (handling initialize/list/list_samples/run/execute/score). page_size paginates large datasets across list + list_samples (0 disables).
  • Sample(id, prompt=…|input=[…], tags=…, expected=…, files=…, metadata=…): sample.text joins the input turns for the subject.
  • target(label, provider="", available=True) — a matrix case (the model or harness under evaluation). An unavailable target is reported as N/A (infra), not a failure.
  • RunCx: cx.target, cx.provider, cx.max_turns, cx.param(name).
  • transcript(final_response, usage=…, timing=…, iterations=…, …) and the Usage/Timing types.
  • Scorers: succeeded(), contains(text), equals(text), regex(pattern), and scorer(name, fn) for an arbitrary predicate (return a bool or a fully-formed Score, including na=True).
  • axis(name, values) — an extra matrix axis (crossed with the model matrix).
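As a rough picture of how targets and extra axes combine, the sketch below crosses samples, targets, and axis values with itertools.product and marks cases for unavailable targets as N/A rather than failed. In the real system the host owns the matrix; expand_matrix, the tuple shape for targets, and the status strings are hypothetical names chosen for this illustration.

```python
from itertools import product


def expand_matrix(samples, targets, axes):
    """Yield one case per (sample, target, axis-value combination)."""
    axis_names = list(axes)
    for sample, (label, available) in product(samples, targets):
        # With no extra axes, product() yields a single empty combination.
        for combo in product(*(axes[name] for name in axis_names)):
            yield {
                "sample": sample,
                "target": label,
                "params": dict(zip(axis_names, combo)),
                # An unavailable target is an infra gap, not a failed eval.
                "status": "pending" if available else "n/a",
            }


cases = list(expand_matrix(
    samples=["hi"],
    targets=[("sim", True), ("offline-model", False)],  # (label, available)
    axes={"temperature": [0.0, 0.7]},
))
summary = [(c["target"], c["params"]["temperature"], c["status"]) for c in cases]
```

One sample, two targets, and one axis with two values give four cases; adding a second axis multiplies the count again, which is worth keeping in mind when sizing a study.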

Develop

python3 codegen.py            # regenerate mira/_wire.py + mira/_meta.py from schema/v1/
python3 codegen.py --check    # fail if either is stale (CI drift guard)
pip install -e ".[dev]"       # quoted so shells like zsh do not expand the brackets
python3 -m pytest             # conformance + metadata-coverage + serve-loop tests

The runtime has zero dependencies; jsonschema and pytest are dev-only.
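A drift guard of the kind codegen.py --check describes usually comes down to rendering the output again and comparing it with what is on disk. The following is a minimal sketch of that pattern under stated assumptions: render, is_stale, and the metadata keys are hypothetical and do not reflect what codegen.py actually emits.

```python
import json
import tempfile
from pathlib import Path


def render(meta):
    """Render a Python module from a metadata dict, deterministically."""
    lines = ["# Generated file; do not edit by hand."]
    for key in sorted(meta):  # sorted keys keep the output stable across runs
        lines.append(f"{key.upper()} = {meta[key]!r}")
    return "\n".join(lines) + "\n"


def is_stale(meta_path, generated_path):
    """Return True when the generated file is missing or differs from a fresh render."""
    expected = render(json.loads(Path(meta_path).read_text()))
    target = Path(generated_path)
    return not target.exists() or target.read_text() != expected


with tempfile.TemporaryDirectory() as tmp:
    meta_path = Path(tmp) / "meta.json"
    out_path = Path(tmp) / "_meta.py"
    # Illustrative metadata only; the real meta.json contents are not shown in this document.
    meta_path.write_text(json.dumps({"protocol_version": 1, "methods": ["list", "run"]}))
    stale_before = is_stale(meta_path, out_path)   # nothing generated yet
    out_path.write_text(render(json.loads(meta_path.read_text())))
    stale_after = is_stale(meta_path, out_path)    # freshly generated
    meta_path.write_text(json.dumps({"protocol_version": 2, "methods": ["list", "run"]}))
    stale_changed = is_stale(meta_path, out_path)  # source changed, generated file did not
```

In CI, a check like this would exit non-zero when is_stale returns True, so a schema change cannot land without the regenerated files.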
