Python SDK for authoring Mira eval studies (protocol over stdio, no Rust dependency).

mira-eval — Python SDK

Author a Mira eval study in Python and run it with the mira host CLI.

This is not a binding to the Rust core — it's a native, pure-stdlib library that speaks the Mira eval protocol (newline-delimited JSON over stdio). The host owns selection, the model matrix, concurrency, checkpoints, and reporting; the study owns subjects and scoring. Any language that speaks the protocol is a first-class study — this SDK just makes the Python side ergonomic.

The protocol layer is generated from the canonical artifacts under schema/v1/ — the same language-neutral contract the Rust host is generated from — so it never drifts from the wire format: mira/_wire.py (wire types, from schema.json) and mira/_meta.py (protocol version, methods, capability tokens, from meta.json).
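To make "newline-delimited JSON over stdio" concrete, here is a minimal sketch of the framing only: one JSON object per line, read from an input stream and answered on an output stream. The envelope fields (id, method, params, result, error) and the reply payload are assumptions for illustration; the real message shapes are defined by schema/v1/ and handled by study.serve().

```python
import io
import json


def read_messages(stream):
    """Yield one decoded JSON object per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def write_message(stream, message):
    """Write one JSON object as a single line, then flush."""
    # json.dumps escapes newlines inside strings, so a message never spans lines.
    stream.write(json.dumps(message, separators=(",", ":")) + "\n")
    stream.flush()


def serve(inp, out, handlers):
    """Dispatch each request line to a handler and write one reply line."""
    for request in read_messages(inp):
        # The id/method/params envelope is hypothetical, not the Mira wire format.
        handler = handlers.get(request.get("method"))
        if handler is None:
            reply = {"id": request.get("id"), "error": "unknown method"}
        else:
            result = handler(request.get("params", {}))
            reply = {"id": request.get("id"), "result": result}
        write_message(out, reply)


# In-memory streams stand in for sys.stdin / sys.stdout.
inp = io.StringIO(
    '{"id": 1, "method": "list", "params": {}}\n'
    '{"id": 2, "method": "nope"}\n'
)
out = io.StringIO()
serve(inp, out, {"list": lambda params: {"evals": ["greet"]}})
replies = [json.loads(line) for line in out.getvalue().splitlines()]
```

Because each message is exactly one line, a host in any language can drive a study with nothing more than a line reader and a JSON parser.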

Use

import mira

study = mira.Study("my-evals", version="0.1.0")

@study.eval(
    samples=[mira.Sample("hi", prompt="Say hi and the answer to life.", tags=["smoke"])],
    targets=[mira.target("sim")],
    scorers=[mira.succeeded(), mira.contains("42")],
)
def greet(sample, cx):
    # A real subject calls a model; route on cx.target / cx.provider.
    return mira.transcript(
        f"Hi! The answer is 42. ({sample.text})",
        usage=mira.Usage(input_tokens=40, output_tokens=8),
    )

if __name__ == "__main__":
    study.serve()

Drive it with the host:

mira --cmd "python3 study.py" list
mira --cmd "python3 study.py" run
# run-now, score-later (split execute/score path):
mira --cmd "python3 study.py" run --execute-only --artifacts art/
mira --cmd "python3 study.py" score --artifacts art/

A complete, runnable example lives in examples/greet-python.
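The split execute/score path above separates the expensive step (calling the subject) from the cheap one (scoring). The sketch below illustrates that idea in plain Python. The one-JSON-file-per-sample layout and the execute/score helper names are assumptions made for this example, not the host's actual artifact format.

```python
import json
import tempfile
from pathlib import Path


def execute(samples, subject, artifacts_dir):
    """Run the subject once per sample and persist each raw response."""
    artifacts_dir.mkdir(parents=True, exist_ok=True)
    for sample_id, prompt in samples.items():
        record = {"sample": sample_id, "final_response": subject(prompt)}
        (artifacts_dir / f"{sample_id}.json").write_text(json.dumps(record))


def score(artifacts_dir, scorers):
    """Score stored responses without calling the subject again."""
    results = {}
    for path in sorted(artifacts_dir.glob("*.json")):
        record = json.loads(path.read_text())
        results[record["sample"]] = {
            name: check(record["final_response"])
            for name, check in scorers.items()
        }
    return results


calls = []


def fake_subject(prompt):
    """Stand-in for a model call; records how often it is invoked."""
    calls.append(prompt)
    return "Hi! The answer is 42."


with tempfile.TemporaryDirectory() as tmp:
    art = Path(tmp) / "art"
    execute({"hi": "Say hi and the answer to life."}, fake_subject, art)
    # Scoring can be repeated with different scorers at no model cost.
    results = score(art, {"contains_42": lambda text: "42" in text})
    rescored = score(art, {"contains_41": lambda text: "41" in text})
```

The subject runs once, while scoring runs twice against the stored artifacts, which is the main reason to keep the two steps apart.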

API

  • Study(name, version=None, page_size=500) — the registry; @study.eval(...) registers a subject fn(sample, cx) -> Transcript; study.serve() runs the stdio loop (handling initialize/list/list_samples/run/execute/score). page_size paginates large datasets across list + list_samples (0 disables).
  • Sample(id, prompt=…|input=[…], tags=…, expected=…, files=…, metadata=…); sample.text joins the input turns for the subject.
  • target(label, provider="", available=True) — a matrix cell (the model or harness under evaluation). An unavailable target is reported as N/A (infra), not a failure.
  • RunCx: cx.target, cx.provider, cx.max_turns, cx.param(name).
  • transcript(final_response, usage=…, timing=…, iterations=…, …) and the Usage/Timing types.
  • Scorers: succeeded(), contains(text), equals(text), regex(pattern), and scorer(name, fn) for an arbitrary predicate (return a bool or a fully-formed Score, including na=True).
  • axis(name, values) — an extra matrix axis (crossed with the model matrix).
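As an illustration of what "crossed with the model matrix" means for axis(name, values), the sketch below expands targets and extra axes into a Cartesian product of cells. The host owns the real matrix; expand_matrix, the "other" target, and the "temperature" axis are hypothetical, and the host's actual cell ordering may differ.

```python
from itertools import product


def expand_matrix(targets, axes):
    """Return one dict per cell: every target crossed with every axis value."""
    names = list(axes)
    cells = []
    for combo in product(targets, *(axes[name] for name in names)):
        target, values = combo[0], combo[1:]
        cells.append({"target": target, **dict(zip(names, values))})
    return cells


# Two targets crossed with one two-valued axis gives four cells.
cells = expand_matrix(["sim", "other"], {"temperature": [0.0, 0.7]})

# With no extra axes, the matrix is just the list of targets.
plain = expand_matrix(["sim"], {})
```

Each added axis multiplies the number of cells, so a subject would read the chosen value for its cell, presumably through cx.param(name).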

Develop

python3 codegen.py            # regenerate mira/_wire.py + mira/_meta.py from schema/v1/
python3 codegen.py --check    # fail if either is stale (CI drift guard)
pip install -e .[dev]
python3 -m pytest             # conformance + metadata-coverage + serve-loop tests

The runtime has zero dependencies; jsonschema and pytest are dev-only.
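The codegen.py --check drift guard can be pictured as "regenerate in memory and compare with what is on disk". The sketch below shows that pattern under stated assumptions: render and is_stale are hypothetical helpers and the metadata keys are invented for the example. It is not the project's codegen.py.

```python
import tempfile
from pathlib import Path


def render(meta):
    """Hypothetical generator: turn metadata into module source text."""
    lines = ["# Generated file. Do not edit."]
    lines.append(f"PROTOCOL_VERSION = {meta['version']!r}")
    lines.append(f"METHODS = {tuple(meta['methods'])!r}")
    return "\n".join(lines) + "\n"


def is_stale(meta, generated_path):
    """True if the file on disk differs from what the generator emits now."""
    if not generated_path.exists():
        return True
    return generated_path.read_text() != render(meta)


meta = {"version": "1", "methods": ["initialize", "list"]}
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "_meta.py"
    missing = is_stale(meta, target)    # nothing generated yet
    target.write_text(render(meta))
    fresh = is_stale(meta, target)      # just regenerated
    meta["methods"].append("run")       # the schema changes...
    drifted = is_stale(meta, target)    # ...and the file is now out of date
```

In CI, a check of this kind would exit non-zero when the comparison fails, so a schema change cannot land without regenerated code.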
