Inephany client library to use Metrana.

Metrana Client Library

Metrana is a metrics tracking client for ML/RL training runs. It provides a simple API to log metrics from training loops to the Metrana ingestion service, with asynchronous batching, configurable backpressure handling, and automatic retry on failure.

Installation

pip install metrana

The metrana-protobuf dependency is pulled in automatically.

To use metrana.log_rendering() for logging RL environment video, install the optional rendering extra (pulls in PyAV, used for client-side H.264 encoding):

pip install 'metrana[rendering]'

Quick Start

import metrana

metrana.init(
    api_key="your-api-key",
    workspace_name="my-workspace",
    project_name="my-project",
    run_name="run-001",
)

for step in range(1000):
    loss, accuracy = train_step()  # train_step() is a placeholder for your own training code
    metrana.log("loss", loss)
    metrana.log("accuracy", accuracy)

metrana.close()

The API key can also be provided via the METRANA_API_KEY environment variable, in which case api_key can be omitted from init().
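
The fallback described above can be pictured with the following minimal sketch. It is not the library's implementation: resolve_api_key is a hypothetical helper, and giving the explicit argument priority over the environment variable is an assumption.

```python
import os


def resolve_api_key(api_key=None, environ=None):
    """Return the explicit key if given, else METRANA_API_KEY from the environment.

    Hypothetical helper for illustration only; it is not part of metrana.
    """
    environ = os.environ if environ is None else environ
    # Assumption: an explicit argument takes priority over the environment.
    key = api_key or environ.get("METRANA_API_KEY")
    if not key:
        raise ValueError("No API key: pass api_key or set METRANA_API_KEY")
    return key
```

In practice this means a training script can leave api_key out of init() entirely when the variable is exported in the shell or job definition.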

API Reference

`metrana.init()`

Initialises the logger. Must be called once before log() or close().

metrana.init(
    api_key: str,
    workspace_name: str,
    project_name: str,
    run_name: str,
    experiment_name: str | None = None,

    # Behavioural strategies (can also be set via environment variables)
    resume_strategy: str | None = None,       # "Never" (default) | "Allow"
    orchestration_id: str | None = None,      # Shared job token for distributed runs (see Resume strategy)
    backpressure_strategy: str | None = None, # "DropNew" | "Block" | "Raise"
    error_strategy: str | None = None,        # "Silent" | "Warn" | "RaiseOnLog" | "RaiseOnClose"
    close_strategy: str | None = None,        # "Immediate" | "CompletePending" | "CompleteAll"
    log_level: str | None = None,             # "Trace" | "Debug" | "Info" | "Success" | "Warn" | "Error" | "Critical" | "Off"

    # Aggregation rules - NOTE: this is disabled at this time.
    aggregation_rules: list[AggregationRule] | None = None,

    # Run config — logged as queryable run attributes
    config: dict | None = None,

    # Advanced
    num_dispatch_workers: int = 4,
    ingestion_url: str | None = None,         # Overrides the default API endpoint

    # Rendering (see Environment Renderings below)
    rendering_output_dir: str | Path | None = None, # Defaults to ~/.metrana/renderings
    rendering_fps: int = 30,
    rendering_max_concurrent_encoders: int = 1,
    rendering_queue_max_size: int | None = None,    # None / 0 = unbounded
)

`metrana.log()`

Logs a single metric value (or a dict of values). Thread-safe and non-blocking by default.

# Single metric
metrana.log("loss", 0.5)

# Multiple metrics at once
metrana.log({"loss": 0.5, "accuracy": 0.9})

Full signature:

metrana.log(
    metric_name: str | dict[str, float | int],
    value: float | int | None = None,     # Omit when metric_name is a dict
    scale: str | None = None,             # See Metric Scales below; defaults to "ML_STEP"
    step: int | None = None,              # Auto-increments per series — do not provide
    labels: dict[str, str] | None = None, # See Labels below
    evaluation: bool = False,             # Injects "evaluation": "true" label when True
    timestamp: int | None = None,         # Unix nanoseconds; defaults to now
)

step auto-increments per (metric_name, scale, labels) series. Do not provide it: manual step values are only appropriate for unordered series. An upcoming change will allow steps to be provided as long as they are monotonically increasing.

scale defaults to ML_STEP. For RL training, use the RL helpers described below, which log environment-level and episode-level series in the most efficient and scalable form.
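
To make the per-series counting concrete, here is a minimal sketch of one way such a counter could work. StepCounter is a hypothetical class, not the client's internals, and starting each series at step 0 is an assumption.

```python
from collections import defaultdict


class StepCounter:
    """Tracks the next step index for each (metric_name, scale, labels) series."""

    def __init__(self):
        self._next = defaultdict(int)

    def next_step(self, metric_name, scale="ML_STEP", labels=None):
        # Label order does not matter, so freeze the pairs into a hashable set.
        key = (metric_name, scale, frozenset((labels or {}).items()))
        step = self._next[key]
        self._next[key] += 1
        return step


counter = StepCounter()
first = counter.next_step("loss")
second = counter.next_step("loss")
# A different label set is a different series, so its count starts again.
other = counter.next_step("loss", labels={"environment_id": "env_0"})
```

The third call shows why changing a label value starts a fresh series: the key differs, so the counter does too.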

`metrana.close()`

Shuts down the logger. Behaviour depends on the configured close_strategy.

metrana.close()

RL Helpers

The following functions are convenience wrappers around log() that fix the scale and ensure the backend treats them appropriately.

`metrana.log_rl_step()`

Logs a per-gradient-update metric on the ML_STEP scale.

metrana.log_rl_step(
    metric_name: str,
    value: float | int,
    evaluation: bool = False,             # Injects "evaluation": "true" label when True
    step: int | None = None,              # Auto-increments per series — do not provide
    labels: dict[str, str] | None = None,
    timestamp: int | None = None,
)

`metrana.log_rl_episode()`

Logs a per-episode metric on the EPISODE scale. Automatically attaches rl_step and environment_id as labels so episode data can be correlated with training progress and individual environments.

metrana.log_rl_episode(
    metric_name: str,
    value: float | int,
    rl_step: int,                         # Current RL training step — required
    evaluation: bool = False,             # Injects "evaluation": "true" label when True
    episode: int | None = None,           # Auto-increments per series — do not provide
    env_id: str | None = None,            # Environment identifier
    labels: dict[str, str] | None = None,
    timestamp: int | None = None,
)

episode is used as the step index for this series. It auto-increments — do not provide it unless restoring from a checkpoint.

`metrana.log_rl_environment_step()`

Logs a per-environment-interaction metric on the ENVIRONMENT_STEP scale. Automatically attaches episode, rl_step, and environment_id as labels.

metrana.log_rl_environment_step(
    metric_name: str,
    value: float | int,
    rl_step: int,                         # Current RL training step — required
    evaluation: bool = False,             # Injects "evaluation": "true" label when True
    env_step: int | None = None,          # Auto-increments per series — do not provide
    episode: int | None = None,           # Episode index label
    env_id: str | None = None,            # Environment identifier
    labels: dict[str, str] | None = None,
    timestamp: int | None = None,
)

env_step is used as the step index for this series. It auto-increments — do not provide it unless restoring from a checkpoint.

Environment Renderings

metrana.log_rendering() accepts a single rendered frame from an RL environment and asynchronously encodes it to a per-episode H.264 .mp4 file on the local filesystem. Frames sharing the same (env_id, episode) are appended to the same file; when either changes, the in-flight encoder for that env_id is closed and a new one is opened.
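
The file rotation described above can be sketched as a small bookkeeping class. EncoderRouter is hypothetical and only tracks which episode is open for each env_id; it does no encoding and ignores the concurrent-encoder cap.

```python
class EncoderRouter:
    """Tracks which episode each env_id is currently encoding."""

    def __init__(self):
        self.current = {}   # env_id -> episode with an open encoder
        self.finished = []  # (env_id, episode) pairs whose file was closed

    def route(self, env_id, episode):
        """Return the (env_id, episode) key of the file this frame belongs to."""
        open_episode = self.current.get(env_id)
        if open_episode != episode:
            if open_episode is not None:
                # Episode changed: close the in-flight encoder for this env_id.
                self.finished.append((env_id, open_episode))
            self.current[env_id] = episode
        return (env_id, episode)


router = EncoderRouter()
router.route("env_0", 0)
router.route("env_0", 0)  # same file as the previous frame
router.route("env_0", 1)  # closes episode 0, opens episode 1
```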

Requires the rendering extra (pip install 'metrana[rendering]'); calling log_rendering() without it raises ImportError with a message pointing at the extra. Encoding runs on a dedicated background thread, so log_rendering() does not block the calling thread unless the rendering queue is full, in which case the configured backpressure_strategy applies.

metrana.log_rendering(
    frame: np.ndarray,
    rl_step: int,
    episode: int,
    env_id: str | None = None,
)

frame must be a uint8 numpy array of shape (H, W, 3) for RGB or (H, W) / (H, W, 1) for grayscale. H and W must both be even (libx264 yuv420p constraint). Frame size and colour mode are locked at the first frame of an episode and must remain consistent for the rest of that episode.
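
A pre-flight check along these lines can catch bad frames before they reach the encoder. This is a sketch under stated assumptions: validate_frame is a hypothetical helper, it takes the shape and dtype name rather than the array so it has no NumPy dependency, and the exception type the real client raises is not documented here.

```python
def validate_frame(shape, dtype_name, locked=None):
    """Check a frame against the constraints listed above.

    shape is e.g. frame.shape and dtype_name is e.g. str(frame.dtype).
    Returns a (height, width, mode) signature that can be passed back as
    `locked` for later frames of the same episode.
    """
    if dtype_name != "uint8":
        raise ValueError(f"frame must be uint8, got {dtype_name}")
    is_rgb = len(shape) == 3 and shape[2] == 3
    is_gray = len(shape) == 2 or (len(shape) == 3 and shape[2] == 1)
    if not (is_rgb or is_gray):
        raise ValueError(f"unsupported frame shape {shape}")
    height, width = shape[0], shape[1]
    if height % 2 or width % 2:
        raise ValueError("H and W must both be even (yuv420p constraint)")
    signature = (height, width, "rgb" if is_rgb else "gray")
    if locked is not None and signature != locked:
        raise ValueError("frame size or colour mode changed within an episode")
    return signature


locked = validate_frame((64, 96, 3), "uint8")
```

Calling it as validate_frame(frame.shape, str(frame.dtype), locked) for each subsequent frame mirrors the per-episode locking rule.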

import numpy as np
import metrana

metrana.init(
    workspace_name="my-workspace",
    project_name="my-project",
    run_name="run-001",
    rendering_fps=30,
)

# env, num_episodes, current_rl_step and action() are placeholders for your own code.
for episode in range(num_episodes):
    obs = env.reset()
    done = False
    while not done:
        frame = env.render()                 # uint8 (H, W, 3)
        metrana.log_rendering(
            frame=frame,
            rl_step=current_rl_step,
            episode=episode,
            env_id="env_0",
        )
        obs, _, done, _ = env.step(action(obs))

metrana.close()

Output files land at <rendering_output_dir>/<run_name>/<env_id>_<episode>.mp4 (default base: ~/.metrana/renderings). Episodes that produced no frames are deleted at close.
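
For locating a finished video from a script, the path pattern above can be reproduced as follows. rendering_path is a hypothetical helper; it assumes the episode number is written without zero-padding and that env_id is always set, neither of which is confirmed here.

```python
from pathlib import PurePosixPath


def rendering_path(output_dir, run_name, env_id, episode):
    """Build <rendering_output_dir>/<run_name>/<env_id>_<episode>.mp4."""
    return PurePosixPath(output_dir) / run_name / f"{env_id}_{episode}.mp4"


path = rendering_path("~/.metrana/renderings", "run-001", "env_0", 7)
```

The leading ~ is kept literal in this sketch; expand it with pathlib.Path.expanduser() before opening the file.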

Concurrent encoders. rendering_max_concurrent_encoders caps the number of encoders open at any one time (default 1). Frames for additional env_ids beyond the cap are dropped, block the caller, or raise an error, according to backpressure_strategy. Raise the cap with caution: encoding is CPU-bound, and PyAV/libx264 already use internal threads for the work.

FPS. rendering_fps is locked for the whole run and cannot be changed mid-run.

Naming rules

Metric names, label keys, and label values become queryable identifiers in the Metrana query language (QL), so they cannot contain characters the QL parser reserves for its own use. The client validates them synchronously at the log() call site and raises MetranaMetricNameError on a violation. The ingestion API enforces the same rules, so an invalid name that bypassed the client would be rejected server-side.

Metric names and label keys must:

- not be empty
- not start with :
- not contain whitespace or any control / non-printable character
- not contain any QL-reserved character: ( ) [ ] , = < > ! ~ " |

Label keys additionally cannot contain & or = (reserved by the k=v&k=v label storage encoding).

Label values are more permissive — whitespace and QL-reserved characters are allowed, since values are addressable through the QL quoted-string form. Label values must:

- not contain any control / non-printable character
- not contain & or =

The same rules apply to scale names and any other name-shaped identifier passed to log() and the RL helpers.
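
The rules above can be approximated locally, for example to sanitise names before a long run starts. The sketch below is not the client's validator: the function names are hypothetical, and using str.isprintable() to detect control / non-printable characters is an assumption about how those classes are defined.

```python
QL_RESERVED = set('()[],=<>!~"|')


def _has_non_printable(text):
    # str.isprintable() is False for control and other non-printable characters.
    return any(not ch.isprintable() for ch in text)


def check_name(name):
    """Return the rule violations for a metric name (empty list means valid)."""
    problems = []
    if not name:
        problems.append("empty")
    if name.startswith(":"):
        problems.append("starts with ':'")
    if any(ch.isspace() for ch in name) or _has_non_printable(name):
        problems.append("whitespace or control character")
    if QL_RESERVED & set(name):
        problems.append("QL-reserved character")
    return problems


def check_label_key(key):
    """Label keys follow the metric-name rules and also reject '&' and '='."""
    problems = check_name(key)
    if "&" in key or "=" in key:
        problems.append("contains '&' or '='")
    return problems


def check_label_value(value):
    """Label values allow spaces and QL-reserved characters."""
    problems = []
    if _has_non_printable(value):
        problems.append("control character")
    if "&" in value or "=" in value:
        problems.append("contains '&' or '='")
    return problems
```

Returning a list of problems rather than raising makes it easy to report every violation at once; the client itself raises on the first invalid name.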

Labels

Labels are key-value pairs that identify a series. Two calls with different label sets create two independent series. This is intentional for splitting data by environment, agent, or other dimension — but means that labels whose values change on every call will create a new series each time, which is almost never what you want.

Use labels to split data along dimensions you want to filter or aggregate over (e.g. environment_id). For indexing within a series, rely on the auto-incrementing step.

Metric Scales

Scales define the x-axis semantics of a series. The specialised RL helpers fix the scale automatically; only use scale on log() directly when the helpers do not apply.

Scale	Use when
`ML_STEP`	One entry per gradient update / training step (default)
`EPISODE`	One entry per RL episode
`ENVIRONMENT_STEP`	One entry per RL environment interaction

The scale can be passed as a string or via metrana.StandardMetricScale:

from metrana import StandardMetricScale
metrana.log("reward", reward, scale=StandardMetricScale.EPISODE)

Aggregation Rules

Aggregation rules tell the ingestion worker how to derive new series from existing ones. They are declared once at run creation and applied automatically as data arrives.

NOTE: Aggregation rules are currently disabled on the backend.

from metrana import AggregationRule, AggregationFn

metrana.init(
    ...,
    aggregation_rules=[
        # Mean and max reward collapsed across environments.
        # aggregate_over_labels=["environment_id"] strips environment_id from
        # the output, merging all per-environment series into one.
        AggregationRule(
            source_scale="EPISODE",
            output_scale="EPISODE",
            fns=[AggregationFn.AGGREGATION_FN_MEAN, AggregationFn.AGGREGATION_FN_MAX],
            aggregate_over_labels=["environment_id"],
            output_name_suffix="/across_envs",
        ),
        # Min and sum of a specific metric per episode
        AggregationRule(
            metric_name="reward",
            source_scale="EPISODE",
            output_scale="EPISODE",
            fns=[AggregationFn.AGGREGATION_FN_MIN, AggregationFn.AGGREGATION_FN_SUM],
            output_metric_name="reward/final",
        ),
    ],
)

Rule fields

Field	Type	Description
`metric_name`	`str | None`	Metric to apply the rule to. If absent, applies to every metric matching `source_scale` and `aggregate_over_labels`
`source_scale`	`str`	Scale of the source series (e.g. `"EPISODE"`, `"ENVIRONMENT_STEP"`)
`output_scale`	`str`	Scale of the derived output series
`fns`	`list[AggregationFn]`	Aggregation functions to apply. Each function produces a separate output series. At least one required.
`aggregate_over_labels`	`list[str]`	Labels to aggregate over and strip from the output. Series that share the same values for all other labels are merged together, and these labels disappear from the result. Empty list merges all matching series unconditionally.
`output_metric_name`	`str | None`	Output series name. Only valid when `metric_name` is set; defaults to `metric_name`
`output_name_suffix`	`str | None`	Suffix appended to each source metric name when `metric_name` is absent. Ignored when both `metric_name` and `output_metric_name` are set
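
The effect of aggregate_over_labels is easiest to see on a small example. The sketch below is a plain-Python illustration of the grouping described in the table, not the ingestion worker's code; the aggregate function and the assumption that all points belong to one metric at one step are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate(points, aggregate_over_labels, fn):
    """Group points by their remaining labels and reduce each group with fn.

    points: iterable of (labels_dict, value) pairs for one metric and step.
    """
    groups = defaultdict(list)
    for labels, value in points:
        if aggregate_over_labels:
            kept = {k: v for k, v in labels.items() if k not in aggregate_over_labels}
        else:
            kept = {}  # empty list: merge all matching series unconditionally
        groups[frozenset(kept.items())].append(value)
    return {key: fn(values) for key, values in groups.items()}


points = [
    ({"environment_id": "env_0", "agent": "a"}, 1.0),
    ({"environment_id": "env_1", "agent": "a"}, 3.0),
    ({"environment_id": "env_0", "agent": "b"}, 10.0),
]
per_agent_mean = aggregate(points, ["environment_id"], mean)
overall_max = aggregate(points, [], max)
```

Here environment_id disappears from the output, the two agent "a" series merge into one, and agent "b" stays separate because its remaining labels differ.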

Aggregation functions

Value	Description
`AggregationFn.AGGREGATION_FN_MEAN`	Mean of values in the group
`AggregationFn.AGGREGATION_FN_MAX`	Maximum value in the group
`AggregationFn.AGGREGATION_FN_SUM`	Sum of values in the group
`AggregationFn.AGGREGATION_FN_MIN`	Minimum value in the group
`AggregationFn.AGGREGATION_FN_STD_DEV`	Standard deviation of values in the group
`AggregationFn.AGGREGATION_FN_COUNT`	Count of values in the group

Strategies

Backpressure strategy

Controls what happens when the internal event queue is full.

Value	Behaviour
`DropNew`	Silently discard the incoming event (default)
`Block`	Block the calling thread until space is available
`Raise`	Raise `MetranaEventQueueFullError`
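
The three behaviours can be illustrated with the standard library's bounded queue. This is a sketch, not the client's queue: enqueue is a hypothetical function and EventQueueFullError is a stand-in for MetranaEventQueueFullError.

```python
import queue


class EventQueueFullError(Exception):
    """Stand-in for MetranaEventQueueFullError in this sketch."""


def enqueue(events, event, strategy="DropNew"):
    """Apply a backpressure strategy; return True if the event was queued."""
    if strategy == "Block":
        events.put(event)  # waits until a consumer frees a slot
        return True
    try:
        events.put_nowait(event)
        return True
    except queue.Full:
        if strategy == "Raise":
            raise EventQueueFullError("event queue is full") from None
        return False  # DropNew: discard the incoming event


events = queue.Queue(maxsize=1)
accepted = enqueue(events, {"loss": 0.5})
dropped = enqueue(events, {"loss": 0.4})  # queue is full, so this one is lost
```

DropNew keeps the training loop fast at the cost of losing data under load, while Block trades throughput for completeness.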

Error strategy

Controls how API errors are surfaced to the caller.

Value	Behaviour
`Silent`	Ignore errors
`Warn`	Log a warning and continue (default)
`RaiseOnLog`	Raise on the next `log()` call if errors have occurred
`RaiseOnClose`	Raise on `close()` if errors have occurred
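
Because requests are sent in the background, errors can only reach the caller at a later point. The following sketch shows one way the four modes could be wired up; ErrorTracker is hypothetical, and RuntimeError stands in for whatever exception type the client actually raises.

```python
import warnings


class ErrorTracker:
    """Collects background API errors and surfaces them per the chosen strategy."""

    def __init__(self, strategy="Warn"):
        self.strategy = strategy
        self.errors = []

    def record(self, error):
        # Called by the background worker when an API request fails.
        if self.strategy == "Silent":
            return
        if self.strategy == "Warn":
            warnings.warn(f"metrics request failed: {error}")
            return
        self.errors.append(error)

    def on_log(self):
        # Checked at the start of every log() call.
        if self.strategy == "RaiseOnLog" and self.errors:
            raise RuntimeError(f"{len(self.errors)} earlier request(s) failed")

    def on_close(self):
        # Checked once, when close() is called.
        if self.strategy == "RaiseOnClose" and self.errors:
            raise RuntimeError(f"{len(self.errors)} request(s) failed during the run")
```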

Resume strategy

Controls what happens when a run with the same name already exists.

Value	Behaviour
`Never`	Create a new run; resume only if the existing run belongs to this same job (a distributed sibling or a restart), otherwise raise `MetranaRunAlreadyExistsError` (default)
`Allow`	Create a new run, or resume any existing run with the same name

Distributed training and the orchestration identifier

Under the Never resume strategy (the default), Metrana still lets the processes of a single distributed job share one run. When many ranks call init() with the same run_name, one wins the create race and the rest receive a 409 conflict. Metrana decides whether each conflict is a sibling (another process in this same job — resume the shared run) or a stale run left by an earlier job (raise MetranaRunAlreadyExistsError).

This decision uses an orchestration identifier: an opaque token shared by every process that should log to the same run. It is resolved, first match wins, from:

1. The orchestration_id argument to init().
2. The METRANA_ORCHESTRATION_ID environment variable.
3. A framework-provided job id: TORCHELASTIC_RUN_ID, SLURM_JOB_ID, or RAY_JOB_ID.
4. A randomly generated token (fallback).
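
The resolution order above can be sketched as follows. resolve_orchestration_id is a hypothetical function, not the client's code; checking the three framework variables in the order they are listed and using a UUID for the fallback token are both assumptions.

```python
import os
import uuid

FRAMEWORK_JOB_ID_VARS = ("TORCHELASTIC_RUN_ID", "SLURM_JOB_ID", "RAY_JOB_ID")


def resolve_orchestration_id(environ, orchestration_id=None):
    """Resolve the shared token; the first source that yields a value wins."""
    if orchestration_id:
        return orchestration_id
    if environ.get("METRANA_ORCHESTRATION_ID"):
        return environ["METRANA_ORCHESTRATION_ID"]
    for name in FRAMEWORK_JOB_ID_VARS:
        if environ.get(name):
            return environ[name]
    return uuid.uuid4().hex  # random fallback; the token format is an assumption


token = resolve_orchestration_id(os.environ)
```

Two processes that both fall through to the last line get different tokens, which is why independently launched workers need an explicit shared value.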

Resume happens only when the existing run's stored id equals this process's id — that proves same-job ownership regardless of how long ago the run was created (so a sibling that lost the create race, or a worker that crashed and restarted mid-job, resumes). Every other case raises:

- a different stored id raises MetranaOrchestrationIdMismatchError (definitive proof the run belongs to a separate job launch);
- no stored id raises MetranaRunAlreadyExistsError (a pre-feature run, or one created by a client that sent no id, so same-job ownership can't be confirmed).
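
The decision made on a 409 conflict under the Never strategy reduces to a three-way comparison. In this sketch on_conflict is a hypothetical function and the two exception classes are local stand-ins for MetranaRunAlreadyExistsError and MetranaOrchestrationIdMismatchError.

```python
class RunAlreadyExistsError(Exception):
    """Stand-in for MetranaRunAlreadyExistsError."""


class OrchestrationIdMismatchError(RunAlreadyExistsError):
    """Stand-in for MetranaOrchestrationIdMismatchError."""


def on_conflict(stored_id, own_id):
    """Decide what a process does after losing the create race."""
    if stored_id is None:
        # Ownership cannot be confirmed without a stored id.
        raise RunAlreadyExistsError("existing run has no stored id")
    if stored_id != own_id:
        # The run was created by a different job launch.
        raise OrchestrationIdMismatchError("run belongs to a different job")
    return "resume"  # same job: a sibling rank or a restarted worker


decision = on_conflict(stored_id="job-42", own_id="job-42")
```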

The resolved id is published back into METRANA_ORCHESTRATION_ID so forked/spawned child processes inherit it automatically and match deterministically. Because the random-token fallback only matches descendants that inherit it, independently launched sibling processes (no shared framework job id, not forked from a common parent) must be given an explicit shared orchestration_id (or METRANA_ORCHESTRATION_ID) — otherwise each generates a distinct token and the losers of the create race raise.

# torchrun / Slurm / Ray job: nothing to do — the framework job id is picked up
# automatically and every rank resumes the shared run.
metrana.init(workspace_name="w", project_name="p", run_name="run-001")

# Custom launcher: pass a token shared by all workers of the job.
metrana.init(
    workspace_name="w",
    project_name="p",
    run_name="run-001",
    orchestration_id="my-job-2024-06-10-42",
)

Pre-warmed worker pools (e.g. Ray actors): processes that do not inherit this process's environment will not see the propagated METRANA_ORCHESTRATION_ID. Inject it via the launcher (e.g. Ray runtime_env) so every worker resolves the same token.

Both errors carry creation_time and orchestration_identifier attributes describing the existing run. MetranaOrchestrationIdMismatchError subclasses MetranaRunAlreadyExistsError, so a single except MetranaRunAlreadyExistsError catches both.

Close strategy

Controls how pending events are handled on shutdown.

Value	Behaviour
`Immediate`	Shut down immediately, discarding pending events
`CompletePending`	Complete API requests already in flight, but discard events still queued (default)
`CompleteAll`	Wait for all queued events including those not yet dispatched

Environment Variables

All strategies and several other settings can be configured without code changes:

Variable	Default	Accepted values
`METRANA_API_KEY`	—	Your API key
`METRANA_BACKPRESSURE_STRATEGY`	`DropNew`	`DropNew`, `Block`, `Raise`
`METRANA_ERROR_MODES`	`Warn`	`Silent`, `Warn`, `RaiseOnLog`, `RaiseOnClose`
`METRANA_RESUME_STRATEGY`	`Never`	`Allow`, `Never`
`METRANA_ORCHESTRATION_ID`	—	Shared job token for distributed runs
`METRANA_CLOSE_STRATEGY`	`CompletePending`	`Immediate`, `CompletePending`, `CompleteAll`
`METRANA_LOG_LEVEL`	`Success`	`Trace`, `Debug`, `Info`, `Success`, `Warn`, `Error`, `Critical`, `Off`
`METRANA_EVENT_QUEUE_MAX_SIZE`	unbounded	Integer (`0` = unbounded)
`METRANA_DISPATCH_QUEUE_MAX_SIZE`	unbounded	Integer (`0` = unbounded)
`METRANA_ERROR_QUEUE_MAX_SIZE`	unbounded	Integer (`0` = unbounded)
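
The table can be read as a lookup with defaults. The sketch below shows how a setting might be resolved; effective_setting and queue_max_size are hypothetical helpers, and letting an explicit init() argument override the environment variable is an assumption.

```python
import os

# Defaults copied from the table above.
DEFAULTS = {
    "METRANA_BACKPRESSURE_STRATEGY": "DropNew",
    "METRANA_ERROR_MODES": "Warn",
    "METRANA_RESUME_STRATEGY": "Never",
    "METRANA_CLOSE_STRATEGY": "CompletePending",
    "METRANA_LOG_LEVEL": "Success",
}


def effective_setting(name, explicit=None, environ=None):
    """Explicit argument first, then the environment, then the documented default."""
    environ = os.environ if environ is None else environ
    if explicit is not None:
        return explicit
    return environ.get(name, DEFAULTS[name])


def queue_max_size(name, environ=None):
    """Parse a *_QUEUE_MAX_SIZE variable; unset or 0 means unbounded (None)."""
    environ = os.environ if environ is None else environ
    size = int(environ.get(name, "0"))
    return size or None
```

For example, exporting METRANA_CLOSE_STRATEGY=CompleteAll in a job script changes shutdown behaviour without touching the training code.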
