dot-metrics

dot-metrics is a lightweight metrics and constraint evaluation framework. Define metrics and constraints, run them against your data, and get structured results with debug info.

Install

pip install dot-metrics

Concept

A MetricSet holds metric and constraint definitions. Call compute(data) to evaluate them.

MetricSet
 ├── metrics:     {"coverage": MetricDefinition}
 └── constraints: {"errors":   ConstraintDefinition}
          │
          ▼
     set.compute(data)
          │
          ▼
      EvalResult
       ├── metrics:     {"coverage": Metric}
       └── constraints: {"errors":   Constraint}

Metrics are continuous measurements (e.g. coverage rate, score). Constraints are pass/fail checks against a threshold (e.g. error count ≤ 0).

A constraint passes when value <= threshold.

Quick start

from dot_metrics import MetricSet

metric_set = MetricSet()

@metric_set.metric("coverage")
def coverage(data):
    return data["covered"] / data["total"]

@metric_set.constraint("errors", threshold=0)
def errors(data):
    return data["error_count"]

result = metric_set.compute({"covered": 90, "total": 100, "error_count": 0})

result.score("coverage")     # 0.9
result.constraints_ok        # True

Defining metrics and constraints

Decorator style

metric_set = MetricSet()

@metric_set.metric("latency_ms", unit="ms", higher_is_better=False)
def latency(data):
    return data["total_ms"] / data["requests"]

@metric_set.constraint("error_rate", threshold=0.01, unit="%")
def error_rate(data):
    return data["errors"] / data["requests"]

Imperative style

metric_set = MetricSet()
metric_set.add("coverage", lambda data: data["covered"] / data["total"])
metric_set.add_constraint("errors", lambda data: data["error_count"], threshold=0)

Both styles are equivalent. add() and add_constraint() accept the same keyword arguments as the decorators.
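
For instance, the latency_ms metric and error_rate constraint from the decorator example could be registered imperatively with the same options (a minimal sketch):

metric_set = MetricSet()

def latency(data):
    # average latency per request; lower is better
    return data["total_ms"] / data["requests"]

metric_set.add("latency_ms", latency, unit="ms", higher_is_better=False)
metric_set.add_constraint("error_rate", lambda d: d["errors"] / d["requests"],
                          threshold=0.01, unit="%")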

Parameters

metric_set.metric(key, *, unit="", description="", higher_is_better=True, metadata={})
metric_set.add(key, fn, *, unit="", description="", higher_is_better=True, metadata={})

metric_set.constraint(key, *, threshold, unit="", description="", metadata={})
metric_set.add_constraint(key, fn, *, threshold, unit="", description="", metadata={})

  • key — unique name for the metric/constraint
  • threshold — constraint passes when value <= threshold
  • higher_is_better — affects terminal chart rendering
  • metadata — arbitrary dict, passed through to results

Computing results

result = metric_set.compute(data)

Every metric and constraint function must accept exactly one argument — the data object. data can be anything: a dict, dataclass, Pydantic model, etc.

Accessing results

result.score("coverage")                  # float — metric value
result.metrics["coverage"].value          # same
result.metrics["coverage"].unit           # ""
result.metrics["coverage"].debug          # {} by default

result.constraints["errors"].value        # float
result.constraints["errors"].passed       # True/False
result.constraints["errors"].threshold    # 0

result.constraints_ok                     # True if all constraints passed
result.violations                         # list of failed Constraint objects
result.assert_constraints()               # raises ValueError if any failed
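
For example, the failure path with the quick-start metric_set (a sketch; errors has threshold=0, so any error fails the constraint):

result = metric_set.compute({"covered": 80, "total": 100, "error_count": 3})

result.constraints_ok                     # False
result.constraints["errors"].passed       # False
result.violations                         # [failed "errors" Constraint]
result.assert_constraints()               # raises ValueError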

Attaching debug info

Return a ComputedValue instead of a plain float to attach structured debug data:

from dot_metrics import MetricSet, ComputedValue

metric_set = MetricSet()

@metric_set.metric("coverage")
def coverage(data):
    missed = [x for x in data if not x["covered"]]
    return ComputedValue(value=1 - len(missed) / len(data), debug={"missed": missed})

result = metric_set.compute(data)
result.metrics["coverage"].debug    # {"missed": [...]}

ComputedValue works the same way for constraints.
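
For instance, a constraint can carry the offending items in its debug payload (a sketch; the per-item "error" flag is only an assumed data shape for illustration):

@metric_set.constraint("errors", threshold=0)
def errors(data):
    failed = [x for x in data if x["error"]]   # "error" field assumed for this sketch
    return ComputedValue(value=len(failed), debug={"failed_items": failed})

result = metric_set.compute(data)
result.constraints["errors"].debug    # {"failed_items": [...]}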

Batch evaluation

Evaluate a set of inputs in one call:

# dict of inputs
batch = metric_set.compute_batch({"run_1": data1, "run_2": data2})
batch["run_1"].score("coverage")            # 0.9
batch.scores("coverage")                   # {"run_1": 0.9, "run_2": 0.85}

# list of inputs
batch = metric_set.compute_batch([data1, data2, data3])
batch[0].score("coverage")                 # indexed by position

BatchResult supports iteration, len(), and .items().
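
That makes it easy to post-process a batch, e.g. to collect failing runs (a sketch reusing the dict-keyed batch above):

batch = metric_set.compute_batch({"run_1": data1, "run_2": data2})

failing = [key for key, res in batch.items() if not res.constraints_ok]
for key, res in batch.items():
    print(key, res.score("coverage"))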

Metric documentation

Add a Google-style docstring to a metric or constraint function and it gets parsed into a help dict on both the definition and the result:

@metric_set.metric("coverage", unit="%")
def coverage(data):
    """Percentage of code paths covered by tests.

    Range: 0-100
    Interpretation:
        - 90-100: Excellent
        - 70-90:  Good
        - <70:    Needs improvement
    Notes:
        - Returns 0 for empty input.
    """
    return sum(1 for x in data if x["covered"]) / len(data)

metric_set.metrics["coverage"].help
# {
#   "summary": "Percentage of code paths covered by tests.",
#   "range": "0-100",
#   "interpretation": "- 90-100: Excellent\n- 70-90:  Good\n- <70:    Needs improvement",
#   "notes": "- Returns 0 for empty input."
# }

result = metric_set.compute(data)
result.metrics["coverage"].help     # same dict

Supported sections: Range:, Interpretation:, Notes:. No docstring → help is {}.
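
For example, a metric defined without a docstring simply gets an empty help dict (sketch):

@metric_set.metric("raw_score")
def raw_score(data):
    return data["score"]

metric_set.metrics["raw_score"].help    # {}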

Typing

MetricSet is generic over the input type T. Annotating it lets static type checkers (mypy, pyright) verify that every registered function accepts the right type:

from dataclasses import dataclass
from dot_metrics import MetricSet

@dataclass
class SchedulingData:
    appointments: list
    solution: list

metric_set: MetricSet[SchedulingData] = MetricSet()
metric_set.add("rate", lambda d: len(d.solution) / len(d.appointments))

result = metric_set.compute(SchedulingData(appointments=[...], solution=[...]))

The annotation is optional — omitting it is fine and everything still works at runtime.
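
For example, with the annotation above, a function that treats the input as a dict is flagged statically (illustrative; nothing changes at runtime):

metric_set.add("bad", lambda d: d["solution_count"])
# a type checker flags this: SchedulingData does not support indexing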

Interactive explorer

Explore batch results interactively in the browser with scatter, bar, and heatmap charts.

pip install dot-metrics[explore]

from dot_metrics import MetricSet
from dot_metrics.explore import serve

ms = MetricSet()
ms.add("score", lambda x: x["s"])
ms.add_constraint("errors", lambda x: x["e"], threshold=1)

batch = ms.compute_batch({
    ("gpt4", "en"): {"s": 0.9, "e": 0},
    ("gpt4", "fr"): {"s": 0.7, "e": 2},
    ("llama", "en"): {"s": 0.8, "e": 0},
    ("llama", "fr"): {"s": 0.6, "e": 1},
}, key_names=["model", "language"])

serve(batch)  # opens localhost:8050

The app provides:

  • Chart — scatter, bar, or heatmap with configurable X, Y, color, and size axes
  • Aggregation panel — compute mean/median/min/max grouped by any categorical column on the fly
  • Data table — sortable, with debug cell inspection and CSV export

Pass key_names to compute_batch to label tuple key components (defaults to key[0], key[1], …). You can also pass a single EvalResult instead of a BatchResult.

serve(data, *, host="127.0.0.1", port=8050, debug=False)
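
For example, to bind to another interface and port, or to inspect a single result (a usage sketch based on the signature above):

serve(batch, host="0.0.0.0", port=8080)

# a single EvalResult works too
serve(ms.compute({"s": 0.9, "e": 0}))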

Terminal chart

from dot_metrics import draw_terminal_chart

print(draw_terminal_chart(result))
# coverage  ████████████████████  0.90

draw_terminal_chart(result, width=40, char="█") returns a string — use print() to display it.

Full example

from dot_metrics import MetricSet, ComputedValue

appointments = [
    {"id": "A1", "patient": "Alice",   "duration": 30},
    {"id": "A2", "patient": "Bob",     "duration": 60},
    {"id": "A3", "patient": "Charlie", "duration": 30},
]

solution = [
    {"appointment_id": "A1", "practitioner": "Dr. Martin", "slot": "09:00", "scheduled": True},
    {"appointment_id": "A2", "practitioner": "Dr. Martin", "slot": "09:00", "scheduled": True},  # conflict!
    {"appointment_id": "A3", "practitioner": "Dr. Martin", "slot": "10:00", "scheduled": True},
]

metric_set = MetricSet()

@metric_set.metric("scheduling_rate")
def scheduling_rate(data):
    scheduled = [e for e in data["solution"] if e["scheduled"]]
    unscheduled = [e["appointment_id"] for e in data["solution"] if not e["scheduled"]]
    return ComputedValue(value=len(scheduled) / len(data["appointments"]), debug={"unscheduled": unscheduled})

@metric_set.constraint("conflicts", threshold=0)
def count_conflicts(data):
    seen = {}
    conflicts = []
    for entry in data["solution"]:
        key = (entry["practitioner"], entry["slot"])
        if key in seen:
            conflicts.append((seen[key], entry["appointment_id"]))
        seen[key] = entry["appointment_id"]
    return ComputedValue(value=len(conflicts), debug={"conflicts": conflicts})

result = metric_set.compute({"appointments": appointments, "solution": solution})

result.score("scheduling_rate")                     # 1.0
result.constraints_ok                               # False
result.constraints["conflicts"].debug               # {"conflicts": [("A1", "A2")]}

Reference

  • MetricSet: Main class — holds definitions, runs computation
  • EvalResult: Output of compute() — holds Metric and Constraint dicts
  • BatchResult: Output of compute_batch() — maps keys to EvalResult
  • ComputedValue: Wraps a float return value with optional debug data
  • Metric: Computed metric result
  • Constraint: Computed constraint result with passed flag
  • MetricDefinition: Stored metric definition (in metric_set.metrics)
  • ConstraintDefinition: Stored constraint definition (in metric_set.constraints)
  • draw_terminal_chart: Renders a Unicode bar chart from an EvalResult
  • explore.serve: Launches an interactive Dash explorer (requires pip install dot-metrics[explore])

Contributing & Development

See docs/CONTRIBUTING.md and docs/DEVELOPMENT.md.

License

See LICENSE for details.

Contact

deepika Team — contact@deepika.ai
Project: gitlab.com/deepika6190303/deepika-open-toolbox/dot-metrics
