Common utilities for NVIDIA evaluation frameworks

Project description

NeMo Evaluator

For complete documentation, please see: docs/nemo-evaluator/index.md

Custom Benchmarks with BYOB

Create custom evaluation benchmarks in ~12 lines of Python using the BYOB (Bring Your Own Benchmark) framework:

# my_benchmark.py
from nemo_evaluator.contrib.byob import benchmark, scorer

# Each dataset row fills the {question} placeholder; the "answer" field is the target.
@benchmark(name="my-qa", dataset="data.jsonl", prompt="Q: {question}\nA:", target_field="answer")
@scorer
def check(response: str, target: str, metadata: dict) -> dict:
    # Case-insensitive substring match between target and model response.
    return {"correct": target.lower() in response.lower()}

# Compile and run (shell)
nemo-evaluator-byob my_benchmark.py
nemo-evaluator run_eval --eval_type byob_my_qa.my-qa --model_url http://localhost:8000 --model_id my-model
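
The body of the scorer above is plain Python, so its matching behavior can be checked in isolation before compiling the benchmark. The following is a minimal sketch that copies only the function body and leaves out the `@benchmark` and `@scorer` decorators, so it does not need `nemo_evaluator` installed; the sample responses are made up for illustration.

```python
def check(response: str, target: str, metadata: dict) -> dict:
    """Case-insensitive substring match, same body as the scorer above.

    The decorators are intentionally omitted so this runs standalone.
    """
    return {"correct": target.lower() in response.lower()}


if __name__ == "__main__":
    # Hypothetical model responses, for illustration only.
    print(check("The capital is Paris.", "paris", {}))
    print(check("I am not sure.", "Paris", {}))
    # Substring matching is lenient: the target "4" also matches "14".
    print(check("The answer is 14.", "4", {}))
```

Because the match is a substring test, short targets can match inside longer tokens (the third call above is counted as correct). A stricter comparison may be preferable for numeric or single-word answers.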

See the BYOB quickstart guide for full documentation, built-in scorers, and examples.
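
The example above refers to `data.jsonl` with a `{question}` placeholder and an `answer` target field. As a rough sketch of what such a dataset might look like, the code below assumes one JSON object per line with `question` and `answer` keys, and assumes the prompt is filled in the way `str.format` would fill it. Both are assumptions inferred from the example, not confirmed BYOB behavior, and the helper names (`dumps_jsonl`, `loads_jsonl`, `render`) are hypothetical.

```python
import json

# Values taken from the @benchmark call above.
PROMPT = "Q: {question}\nA:"
TARGET_FIELD = "answer"


def dumps_jsonl(rows: list[dict]) -> str:
    """Serialize rows as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(row) + "\n" for row in rows)


def loads_jsonl(text: str) -> list[dict]:
    """Parse JSON Lines text, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def render(row: dict) -> tuple[str, str]:
    """Return (prompt, target) for one row.

    Assumption: placeholders are filled str.format-style from the row's keys.
    """
    return PROMPT.format(**row), row[TARGET_FIELD]


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    rows = [
        {"question": "What is 2 + 2?", "answer": "4"},
        {"question": "What is the capital of France?", "answer": "Paris"},
    ]
    text = dumps_jsonl(rows)
    print(text, end="")  # contents that could be saved as data.jsonl
    for row in loads_jsonl(text):
        prompt, target = render(row)
        print(repr(prompt), "->", target)
```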

File details

Details for the file nemo_evaluator-0.2.6.tar.gz.

File metadata

  • Download URL: nemo_evaluator-0.2.6.tar.gz
  • Size: 170.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.3

File hashes

Hashes for nemo_evaluator-0.2.6.tar.gz:

  • SHA256: 4a26d1916b6416146cbab7a27a331e1a493329188feafa6736b92412e60a7de2
  • MD5: ea335790067d142fec93c46ba2ad38d8
  • BLAKE2b-256: 80404c86cfc06a921a9047476cf316a859e70302cfc8b76d0f8efe54aef2fdd0

File details

Details for the file nemo_evaluator-0.2.6-py3-none-any.whl.

File metadata

  • Download URL: nemo_evaluator-0.2.6-py3-none-any.whl
  • Size: 217.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.3

File hashes

Hashes for nemo_evaluator-0.2.6-py3-none-any.whl:

  • SHA256: a630765585e6ce57b16c08a2a53b36c452821e502bfb8a7210f33a4b2d092ebe
  • MD5: 422fc8e0e0cb9e48c0d02805bba05c84
  • BLAKE2b-256: c02f204386cbd3eeeb083763bbc04b346270979ca9c266bd08bb08c3323f532e