Common utilities for NVIDIA evaluation frameworks

NeMo Evaluator

For complete documentation, please see: docs/nemo-evaluator/index.md

Custom Benchmarks with BYOB

Create custom evaluation benchmarks in ~12 lines of Python using the BYOB (Bring Your Own Benchmark) framework:

from nemo_evaluator.contrib.byob import benchmark, scorer

@benchmark(name="my-qa", dataset="data.jsonl", prompt="Q: {question}\nA:", target_field="answer")
@scorer
def check(response: str, target: str, metadata: dict) -> dict:
    # Case-insensitive substring match of the target against the model response
    return {"correct": target.lower() in response.lower()}

# Compile and run (assumes the code above is saved as my_benchmark.py)
nemo-evaluator-byob my_benchmark.py
nemo-evaluator run_eval --eval_type byob_my_qa.my-qa --model_url http://localhost:8000 --model_id my-model

See the BYOB quickstart guide for full documentation, built-in scorers, and examples.
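
To make the example above concrete, the sketch below writes a two-row data.jsonl and exercises the scorer logic without the framework. The field names question and answer are inferred from the prompt template and target_field in the example. The sample rows, the undecorated check function, the write_jsonl helper, and the mock responses are assumptions for illustration and are not part of the nemo_evaluator API.

```python
import json
import tempfile
from pathlib import Path

# Prompt template copied from the @benchmark example above.
PROMPT = "Q: {question}\nA:"

# Hypothetical sample rows; only the field names come from the example.
rows = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "What is 2 + 2?", "answer": "4"},
]


def write_jsonl(path, records):
    """Write one JSON object per line (the JSON Lines layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def check(response: str, target: str, metadata: dict) -> dict:
    # Same body as the scorer above, without the BYOB decorators.
    return {"correct": target.lower() in response.lower()}


dataset_path = Path(tempfile.mkdtemp()) / "data.jsonl"
write_jsonl(dataset_path, rows)

# Read the file back, render prompts, and score mock model responses.
loaded = [
    json.loads(line)
    for line in dataset_path.read_text(encoding="utf-8").splitlines()
]
prompts = [PROMPT.format(**row) for row in loaded]
mock_responses = ["The capital is PARIS.", "I think it is five."]
results = [
    check(resp, row["answer"], row)
    for resp, row in zip(mock_responses, loaded)
]

print(prompts[0])
print(results)
```

Substring matching is lenient: a target of "4" would also count a response containing "14" as correct, so stricter benchmarks may want exact or normalized matching instead.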

This version

0.2.8

Download files

Source Distribution

nemo_evaluator-0.2.8.tar.gz (183.8 kB view details)

Uploaded Source

Built Distribution

nemo_evaluator-0.2.8-py3-none-any.whl (230.6 kB view details)

Uploaded Python 3

File details

Details for the file nemo_evaluator-0.2.8.tar.gz.

File metadata

  • Download URL: nemo_evaluator-0.2.8.tar.gz
  • Upload date:
  • Size: 183.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.3

File hashes

Hashes for nemo_evaluator-0.2.8.tar.gz
Algorithm    Hash digest
SHA256       3d7b52f41eff5f6ac3c07c95bf3d9c4976322087d82a6e75e7e503ece02a8f26
MD5          83e4bff8188449cc41f8ecd5e80b7436
BLAKE2b-256  cef6d9273a828e69c9d01dd5924b5040206ff5480d8228a9048cdeced75ea096
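
A downloaded file can be checked against the digests listed on this page with Python's standard hashlib module. The following is a minimal sketch: the helper names file_digest and matches are hypothetical, and it assumes that BLAKE2b-256 here means BLAKE2b with a 32-byte digest.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    if algorithm == "blake2b-256":
        # Assumption: BLAKE2b-256 is BLAKE2b truncated to a 32-byte digest.
        h = hashlib.blake2b(digest_size=32)
    else:
        h = hashlib.new(algorithm)  # e.g. "sha256" or "md5"
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches(path, expected_hex, algorithm="sha256"):
    """Compare a file's digest with a published hex digest."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, after downloading the source distribution into the current directory, matches("nemo_evaluator-0.2.8.tar.gz", "3d7b52f41eff5f6ac3c07c95bf3d9c4976322087d82a6e75e7e503ece02a8f26") should return True if the file is intact. SHA256 is the digest to rely on; MD5 is not collision-resistant and is useful only as a quick corruption check.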

File details

Details for the file nemo_evaluator-0.2.8-py3-none-any.whl.

File metadata

  • Download URL: nemo_evaluator-0.2.8-py3-none-any.whl
  • Upload date:
  • Size: 230.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.3

File hashes

Hashes for nemo_evaluator-0.2.8-py3-none-any.whl
Algorithm    Hash digest
SHA256       231c1c03d83fae47be0f18ab099e03c2fabf0955c18d17753a7a0b6aef76868f
MD5          92657436424d800a6e7e726846eca5a5
BLAKE2b-256  21a37cb0292feaad9eda43c2ec4ff9e574e0e7aba20b06bbe10f7c7ff3a24e88
