
Common utilities for NVIDIA evaluation frameworks

Project description

NeMo Evaluator

For complete documentation, please see: docs/nemo-evaluator/index.md

Custom Benchmarks with BYOB

Create custom evaluation benchmarks in ~12 lines of Python using the BYOB (Bring Your Own Benchmark) framework:

from nemo_evaluator.contrib.byob import benchmark, scorer

# Declare the benchmark: dataset file, prompt template, and the field holding the expected answer.
@benchmark(name="my-qa", dataset="data.jsonl", prompt="Q: {question}\nA:", target_field="answer")
@scorer
def check(response: str, target: str, metadata: dict) -> dict:
    # Case-insensitive substring match of the target within the model response.
    return {"correct": target.lower() in response.lower()}

# Compile and run (shell commands, not Python)
nemo-evaluator-byob my_benchmark.py
nemo-evaluator run_eval --eval_type byob_my_qa.my-qa --model_url http://localhost:8000 --model_id my-model

See the BYOB quickstart guide for full documentation, built-in scorers, and examples.
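
To try the scoring logic before compiling a benchmark, it can help to exercise it in isolation. The sketch below is not part of BYOB: it re-implements the `check` function from the example above as plain Python, without the `@benchmark` and `@scorer` decorators, so it runs without `nemo_evaluator` installed. The sample records, the `write_jsonl` helper, and the assumption that the prompt template is filled with `str.format`-style substitution are all hypothetical and used only for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample rows. The field names mirror the example above:
# {question} in the prompt template and target_field="answer".
RECORDS = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "What is 2 + 2?", "answer": "4"},
]

PROMPT_TEMPLATE = "Q: {question}\nA:"


def write_jsonl(path: Path, records: list[dict]) -> None:
    """Write one JSON object per line (hypothetical helper, not a BYOB API)."""
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


def render_prompt(template: str, record: dict) -> str:
    """Fill the template from a record, assuming str.format-style fields."""
    return template.format(**record)


def check(response: str, target: str, metadata: dict) -> dict:
    """Same logic as the scorer above, minus the BYOB decorators."""
    return {"correct": target.lower() in response.lower()}


def main() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        dataset = Path(tmp) / "data.jsonl"
        write_jsonl(dataset, RECORDS)
        for line in dataset.read_text(encoding="utf-8").splitlines():
            record = json.loads(line)
            prompt = render_prompt(PROMPT_TEMPLATE, record)
            # Stand-in for a model call: echo the expected answer back.
            fake_response = f"I believe it is {record['answer']}."
            print(repr(prompt), check(fake_response, record["answer"], {}))


if __name__ == "__main__":
    main()
```

Because the match is a plain substring test, a target of `4` also matches a response containing `14`. A stricter comparison may be preferable when targets are short.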


