Common utilities for NVIDIA evaluation frameworks

Project description

NeMo Evaluator

For complete documentation, please see: docs/nemo-evaluator/index.md

Custom Benchmarks with BYOB

Create custom evaluation benchmarks in ~12 lines of Python using the BYOB (Bring Your Own Benchmark) framework:

from nemo_evaluator.contrib.byob import benchmark, scorer

@benchmark(name="my-qa", dataset="data.jsonl", prompt="Q: {question}\nA:", target_field="answer")
@scorer
def check(response: str, target: str, metadata: dict) -> dict:
    return {"correct": target.lower() in response.lower()}

# Compile and run
nemo-evaluator-byob my_benchmark.py
nemo-evaluator run_eval --eval_type byob_my_qa.my-qa --model_url http://localhost:8000 --model_id my-model
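
Before compiling, it can help to sanity-check the scoring logic on its own. The sketch below copies the body of the scorer above into a plain function, with the BYOB decorators deliberately left out so nothing from nemo_evaluator is needed. The sample responses are invented for illustration, and this exercises only the Python logic, not how the framework calls the scorer.

```python
# Plain-Python copy of the scorer body from the example above, without the
# BYOB decorators, so it can be exercised with no framework installed.
def check(response: str, target: str, metadata: dict) -> dict:
    # Case-insensitive substring match: correct if the target text
    # appears anywhere in the model response.
    return {"correct": target.lower() in response.lower()}


# Hypothetical sample responses, invented for illustration only.
samples = [
    ("The capital of France is Paris.", "paris"),
    ("I am not sure.", "Paris"),
    ("Comparison is difficult.", "Paris"),  # "paris" hides inside "comparison"
]

results = [check(response, target, {}) for response, target in samples]
for (response, target), result in zip(samples, results):
    print(f"{target!r} in {response!r}: {result['correct']}")
```

The third sample is scored as correct even though the response never names the city, because substring matching accepts any response that contains the target characters. A stricter comparison may be worth considering for short targets.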

See the BYOB quickstart guide for full documentation, built-in scorers, and examples.
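
As a rough sketch of what data.jsonl might contain for the example above: the prompt template references {question} and target_field is "answer", so each line is assumed here to be a JSON object with those two keys. The records are invented, and filling placeholders with str.format is an assumption about the framework's behavior rather than something this page confirms.

```python
import json
import tempfile
from pathlib import Path

# Assumed record layout: one JSON object per line with "question" and
# "answer" keys, inferred from the prompt template and target_field above.
records = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What color is the sky on a clear day?", "answer": "blue"},
]

# Written to a temporary directory here; a real benchmark would point
# dataset= at the file's actual location.
dataset_path = Path(tempfile.mkdtemp()) / "data.jsonl"
with dataset_path.open("w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# Same template string as in the @benchmark call. Using str.format is an
# assumption about how placeholders are filled, not confirmed BYOB behavior.
prompt_template = "Q: {question}\nA:"

with dataset_path.open(encoding="utf-8") as f:
    rows = [json.loads(line) for line in f]

prompts = [prompt_template.format(**row) for row in rows]
targets = [row["answer"] for row in rows]
print(prompts[0])
```

Previewing the rendered prompts this way is a cheap check that every placeholder in the template has a matching field in the dataset.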

Download files

Source Distribution

nemo_evaluator-0.2.5.tar.gz (167.8 kB)

Built Distribution

nemo_evaluator-0.2.5-py3-none-any.whl (214.9 kB)

File details

Details for the file nemo_evaluator-0.2.5.tar.gz.

File metadata

  • Download URL: nemo_evaluator-0.2.5.tar.gz
  • Upload date:
  • Size: 167.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.3

File hashes

Hashes for nemo_evaluator-0.2.5.tar.gz

  • SHA256: e71ce5bc43567f74d92cfcec2c1dee7d62018dbdedc441757574c3f258fbcfce
  • MD5: 55cceb78b45eed3b7868d315a8af2906
  • BLAKE2b-256: 7a305cf0fafffbf9f0ceac3b121b31360ab2fceb39520be4a30da823167e35bf

File details

Details for the file nemo_evaluator-0.2.5-py3-none-any.whl.

File metadata

  • Download URL: nemo_evaluator-0.2.5-py3-none-any.whl
  • Upload date:
  • Size: 214.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.3

File hashes

Hashes for nemo_evaluator-0.2.5-py3-none-any.whl

  • SHA256: 8e2bd0cfea8653daec4018efa3b11fb8f4327b5295226898971ba78361902345
  • MD5: 81770d87adbc375a342956e26576a613
  • BLAKE2b-256: 1998f119292fec8d1f22a859ed1426d01d84d444ab673aef6e1919cd924a5936
