Content safety evaluation tool - packaged by NVIDIA

NVIDIA NeMo Evaluator

The goal of NVIDIA NeMo Evaluator is to advance and refine state-of-the-art methodologies for model evaluation, and deliver them as modular evaluation packages (evaluation containers and pip wheels) that teams can use as standardized building blocks.

Quick Start Guide

NVIDIA NeMo Evaluator provides you with evaluation clients that are specifically built to evaluate model endpoints using our Standard API.

Prerequisites

Important: Both the model under test and the judge model must be deployed by the user locally before running evaluations.

Launching an Evaluation for an LLM

  1. Install the package:

    pip install nvidia-safety-harness
    
  2. Deploy your models locally:

    • Deploy the model you want to evaluate (model under test), e.g. on http://localhost:8000
    • Deploy the appropriate judge model for your evaluation type, e.g. on http://localhost:8001 (see Judge Configuration)
    • Both models should be accessible via HTTP API endpoints
  3. Authenticate with Hugging Face: You need to authenticate to the Hugging Face Hub, as some datasets or models may need to be downloaded during evaluation.

    huggingface-cli login
    
  4. List the available evaluations:

    $ nemo-evaluator ls
    safety_eval: 
      * aegis_v2
      * aegis_v2_reasoning
      * wildguard
    
  5. (Optional) Set API keys for the model under test endpoint and the judge model endpoint, if they are protected:

    export MUT_API_KEY="your_api_key_here"
    export JUDGE_API_KEY="your_api_key_here"
    
  6. Run the evaluation:

     nemo-evaluator run_eval \
     --model_id "meta/llama-4-maverick-17b-128e-instruct" \
     --model_url http://localhost:8000/v1 \
     --model_type chat \
     --api_key_name MUT_API_KEY \
     --output_dir /workspace/results \
     --eval_type aegis_v2 \
     --overrides="config.params.extra.judge.url=http://localhost:8001/v1"
    
  7. Gather the results:

    cat /workspace/results/results.yml
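
The quick-start steps above can be collected into a single script. This is a sketch, not part of the package: the endpoints, model name, and API key values are the placeholders used in this guide, and both endpoints must already be deployed before the function is called.

```shell
#!/usr/bin/env bash
# Placeholder credentials for the protected endpoints (step 5).
export MUT_API_KEY="your_api_key_here"
export JUDGE_API_KEY="your_api_key_here"

# Wrap step 6 in a function so different evaluation types can be launched
# against the same locally deployed model under test and judge.
run_safety_eval() {
    local eval_type="$1"  # aegis_v2, aegis_v2_reasoning, or wildguard
    nemo-evaluator run_eval \
        --model_id "meta/llama-4-maverick-17b-128e-instruct" \
        --model_url http://localhost:8000/v1 \
        --model_type chat \
        --api_key_name MUT_API_KEY \
        --output_dir /workspace/results \
        --eval_type "${eval_type}" \
        --overrides="config.params.extra.judge.url=http://localhost:8001/v1"
}

# Uncomment once both endpoints are up, then inspect the results (step 7):
# run_safety_eval aegis_v2 && cat /workspace/results/results.yml
```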
    

CLI Specification

  1. Required flags:

    • --eval_type <string>: The type of evaluation to perform
    • --model_id <string>: The name or identifier of the model under test to evaluate
    • --model_url <url>: The API endpoint where the model under test is accessible
    • --model_type <string>: The type of the model under test to evaluate, currently either "chat", "completions", or "vlm"
    • --output_dir <directory>: The directory to use as the working directory for the evaluation. The results, including the results.yml output file, are saved here. Use an absolute path
  2. Required overrides:

    • config.params.extra.judge.url: URL for the judge model endpoint
  3. Optional flags:

    • --api_key_name <string>: The name of the environment variable that stores the bearer token for the model under test API, if authentication is required (e.g. MUT_API_KEY)
    • --run_config <path>: Specifies the path to a YAML file containing the evaluation definition
    • --dry_run: Allows you to print the final configuration and command without executing the evaluation

Configuring Evaluations via YAML

Evaluations in NVIDIA NeMo Evaluator are configured using YAML files that define the parameters and settings required for the evaluation process. These configuration files follow a standard API which ensures consistency across evaluations.

Example of a YAML configuration:

config:
  type: aegis_v2
  params:
    limit_samples: 10
    extra:
      judge:
        url: http://localhost:8001/v1
target:
  api_endpoint:
    model_id: meta/llama-4-maverick-17b-128e-instruct
    type: chat
    url: http://localhost:8000/v1

The priority of overrides is as follows:

  1. Command-line arguments
  2. User configuration (as seen above)
  3. Task defaults (defined per task type)
  4. Framework defaults
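
The priority order above can be checked with a small experiment. This is a hypothetical sketch: it writes the user configuration from the example and shows a command-line override outranking it.

```shell
# Write the example user configuration (priority 2): limit_samples is 10.
cat > config.yaml <<'EOF'
config:
  type: aegis_v2
  params:
    limit_samples: 10
    extra:
      judge:
        url: http://localhost:8001/v1
target:
  api_endpoint:
    model_id: meta/llama-4-maverick-17b-128e-instruct
    type: chat
    url: http://localhost:8000/v1
EOF

# A command-line argument (priority 1) wins over the user configuration,
# so the effective limit_samples below is 50, not 10. Use --dry_run to
# confirm the rendered value before a real run:
# nemo-evaluator run_eval \
#     --run_config config.yaml \
#     --output_dir /workspace/results \
#     --overrides="config.params.limit_samples=50" \
#     --dry_run
```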

Example:

nemo-evaluator run_eval \
    --run_config config.yaml \
    --api_key_name MUT_API_KEY \
    --output_dir /workspace/results

Adding --dry_run prints the final rendered configuration and command to the console without executing the evaluation:

nemo-evaluator run_eval \
    --run_config config.yaml \
    --api_key_name MUT_API_KEY \
    --output_dir /workspace/results \
    --dry_run

Output:

Rendered config:

command: '{% if target.api_endpoint.api_key_name is not none %}export API_KEY=${{target.api_endpoint.api_key_name}}  &&
  {% endif %} {% if config.params.extra.judge.api_key is not none %}export JUDGE_API_KEY=${{config.params.extra.judge.api_key}}
  && {% endif %} safety-eval  --model-name  {{target.api_endpoint.model_id}} --model-url
  {{target.api_endpoint.url}} --model-type {{target.api_endpoint.type}}  --judge-url  {{config.params.extra.judge.url}}   --results-dir
  {{config.output_dir}}   --eval {{config.params.task}}  --mut-inference-params max_tokens={{config.params.max_new_tokens}},temperature={{config.params.temperature}},top_p={{config.params.top_p}},timeout={{config.params.request_timeout}},concurrency={{config.params.parallelism}},retries={{config.params.max_retries}}
  --judge-inference-params concurrency={{config.params.extra.judge.parallelism}},retries={{config.params.max_retries}}  {%
  if config.params.limit_samples is not none %} --limit {{config.params.limit_samples}}
  {% endif %} {% if config.params.extra.judge.model_id is not none %} --judge-model-name
  {{config.params.extra.judge.model_id}} {% endif %}'
framework_name: safety_eval
pkg_name: safety_eval
config:
  output_dir: /workspace/results
  params:
    limit_samples: 10
    max_new_tokens: 6144
    max_retries: 5
    parallelism: 8
    task: aegis_v2
    temperature: 0.6
    request_timeout: 30
    top_p: 0.95
    extra:
      judge:
        url: http://localhost:8001/v1
        model_id: null
        api_key: null
        parallelism: 32
        request_timeout: 60
        max_retries: 16
  supported_endpoint_types:
  - chat
  - completions
  type: aegis_v2
target:
  api_endpoint:
    api_key_name: MUT_API_KEY
    model_id: meta/llama-4-maverick-17b-128e-instruct
    stream: false
    type: chat
    url: http://localhost:8000/v1


Rendered command:

export API_KEY=$MUT_API_KEY  &&   safety-eval  --model-name  meta/llama-4-maverick-17b-128e-instruct --model-url http://localhost:8000/v1 --model-type chat  --judge-url  http://localhost:8001/v1   --results-dir /workspace/results   --eval aegis_v2  --mut-inference-params max_tokens=6144,temperature=0.6,top_p=0.95,timeout=30,concurrency=8,retries=5 --judge-inference-params concurrency=32,retries=5   --limit 10

Tasks Configuration

Each evaluation requires its own judge model to be deployed by the user, e.g. on http://localhost:8001.

Aegis_v2

Judge Model: llama-3.1-nemoguard-8b-content-safety

  • Deployment: Follow NIM deployment guide
  • Endpoint Type: Completion endpoint
  • Tokenizer: meta-llama/Llama-3.1-8B-Instruct (ensure Hugging Face access)
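
Since the tokenizer above is a gated Hugging Face repository, it can help to confirm authentication before launching aegis_v2. A minimal guarded check, assuming `huggingface-cli` (shipped with the `huggingface_hub` package) is on the path:

```shell
# The judge's tokenizer is gated, so the evaluation will fail to download
# it unless you are authenticated. This check is a no-op when
# huggingface-cli is not installed.
HF_CLI="$(command -v huggingface-cli || true)"
if [ -n "${HF_CLI}" ]; then
    "${HF_CLI}" whoami || echo "Not logged in; run 'huggingface-cli login' first" >&2
else
    echo "huggingface-cli not found; install huggingface_hub to get it" >&2
fi
```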

Please note that when evaluating a reasoning model, you should use aegis_v2_reasoning instead of aegis_v2; it evaluates responses both with and without their thinking traces. Do not enable a reasoning adapter for that evaluation.

Wildguard

Judge Model: allenai/wildguard

  • Deployment: Use vLLM version v0.8.5
  • Endpoint Type: Completion endpoint
  • Tokenizer: mistralai/Mistral-7B-v0.3 (base model tokenizer)
  • Command:
    docker run -it --gpus all -p 8001:8000 vllm/vllm-openai:v0.8.5 --model allenai/wildguard
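
Once the container is running, a quick smoke test can confirm the judge is reachable. Note the port mapping in the command above: the container's port 8000 is published on host port 8001, so the endpoint is http://localhost:8001/v1. This sketch assumes vLLM's OpenAI-compatible server and is guarded so it is a no-op while the model is still loading:

```shell
# Reachability check for the wildguard judge started above; assumes the
# server exposes the OpenAI-style /v1/models and /v1/completions routes.
WILDGUARD_URL="http://localhost:8001/v1"
if curl -sf "${WILDGUARD_URL}/models" > /dev/null; then
    # Server is up: send a tiny completion request to the judge.
    curl -s "${WILDGUARD_URL}/completions" \
        -H "Content-Type: application/json" \
        -d '{"model": "allenai/wildguard", "prompt": "Hello", "max_tokens": 8}'
else
    echo "wildguard judge not reachable yet at ${WILDGUARD_URL}" >&2
fi
```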
    

Download files

Built distribution: nvidia_safety_harness-26.1-py3-none-any.whl (31.1 kB, Python 3 wheel). No source distribution files are available for this release.

Hashes for nvidia_safety_harness-26.1-py3-none-any.whl:

  • SHA256: ba5e9fd50c30f369179583903a0e8687deff5352dca35d019eb3db1d80516726
  • MD5: 6bee7dd937d22c27f474ca24f45d4ea3
  • BLAKE2b-256: c27c56f67b57e6fe2c1eece4c8f9ce79ce1a0ba1f9dcd1efe270e98e97c344f8
