Content safety evaluation tool - packaged by NVIDIA

Project description

NVIDIA NeMo Evaluator

The goal of NVIDIA NeMo Evaluator is to advance and refine state-of-the-art methodologies for model evaluation, and deliver them as modular evaluation packages (evaluation containers and pip wheels) that teams can use as standardized building blocks.

Quick Start Guide

NVIDIA NeMo Evaluator provides evaluation clients built specifically to evaluate model endpoints using our Standard API.

Prerequisites

Important: Both the model under test and the judge model must be deployed locally by the user before running evaluations.

Launching an Evaluation for an LLM

  1. Install the package:

    pip install nvidia-safety-harness
    
  2. Deploy your models locally:

    • Deploy the model you want to evaluate (model under test), e.g. on http://localhost:8000
    • Deploy the appropriate judge model for your evaluation type, e.g. on http://localhost:8001 (see Judge Configuration)
    • Both models should be accessible via HTTP API endpoints
  3. Authenticate with Hugging Face: You must authenticate to the Hugging Face Hub because some datasets or models may need to be downloaded during evaluation.

    huggingface-cli login
    
  4. List the available evaluations:

    $ nemo-evaluator ls
    safety_eval: 
      * aegis_v2
      * aegis_v2_reasoning
      * wildguard
    
  5. (Optional) Set API keys for the model under test endpoint and the judge model endpoint, if they are protected:

    export MUT_API_KEY="your_api_key_here"
    export JUDGE_API_KEY="your_api_key_here"
    
  6. Run the evaluation:

     nemo-evaluator run_eval \
     --model_id "meta/llama-4-maverick-17b-128e-instruct" \
     --model_url http://localhost:8000/v1 \
     --model_type chat \
     --api_key_name MUT_API_KEY \
     --output_dir /workspace/results \
     --eval_type aegis_v2 \
     --overrides="config.params.extra.judge.url=http://localhost:8001/v1"
    
  7. Gather the results:

    cat /workspace/results/results.yml
    
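The exact schema of results.yml is evaluation-specific and not documented here. As a toy sketch only, top-level `key: value` pairs of such a file can be read with a few lines of standard-library Python; the keys shown are hypothetical, and a real pipeline would use a proper YAML parser:

```python
def read_flat_yaml(text: str) -> dict:
    """Parse only top-level `key: value` lines; a toy stand-in for a YAML parser."""
    result = {}
    for line in text.splitlines():
        # Skip indented (nested) lines and comments; keep simple top-level pairs.
        if ":" in line and not line.startswith((" ", "#")):
            key, _, value = line.partition(":")
            result[key.strip()] = value.strip()
    return result

# Hypothetical contents; the real results.yml schema may differ.
sample = "task: aegis_v2\nstatus: completed\n"
print(read_flat_yaml(sample))
```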

CLI Specification

  1. Required flags:

    • --eval_type <string>: The type of evaluation to perform
    • --model_id <string>: The name or identifier of the model under test to evaluate
    • --model_url <url>: The API endpoint where the model under test is accessible
    • --model_type <string>: The type of the model under test, currently one of "chat", "completions", or "vlm"
    • --output_dir <directory>: The directory to use as the working directory for the evaluation. The results, including the results.yml output file, are saved here. Make sure to use an absolute path
  2. Required overrides:

    • config.params.extra.judge.url: URL for the judge model endpoint
  3. Optional flags:

    • --api_key_name <string>: The name of the environment variable that stores the bearer token for the model under test API, if authentication is required (specify as "MUT_API_KEY" if needed)
    • --run_config <path>: Specifies the path to a YAML file containing the evaluation definition
    • --dry_run: Allows you to print the final configuration and command without executing the evaluation

Configuring Evaluations via YAML

Evaluations in NVIDIA NeMo Evaluator are configured using YAML files that define the parameters and settings required for the evaluation process. These configuration files follow a standard API which ensures consistency across evaluations.

Example of a YAML configuration:

config:
  type: aegis_v2
  params:
    limit_samples: 10
    extra:
      judge:
        url: http://localhost:8001/v1
target:
  api_endpoint:
    model_id: meta/llama-4-maverick-17b-128e-instruct
    type: chat
    url: http://localhost:8000/v1

The priority of overrides, from highest to lowest, is as follows:

  1. Command-line arguments
  2. User configuration (as seen above)
  3. Task defaults (defined per task type)
  4. Framework defaults
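The precedence above can be pictured as a layered dictionary merge in which later (higher-priority) layers win. The sketch below is a hypothetical illustration with made-up values, not the harness's actual merge code:

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base; override's leaf values win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Layers from lowest to highest priority (illustrative values only).
framework_defaults = {"params": {"temperature": 0.6, "top_p": 0.95}}
task_defaults      = {"params": {"temperature": 0.7}}
user_config        = {"params": {"limit_samples": 10}}
cli_overrides      = {"params": {"extra": {"judge": {"url": "http://localhost:8001/v1"}}}}

config = framework_defaults
for layer in (task_defaults, user_config, cli_overrides):
    config = deep_merge(config, layer)

print(config["params"]["temperature"])  # the task default beats the framework default
```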

Example:

nemo-evaluator run_eval \
    --run_config config.yaml \
    --api_key_name MUT_API_KEY \
    --output_dir /workspace/results

Adding --dry_run prints the final configuration and command to the console without executing the evaluation:

nemo-evaluator run_eval \
    --run_config config.yaml \
    --api_key_name MUT_API_KEY \
    --output_dir /workspace/results \
    --dry_run

Output:

Rendered config:

command: '{% if target.api_endpoint.api_key is not none %}export API_KEY=${{target.api_endpoint.api_key}}  &&
  {% endif %} {% if config.params.extra.judge.api_key is not none %}export JUDGE_API_KEY=${{config.params.extra.judge.api_key}}
  && {% endif %} safety-eval  --model-name  {{target.api_endpoint.model_id}} --model-url
  {{target.api_endpoint.url}} --model-type {{target.api_endpoint.type}}  --judge-url  {{config.params.extra.judge.url}}   --results-dir
  {{config.output_dir}}   --eval {{config.params.task}}  --mut-inference-params max_tokens={{config.params.max_new_tokens}},temperature={{config.params.temperature}},top_p={{config.params.top_p}},timeout={{config.params.request_timeout}},concurrency={{config.params.parallelism}},retries={{config.params.max_retries}}
  --judge-inference-params concurrency={{config.params.extra.judge.parallelism}},retries={{config.params.max_retries}}  {%
  if config.params.limit_samples is not none %} --limit {{config.params.limit_samples}}
  {% endif %} {% if config.params.extra.judge.model_id is not none %} --judge-model-name
  {{config.params.extra.judge.model_id}} {% endif %}'
framework_name: safety_eval
pkg_name: safety_eval
config:
  output_dir: /workspace/results
  params:
    limit_samples: 10
    max_new_tokens: 6144
    max_retries: 5
    parallelism: 8
    task: aegis_v2
    temperature: 0.6
    request_timeout: 30
    top_p: 0.95
    extra:
      judge:
        url: http://localhost:8001/v1
        model_id: null
        api_key: null
        parallelism: 32
        request_timeout: 60
        max_retries: 16
  supported_endpoint_types:
  - chat
  - completions
  type: aegis_v2
target:
  api_endpoint:
    api_key: MUT_API_KEY
    model_id: meta/llama-4-maverick-17b-128e-instruct
    stream: false
    type: chat
    url: http://localhost:8000/v1


Rendered command:

export API_KEY=$MUT_API_KEY  &&   safety-eval  --model-name  meta/llama-4-maverick-17b-128e-instruct --model-url http://localhost:8000/v1 --model-type chat  --judge-url  http://localhost:8001/v1   --results-dir /workspace/results   --eval aegis_v2  --mut-inference-params max_tokens=6144,temperature=0.6,top_p=0.95,timeout=30,concurrency=8,retries=5 --judge-inference-params concurrency=32,retries=5   --limit 10

Tasks Configuration

Each evaluation requires its own judge model to be deployed by the user, e.g. on http://localhost:8001.

Aegis_v2

Judge Model: llama-3.1-nemoguard-8b-content-safety

  • Deployment: Follow the NIM deployment guide
  • Endpoint Type: Completion endpoint
  • Tokenizer: meta-llama/Llama-3.1-8B-Instruct (ensure Hugging Face access)

Note that if you evaluate a reasoning model, use aegis_v2_reasoning instead of aegis_v2; it evaluates responses both with and without thinking traces. Do not enable a reasoning adapter for that evaluation.

Wildguard

Judge Model: allenai/wildguard

  • Deployment: Use vLLM version v0.8.5
  • Endpoint Type: Completion endpoint
  • Tokenizer: mistralai/Mistral-7B-v0.3 (base model tokenizer)
  • Command:
    docker run -it --gpus all -p 8001:8000 vllm/vllm-openai:v0.8.5 --model allenai/wildguard
    
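Once the container is up, the judge endpoint can be sanity-checked before launching an evaluation. vLLM's OpenAI-compatible server exposes GET /v1/models, so a request like the following (a hypothetical helper; the port comes from the docker command above) should list allenai/wildguard:

```python
import json
import urllib.request

def models_url(base_url: str) -> str:
    """Join the base URL (e.g. http://localhost:8001/v1) with the /models route."""
    return base_url.rstrip("/") + "/models"

def list_model_ids(base_url: str, timeout: float = 10.0) -> list[str]:
    """Return model ids served by an OpenAI-compatible endpoint."""
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        payload = json.load(resp)
    return [m["id"] for m in payload.get("data", [])]

# Example (requires the judge container to be running):
#   list_model_ids("http://localhost:8001/v1")
print(models_url("http://localhost:8001/v1"))
```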

Project details


Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distribution

nvidia_safety_harness-25.11-py3-none-any.whl (30.6 kB, Python 3 wheel)

File details

Details for the file nvidia_safety_harness-25.11-py3-none-any.whl.

File hashes

Algorithm   Hash digest
SHA256      6bf179c2995b7c0f2ea0e6984f3a9d82a791e6ddd2f252c0bd745fb3480d973d
MD5         dcbf79a4bd88d7b3d29e7bb922bcd4c8
BLAKE2b-256 2ebfe4a7b9178ca5680ff227769e29d62389ed8075abf9ca95a41e87b42fb8bd
