
Visualize OpenAI evals with Zeno


Zeno 🤝 OpenAI Evals

Use Zeno to visualize the results of OpenAI Evals.

Usage

pip install zeno-evals

Run an evaluation by following the OpenAI Evals instructions. The run produces a cache file in /tmp/evallogs/.

Pass this file to the zeno-evals command:

zeno-evals /tmp/evallogs/my_eval_cache.jsonl
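
When several runs have written logs, it can help to pick the newest one automatically. The following is a minimal sketch and not part of zeno-evals: the helper names `latest_log` and `build_command` are hypothetical, and it assumes the logs are `.jsonl` files sitting directly in one directory such as /tmp/evallogs/.

```python
from pathlib import Path
from typing import List, Optional


def latest_log(log_dir: str = "/tmp/evallogs") -> Optional[Path]:
    """Return the most recently modified .jsonl file in log_dir, or None."""
    logs = list(Path(log_dir).glob("*.jsonl"))
    if not logs:
        return None
    # Pick the file with the largest modification time.
    return max(logs, key=lambda p: p.stat().st_mtime)


def build_command(log_path: Path) -> List[str]:
    """Build the zeno-evals invocation for one log file (path argument only)."""
    return ["zeno-evals", str(log_path)]
```

With zeno-evals installed and on PATH, the resulting list could be passed to `subprocess.run(build_command(log), check=True)` once `latest_log()` has returned a path.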

Example

We include an example that looks at the MedMCQA dataset (thanks to @SinanAkkoyun):

zeno-evals ./example_medicine/example.jsonl --functions_file=./example_medicine/distill.py
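
Before loading a log into Zeno, it can be useful to check what the file contains. The following is a minimal sketch, assuming the log is a JSON Lines file (one JSON object per line) and that records may carry a `type` field. Both the field name and the helper `summarize_log` are assumptions for illustration, not part of zeno-evals or a documented log schema.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Tuple


def summarize_log(path: str) -> Tuple[int, Counter]:
    """Count records in a JSON Lines file, grouped by an assumed "type" field."""
    total = 0
    by_type: Counter = Counter()
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            total += 1
            if isinstance(record, dict):
                # "type" is an assumed field name; records without it share one bucket.
                key = record.get("type", "<no type>")
            else:
                key = "<not an object>"
            by_type[key] += 1
    return total, by_type
```

Calling `summarize_log("./example_medicine/example.jsonl")` returns the record count and a per-type tally, which gives a quick sanity check that the file is non-empty and parses cleanly.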

Todo

  • Support model-graded evaluations
  • Support custom evaluation templates (e.g. BLEU for translation)
