ZeroEval SDK

The Python SDK and CLI for ZeroEval: monitoring, prompt management, judges, and optimization for AI products.

pip install zeroeval

Quick Start

1. Setup

zeroeval setup

This opens the ZeroEval dashboard, prompts for your project API key, and saves the key along with your project context. Later commands read the saved values, so they need no further configuration.

2. Trace your AI calls

import zeroeval as ze
import openai

ze.init()
client = openai.OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is ZeroEval?"}],
)

Calls made through OpenAI, Gemini, LangChain, and LangGraph are traced automatically once ze.init() has run. No extra code is needed.

3. Manage prompts

import zeroeval as ze

ze.init()

prompt = ze.prompt(
    name="support-triage",
    content="Classify this support ticket: {{ticket_text}}",
    variables={"ticket_text": "I can't log in to my account"},
)

Prompts are versioned, tagged, and linked to traces automatically.

4. Inspect from the CLI

zeroeval traces list --start-date 2025-01-01
zeroeval judges list
zeroeval prompts list

Installation

# Quote extras so shells such as zsh do not try to expand the brackets
pip install zeroeval                # Core SDK
pip install "zeroeval[openai]"      # OpenAI auto-instrumentation
pip install "zeroeval[gemini]"      # Google Gemini
pip install "zeroeval[langchain]"   # LangChain
pip install "zeroeval[langgraph]"   # LangGraph
pip install "zeroeval[all]"         # Everything

Integrations are detected and instrumented automatically.

Authentication

Interactive (recommended):

zeroeval setup

This saves your API key and resolves your project automatically.

Non-interactive (CI, agents):

zeroeval auth set --api-key sk_ze_...

In code:

import zeroeval as ze
ze.init(api_key="sk_ze_...")

Resolution order: explicit flag > environment variable (ZEROEVAL_API_KEY, ZEROEVAL_PROJECT_ID) > CLI config (~/.config/zeroeval/config.json).
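
The resolution order above can be sketched in plain Python. This is an illustration of the precedence rule only, not the SDK's own code: the function name resolve_api_key and the "api_key" field inside config.json are assumptions made for the example.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

# Path taken from the text above; the "api_key" field name is an assumption.
CONFIG_PATH = Path.home() / ".config" / "zeroeval" / "config.json"


def resolve_api_key(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    config_path: Path = CONFIG_PATH,
) -> Optional[str]:
    """Return the first key found: explicit value > environment > CLI config."""
    if explicit:
        return explicit
    env = os.environ if env is None else env
    if env.get("ZEROEVAL_API_KEY"):
        return env["ZEROEVAL_API_KEY"]
    try:
        config = json.loads(config_path.read_text())
    except (OSError, ValueError):
        return None  # no readable config file
    if not isinstance(config, dict):
        return None
    return config.get("api_key")  # hypothetical field name
```

Passing env and config_path as parameters keeps the function easy to test without touching the real environment or home directory.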

Tracing

The SDK traces AI calls with a Session > Trace > Span hierarchy.

import zeroeval as ze

ze.init()

# Decorator-based tracing
@ze.span(name="process-ticket")
def process_ticket(ticket_text: str):
    result = ticket_text.strip()  # placeholder: replace with your logic
    return result

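# Context manager (get_embeddings and text are placeholders for your own code)
with ze.span(name="embedding-step"):
    embeddings = get_embeddings(text)

# Manual spans: end the span in a finally block so it closes even if search() raises
span = ze.start_span(name="retrieval")
try:
    results = search(query)
finally:
    span.end()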

# Tags for filtering in the dashboard
ze.set_tag("trace", {"environment": "production", "model": "gpt-4o"})

See INTEGRATIONS.md for automatic OpenAI, Gemini, LangChain, and LangGraph tracing.
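
To make the Trace > Span part of the hierarchy concrete, here is a minimal standard-library sketch of how nested spans can inherit a trace id and record their parent. It is a teaching model under stated assumptions, not ZeroEval's implementation: the Span dataclass, the finished list, and the span() helper are all hypothetical, and session grouping is left out.

```python
import time
import uuid
from contextlib import contextmanager
from contextvars import ContextVar
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Span:
    name: str
    trace_id: str
    parent_id: Optional[str]
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    duration_s: Optional[float] = None


# Tracks the innermost open span for the current thread or task.
_current = ContextVar("current_span", default=None)
finished: List[Span] = []  # stands in for an exporter


@contextmanager
def span(name: str) -> Iterator[Span]:
    parent = _current.get()
    # A root span starts a new trace; children inherit the parent's trace id.
    trace_id = parent.trace_id if parent else uuid.uuid4().hex
    current = Span(name, trace_id, parent.span_id if parent else None)
    token = _current.set(current)
    start = time.perf_counter()
    try:
        yield current
    finally:
        # Runs even if the body raises, so the span is always closed.
        current.duration_s = time.perf_counter() - start
        _current.reset(token)
        finished.append(current)


with span("process-ticket") as root:
    with span("embedding-step") as child:
        pass

print(child.parent_id == root.span_id, child.trace_id == root.trace_id)
```

Inner spans finish first, so finished holds "embedding-step" before "process-ticket".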

Prompts

Prompts are versioned and tagged. Use ze.prompt() to fetch and render them:

prompt = ze.prompt(
    name="support-triage",
    content="Classify: {{ticket_text}}",
    variables={"ticket_text": ticket},  # ticket holds the raw ticket text
)

# Fetch a specific version or tag
prompt = ze.get_prompt("support-triage", version=3)
prompt = ze.get_prompt("support-triage", tag="production")

Prompts are automatically linked to traces and available for optimization.
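
The {{ticket_text}} placeholder syntax used above can be illustrated with a small renderer. This is a sketch of double-brace substitution in general, assuming placeholders are simple identifiers; the render function is hypothetical and the SDK may handle whitespace, escaping, or missing variables differently.

```python
import re
from typing import Mapping

# Matches {{name}} with optional spaces inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: Mapping[str, str]) -> str:
    """Replace each {{name}} with variables[name]; fail loudly if one is missing."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing prompt variable: {name}")
        return str(variables[name])

    # A function replacement avoids backslash-escape surprises in the values.
    return _PLACEHOLDER.sub(substitute, template)


print(render("Classify: {{ticket_text}}", {"ticket_text": "I can't log in"}))
```

Raising on a missing variable is a design choice for the sketch: a prompt with an unfilled placeholder is usually a bug worth surfacing early.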

Feedback

Submit feedback on prompt completions to build training data for optimization:

# Thumbs up/down
ze.send_feedback(
    prompt_slug="support-triage",
    completion_id="completion-uuid",
    thumbs_up=True,
    reason="Correct classification",
)

# Scored judge feedback
ze.send_feedback(
    prompt_slug="quality-scorer",
    completion_id="completion-uuid",
    thumbs_up=False,
    judge_id="judge-uuid",
    expected_score=3.5,
    score_direction="too_high",
    criteria_feedback={
        "accuracy": {"expected_score": 4.0, "reason": "Mostly correct"},
        "tone": {"expected_score": 1.0, "reason": "Too formal"},
    },
)

Feedback that includes a reason and an expected_output gives prompt optimization stronger training examples than a bare thumbs up or down.
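
One way to make sure negative feedback always carries that extra signal is to assemble the arguments through a small helper before calling ze.send_feedback. The helper below is hypothetical, and so is its rule that a thumbs-down must include a reason or an expected output: that is a project-level convention for the sketch, not an SDK requirement. The expected_output keyword is assumed from the sentence above and has not been checked against the SDK signature.

```python
from typing import Any, Dict, Optional


def build_feedback(
    prompt_slug: str,
    completion_id: str,
    thumbs_up: bool,
    reason: Optional[str] = None,
    expected_output: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a feedback call, omitting unset fields."""
    if not thumbs_up and not (reason or expected_output):
        # Local convention: a bare thumbs-down is a weak training example.
        raise ValueError("negative feedback needs a reason or an expected output")
    payload = {
        "prompt_slug": prompt_slug,
        "completion_id": completion_id,
        "thumbs_up": thumbs_up,
        "reason": reason,
        "expected_output": expected_output,  # assumed keyword name
    }
    return {key: value for key, value in payload.items() if value is not None}


kwargs = build_feedback(
    "support-triage",
    "completion-uuid",
    thumbs_up=False,
    reason="Wrong classification",
    expected_output="billing",
)
# With the SDK installed and initialized: ze.send_feedback(**kwargs)
```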

Datasets & Evals

import zeroeval as ze

ze.init()

# Pull a dataset
dataset = ze.Dataset.pull("my-dataset")

for row in dataset:
    print(row.question, row.answer)

# Run an evaluation (llm_call is a placeholder for your own model call)
@ze.task(outputs=["prediction"])
def solve(row):
    return {"prediction": llm_call(row["question"])}

@ze.evaluation(mode="row", outputs=["correct"])
def check(answer, prediction):
    return {"correct": int(answer == prediction)}

run = dataset.eval(solve, workers=8)
run = run.score([check], column_map={"answer": "answer", "prediction": "prediction"})
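
For readers who want to see what a row-mode run amounts to, the following standard-library sketch mimics the flow above: apply a task to each row in parallel, then call an evaluator per row with columns mapped to its parameters. The helpers run_task and score_rows are hypothetical stand-ins, the column_map direction (parameter name to column name) is an assumption, and none of this talks to ZeroEval.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Mapping

Row = Dict[str, Any]


def run_task(rows: List[Row], task: Callable[[Row], Row], workers: int = 8) -> List[Row]:
    """Apply task to every row in parallel and merge its outputs into the row."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(task, rows))  # map preserves input order
    return [{**row, **out} for row, out in zip(rows, outputs)]


def score_rows(
    rows: List[Row],
    evaluator: Callable[..., Row],
    column_map: Mapping[str, str],
) -> List[Row]:
    """Call evaluator once per row, passing columns under the mapped parameter names."""
    scored = []
    for row in rows:
        kwargs = {param: row[column] for param, column in column_map.items()}
        scored.append({**row, **evaluator(**kwargs)})
    return scored


def solve(row: Row) -> Row:
    # Toy task: add the integers in a question such as "2+2".
    total = sum(int(part) for part in row["question"].split("+"))
    return {"prediction": str(total)}


def check(answer: str, prediction: str) -> Row:
    return {"correct": int(answer == prediction)}


rows = [{"question": "2+2", "answer": "4"}, {"question": "3+3", "answer": "7"}]
results = score_rows(run_task(rows, solve), check, {"answer": "answer", "prediction": "prediction"})
accuracy = sum(r["correct"] for r in results) / len(results)
print(accuracy)  # 0.5: the second row's reference answer is deliberately wrong
```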

CLI

The zeroeval CLI covers monitoring, judges, prompts, optimization, datasets, and evals. It is designed for both interactive use and automation: pass --output json for machine-readable output.

Setup

zeroeval setup                              # Interactive setup
zeroeval auth set --api-key sk_ze_...       # Non-interactive
zeroeval auth show --redact                 # Show config

Global flags

Flag	Default	Description
`--output text|json`	`text`	Output format. `json` emits stable JSON to stdout.
`--project-id`	auto from setup	Override project context.
`--api-base-url`	`https://api.zeroeval.com`	Override API URL.
`--timeout`	`60.0`	Request timeout in seconds.
`--quiet`	off	Suppress non-essential logs.
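
For automation, the flags in the table can be combined from a script. The sketch below builds an argument list and parses the JSON that --output json writes to stdout. The helper names are hypothetical, and placing the global flags before the subcommand is an assumption; if your CLI version expects them after it, move them accordingly.

```python
import json
import subprocess
from typing import Any, List, Optional, Sequence


def build_command(
    args: Sequence[str],
    project_id: Optional[str] = None,
    timeout: Optional[float] = None,
) -> List[str]:
    """Build a zeroeval argument list that requests quiet JSON output."""
    # Assumption: global flags are accepted before the subcommand.
    cmd = ["zeroeval", "--output", "json", "--quiet"]
    if project_id is not None:
        cmd += ["--project-id", project_id]
    if timeout is not None:
        cmd += ["--timeout", str(timeout)]
    return cmd + list(args)


def run_json(args: Sequence[str], **flags: Any) -> Any:
    """Run the CLI and decode its stdout as JSON; raises if the command fails."""
    result = subprocess.run(
        build_command(args, **flags),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


# Example (requires the zeroeval CLI and a configured API key):
# judges = run_json(["judges", "list"], timeout=30.0)
print(build_command(["judges", "list"], timeout=30.0))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with values such as --where expressions.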

Monitoring

zeroeval sessions list --start-date 2025-01-01
zeroeval sessions get <session_id>
zeroeval traces list --start-date 2025-01-01
zeroeval traces get <trace_id>
zeroeval traces spans <trace_id>
zeroeval spans list --start-date 2025-01-01
zeroeval spans get <span_id>

Judges

zeroeval judges list
zeroeval judges get <judge_id>
zeroeval judges evaluations <judge_id> --limit 100
zeroeval judges criteria <judge_id>
zeroeval judges insights <judge_id>
zeroeval judges performance <judge_id>
zeroeval judges calibration <judge_id>
zeroeval judges versions <judge_id>

# Create a judge
zeroeval judges create \
  --name "answer-quality" \
  --prompt-file judge.txt \
  --evaluation-type binary \
  --sample-rate 1.0

# Submit feedback on a judge evaluation
zeroeval judges feedback create \
  --span-id <span_id> \
  --thumbs-up \
  --reason "Correct evaluation"

Prompts

zeroeval prompts list
zeroeval prompts get <slug> --version 3
zeroeval prompts versions <slug>
zeroeval prompts tags <slug>

# Submit feedback on a completion
zeroeval prompts feedback create \
  --prompt-slug support-triage \
  --completion-id <id> \
  --thumbs-down \
  --reason "Wrong classification"

Optimization

# Prompt optimization
zeroeval optimize prompt list <task_id>
zeroeval optimize prompt start <task_id> --optimizer-type quick_refine
zeroeval optimize prompt promote <task_id> <run_id> --yes

# Judge optimization
zeroeval optimize judge list <judge_id>
zeroeval optimize judge start <judge_id>
zeroeval optimize judge promote <judge_id> <run_id> --yes

Datasets & Evals

zeroeval datasets list
zeroeval datasets get <name>
zeroeval datasets versions <name>
zeroeval datasets rows <name> --version 3 --limit 200

zeroeval evals list --status completed
zeroeval evals get <eval_id>
zeroeval evals summary <eval_id>
zeroeval evals results <eval_id>
zeroeval evals scores <eval_id> <scorer_id>

Querying

List commands support --where, --select, and --order:

zeroeval judges list --where "name~quality" --select "id,name" --order "name:asc"
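
As a rough mental model, the three flags behave like filter, project, and sort over a list of records. The sketch below applies the same expressions to in-memory dictionaries. It rests on assumptions not confirmed by the text above: that ~ means "contains", that --order takes field:asc or field:desc, and that only a single --where clause is given. The apply_query function is hypothetical.

```python
from typing import Any, Dict, List, Optional

Record = Dict[str, Any]


def apply_query(
    rows: List[Record],
    where: Optional[str] = None,
    select: Optional[str] = None,
    order: Optional[str] = None,
) -> List[Record]:
    """Filter, sort, then project rows using CLI-style expressions."""
    if where:
        field, needle = where.split("~", 1)  # assumed: "~" is a substring match
        rows = [r for r in rows if needle in str(r.get(field, ""))]
    if order:
        field, _, direction = order.partition(":")
        rows = sorted(rows, key=lambda r: r[field], reverse=(direction == "desc"))
    if select:
        fields = select.split(",")
        rows = [{f: r.get(f) for f in fields} for r in rows]
    return rows


judges = [
    {"id": "j2", "name": "tone-quality", "type": "scored"},
    {"id": "j1", "name": "answer-quality", "type": "binary"},
    {"id": "j3", "name": "safety", "type": "binary"},
]
matched = apply_query(judges, where="name~quality", select="id,name", order="name:asc")
print(matched)
```

Sorting happens before projection so that the order field is still present even when it is not selected.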

Machine-readable spec

zeroeval spec cli --format json
zeroeval spec command "judges create"

Development

uv run --group dev pytest tests/cli/ -v   # CLI tests
uv run --group dev pytest                  # Full suite

Download files

Source Distribution

zeroeval-0.7.3.tar.gz (346.1 kB)

Uploaded Apr 9, 2026 Source

Built Distribution

zeroeval-0.7.3-py3-none-any.whl (187.4 kB)

Uploaded Apr 9, 2026 Python 3

File details

Details for the file zeroeval-0.7.3.tar.gz.

File metadata

Download URL: zeroeval-0.7.3.tar.gz
Upload date: Apr 9, 2026
Size: 346.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.12

File hashes

Hashes for zeroeval-0.7.3.tar.gz
Algorithm	Hash digest
SHA256	`91a8ed41b229399b1b657acfc33b5336d32d342687fd48be9e6b0a13b6318be8`
MD5	`99341aed92c34b5f40401813e08bb898`
BLAKE2b-256	`e8caffad573bd84df3e8cfb4c74c81f591ae6bcf770b693f2cda61f6504681b8`

File details

Details for the file zeroeval-0.7.3-py3-none-any.whl.

File metadata

Download URL: zeroeval-0.7.3-py3-none-any.whl
Upload date: Apr 9, 2026
Size: 187.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.12

File hashes

Hashes for zeroeval-0.7.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6052371a85e880c5231d0ce64967b776af19290f28323b30313e3d9dcbb2a2bb`
MD5	`774e867e56c16fc5dd8936e02607b89e`
BLAKE2b-256	`e1e603a76e591589434ed4cce86d2ee968b4f685b0f1d4acfd5dc4f7d8774c69`
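
A downloaded file can be checked against the SHA256 digests listed above with Python's hashlib. This is a general-purpose sketch; the function names and the local file path in the usage comment are placeholders for the example.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Union


def sha256_of(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Union[str, Path], expected_hex: str) -> bool:
    """Compare a file's digest with the published one, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Example, assuming the wheel was downloaded to the current directory:
# ok = matches(
#     "zeroeval-0.7.3-py3-none-any.whl",
#     "6052371a85e880c5231d0ce64967b776af19290f28323b30313e3d9dcbb2a2bb",
# )
```

pip can enforce the same check at install time through its hash-checking mode (--require-hashes with hashes pinned in a requirements file).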
