
Python SDK for the Verdikt Evaluation API


verdikt-sdk

Python SDK for Verdikt — a standalone AI evaluation service that decouples evaluation and LLM/human judging from the application being evaluated.

Installation

pip install verdikt-sdk

Usage

import asyncio

from verdikt_sdk import AnswerWithCost, EvaluationType, Question, VerdiktClient
from yalc import LLMModel

# Your callback returns the answer plus the cost it took your app to produce it.
# `cost` is optional: pass None when you do not track it.
# `my_app` stands for your own application's async entry point.
async def my_llm_function(question: str) -> AnswerWithCost:
    answer, cost = await my_app(question)
    return AnswerWithCost(answer=answer, cost=cost)

async def main() -> None:
    client = VerdiktClient(
        base_url="https://your-verdikt-instance.com",
        client_id="your-client-id",
        client_secret="your-client-secret",
    )

    # Register your app (idempotent, safe to call on every deploy)
    await client.create_app(slug="my-app", name="My App")

    # Sync questions to the dataset (idempotent)
    await client.add_questions("my-app", [
        Question(question="What is the capital of France?", human_answer="Paris"),
    ])

    # Run an evaluation cycle
    await client.run_evaluation(
        app_slug="my-app",
        app_version="v1.2.0",
        callback=my_llm_function,
        evaluation_type=EvaluationType.LLM_ONLY,
        llm_judge_models=[LLMModel.gpt_4o_mini],
    )

# The client methods are coroutines, so they must run inside an event loop.
asyncio.run(main())

run_evaluation calls your callback concurrently for every question in the dataset, then submits all answers to Verdikt for judgment.
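
To illustrate what "concurrently for every question" means in practice, here is a minimal sketch of the fan-out pattern using asyncio.gather. It is not the SDK's actual implementation: collect_answers and fake_app are hypothetical names, and the AnswerWithCost dataclass is a local stand-in (with an assumed float cost) so the snippet runs without the SDK installed.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class AnswerWithCost:
    """Local stand-in for verdikt_sdk.AnswerWithCost (assumed shape)."""
    answer: str
    cost: Optional[float] = None


async def collect_answers(
    questions: list[str],
    callback: Callable[[str], Awaitable[AnswerWithCost]],
) -> list[AnswerWithCost]:
    # Hypothetical helper: start one callback per question and wait for all.
    # asyncio.gather returns results in the same order as its inputs.
    return await asyncio.gather(*(callback(q) for q in questions))


async def fake_app(question: str) -> AnswerWithCost:
    # Simulates an application call that takes a little time.
    await asyncio.sleep(0.01)
    return AnswerWithCost(answer=question.upper(), cost=0.001)


answers = asyncio.run(collect_answers(["a", "b", "c"], fake_app))
```

Because all callbacks may be in flight at once, a callback that hits a rate-limited backend may need its own throttling, for example an asyncio.Semaphore around the call.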

Breaking change in 0.2.0: the callback now returns AnswerWithCost(answer=..., cost=...) instead of a bare str. Callers on 0.1.x must wrap their return value (return AnswerWithCost(answer=ans) is a drop-in equivalent of the old behaviour).
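
One possible way to migrate an existing 0.1.x callback without editing it is a small adapter. The sketch below is an illustration only: wrap_legacy_callback and old_callback are hypothetical names, and the AnswerWithCost dataclass is a local stand-in so the snippet runs on its own. In real code you would import AnswerWithCost from verdikt_sdk instead.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class AnswerWithCost:
    """Local stand-in; in real code use verdikt_sdk.AnswerWithCost."""
    answer: str
    cost: Optional[float] = None


def wrap_legacy_callback(
    legacy: Callable[[str], Awaitable[str]],
) -> Callable[[str], Awaitable[AnswerWithCost]]:
    # Hypothetical adapter: turns a 0.1.x-style callback (returns str)
    # into a 0.2.0-style callback (returns AnswerWithCost, cost untracked).
    async def wrapped(question: str) -> AnswerWithCost:
        return AnswerWithCost(answer=await legacy(question))

    return wrapped


async def old_callback(question: str) -> str:
    # A 0.1.x-style callback returning a bare string.
    return "Paris"


new_callback = wrap_legacy_callback(old_callback)
result = asyncio.run(new_callback("What is the capital of France?"))
```

The wrapped callback leaves cost as None, which matches the note above that cost is optional.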

Authentication

The SDK authenticates via the Zitadel OAuth2 client credentials flow. Create a machine user in your Zitadel project and pass its client_id and client_secret to VerdiktClient.
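
The SDK performs the token exchange for you. For readers unfamiliar with the client credentials grant, the sketch below builds (but does not send) the kind of token request that grant involves, following RFC 6749 section 4.4. The token URL, the openid scope, and the build_token_request helper are assumptions for illustration; check your Zitadel instance's discovery document for the real endpoint and required scopes.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    token_url: str, client_id: str, client_secret: str, scope: str
) -> urllib.request.Request:
    # Hypothetical helper: the client authenticates with HTTP Basic auth
    # and asks for a token using grant_type=client_credentials.
    credentials = f"{client_id}:{client_secret}".encode()
    basic = base64.b64encode(credentials).decode()
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode()
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Assumed endpoint and scope; the request is only constructed, never sent.
request = build_token_request(
    "https://your-zitadel-instance.example/oauth/v2/token",
    "id",
    "secret",
    "openid",
)
```

Sending the request (for example with urllib.request.urlopen) would return a JSON body containing an access token, which is then presented as a bearer token on API calls.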
