Patronus LLM Evaluation library

Patronus is a Python library from Patronus AI that provides a framework and utility functions for evaluating Large Language Models (LLMs). It simplifies running and scoring evaluations across different LLMs, so developers can benchmark model performance on a range of tasks.

Note: This library is currently in beta and is not stable. The APIs may change in future releases.

Note: This library requires Python 3.11 or greater.
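The version requirement above can be checked before installing. The following is a minimal sketch using only the standard library; the helper name `python_is_supported` is hypothetical and is not part of Patronus.

```python
import sys

# Minimum interpreter version stated in the note above.
MIN_PYTHON = (3, 11)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True when (major, minor) of version_info is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print(f"Python {current} meets the 3.11+ requirement")
    else:
        print(f"Python {current} is too old; 3.11 or greater is required")
```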

Features

  • Modular Evaluation Framework: Easily plug in different models and evaluation/scoring mechanisms.
  • Seamless Integration with Patronus AI Platform: Effortlessly connect with the Patronus AI platform to run evaluations and export results.
  • Custom Evaluators: Use built-in evaluators, create your own based on various scoring methods, or leverage our state-of-the-art remote evaluators.
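To illustrate what "various scoring methods" could look like, here is a sketch of two interchangeable scoring functions written in plain Python. They borrow the parameter names from the Quickstart, but the functions themselves (`normalized_match`, `token_f1`) are hypothetical examples, not built-in Patronus evaluators. Whether the library's `evaluator` decorator accepts a float return value is an assumption that should be checked against the official documentation.

```python
from collections import Counter


def normalized_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    """Pass/fail score: equal after trimming whitespace and lowercasing."""
    return (
        evaluated_model_output.strip().lower()
        == evaluated_model_gold_answer.strip().lower()
    )


def token_f1(evaluated_model_output: str, evaluated_model_gold_answer: str) -> float:
    """Graded score: F1 over whitespace-separated, lowercased tokens."""
    out_tokens = evaluated_model_output.lower().split()
    gold_tokens = evaluated_model_gold_answer.lower().split()
    if not out_tokens or not gold_tokens:
        # Two empty strings count as a perfect match; one empty string as a miss.
        return float(out_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both strings.
    overlap = sum((Counter(out_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(out_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Because both functions share the same two-argument signature, either one can be swapped in wherever an output is compared with a gold answer.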

Documentation

For detailed documentation, including API references and advanced usage, please visit our documentation.

Installation

pip install patronus

Quickstart

import os
from patronus import Client, task, evaluator

client = Client(
    # This is the default and can be omitted
    api_key=os.environ.get("PATRONUSAI_API_KEY"),
)

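# Task: produces the output to evaluate from the input of one dataset row.
@task
def hello_world_task(evaluated_model_input: str) -> str:
    return f"{evaluated_model_input} World"

# Evaluator: compares the task output with the gold answer of the same row.
@evaluator
def exact_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    return evaluated_model_output == evaluated_model_gold_answer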

client.experiment(
    "Tutorial Project",
    data=[
        {
            "evaluated_model_input": "Hello",
            "evaluated_model_gold_answer": "Hello World",
        },
    ],
    task=hello_world_task,
    evaluators=[exact_match],
)
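To make the data flow in the Quickstart easier to follow, the sketch below mimics it in plain Python with no Patronus imports and no API key. It is not the library's implementation: `run_local_experiment` is a hypothetical helper, and it assumes each row's input is passed to the task and that the task output and gold answer are passed to every evaluator.

```python
def hello_world_task(evaluated_model_input: str) -> str:
    return f"{evaluated_model_input} World"


def exact_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    return evaluated_model_output == evaluated_model_gold_answer


def run_local_experiment(data, task, evaluators):
    """Hypothetical stand-in for an experiment run: task first, then evaluators."""
    results = []
    for row in data:
        # Step 1: the task turns the row's input into an output.
        output = task(row["evaluated_model_input"])
        # Step 2: each evaluator scores that output against the gold answer.
        scores = {
            ev.__name__: ev(output, row["evaluated_model_gold_answer"])
            for ev in evaluators
        }
        results.append({"output": output, "scores": scores})
    return results


results = run_local_experiment(
    data=[
        {
            "evaluated_model_input": "Hello",
            "evaluated_model_gold_answer": "Hello World",
        },
    ],
    task=hello_world_task,
    evaluators=[exact_match],
)
print(results)
# prints [{'output': 'Hello World', 'scores': {'exact_match': True}}]
```

The real `client.experiment` call additionally takes a project name and talks to the Patronus AI platform, which this local sketch leaves out.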

