Patronus LLM Evaluation library

Patronus is a Python library developed by Patronus AI that provides a framework and utility functions for evaluating Large Language Models (LLMs). It handles running and scoring evaluations across different LLMs, so developers can benchmark model performance on their own tasks with little setup code.

Note: This library is currently in beta and is not stable. The APIs may change in future releases.

Note: This library requires Python 3.11 or greater.
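To check the version requirement above before installing, a short standalone script like the following can help. This is a minimal sketch that uses only the standard library; the helper name meets_minimum is hypothetical and is not part of Patronus.

```python
import sys

# Minimum interpreter version stated in the note above.
MINIMUM_PYTHON = (3, 11)


def meets_minimum(version_info, minimum=MINIMUM_PYTHON) -> bool:
    """Hypothetical helper: True if (major, minor) is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum(sys.version_info):
        print("Python version is new enough for patronus.")
    else:
        found = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Python 3.11 or greater is required, found {found}.")
```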

Features

  • Modular Evaluation Framework: Easily plug in different models and evaluation/scoring mechanisms.
  • Seamless Integration with Patronus AI Platform: Effortlessly connect with the Patronus AI platform to run evaluations and export results.
  • Custom Evaluators: Use built-in evaluators, create your own based on various scoring methods, or leverage our state-of-the-art remote evaluators.
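As an illustration of a custom scoring method, here is a sketch of an evaluator function that ignores case and extra whitespace. It is written as a plain Python function with the same parameter names as the exact_match evaluator in the Quickstart; the name normalized_match is hypothetical, and the assumption is that it could be wrapped with the evaluator decorator in the same way as the Quickstart example.

```python
def normalized_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    """Hypothetical scoring function: compare texts after normalization."""

    def normalize(text: str) -> str:
        # Case-fold, then collapse runs of whitespace into single spaces.
        return " ".join(text.casefold().split())

    return normalize(evaluated_model_output) == normalize(evaluated_model_gold_answer)


if __name__ == "__main__":
    print(normalized_match("  hello   WORLD ", "Hello World"))
    print(normalized_match("Hello", "Hello World"))
```

A looser match like this can be useful when model output differs from the gold answer only in formatting, where a strict equality check would report a failure.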

Documentation

For detailed documentation, including API references and advanced usage, please visit our documentation.

Installation

pip install patronus

Quickstart

import os
from patronus import Client, task, evaluator

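client = Client(
    # Reads the key from the PATRONUSAI_API_KEY environment variable.
    # This is the default and can be omitted.
    api_key=os.environ.get("PATRONUSAI_API_KEY"),
)

# Task: produces the output to be evaluated from a dataset row's input.
@task
def hello_world_task(evaluated_model_input: str) -> str:
    return f"{evaluated_model_input} World"

# Evaluator: scores the task output against the row's gold answer.
@evaluator
def exact_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    return evaluated_model_output == evaluated_model_gold_answer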

client.experiment(
    "Tutorial Project",
    data=[
        {
            "evaluated_model_input": "Hello",
            "evaluated_model_gold_answer": "Hello World",
        },
    ],
    task=hello_world_task,
    evaluators=[exact_match],
)
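To make the data flow in the Quickstart concrete, the following sketch mimics it in plain Python without importing patronus or contacting the Patronus AI platform. It is not the library's implementation: run_experiment_sketch is a hypothetical helper, and the loop only reflects an assumption about how each row's input feeds the task and how the task output and gold answer feed each evaluator.

```python
# Plain-function stand-ins for the decorated functions in the Quickstart.
def hello_world_task(evaluated_model_input: str) -> str:
    return f"{evaluated_model_input} World"


def exact_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    return evaluated_model_output == evaluated_model_gold_answer


def run_experiment_sketch(data, task, evaluators):
    """Hypothetical helper: run the task on each row, then score its output."""
    results = []
    for row in data:
        # The task turns the row's input into the output to be evaluated.
        output = task(row["evaluated_model_input"])
        # Each evaluator compares that output with the row's gold answer.
        scores = {
            ev.__name__: ev(output, row["evaluated_model_gold_answer"])
            for ev in evaluators
        }
        results.append({"output": output, "scores": scores})
    return results


results = run_experiment_sketch(
    data=[
        {
            "evaluated_model_input": "Hello",
            "evaluated_model_gold_answer": "Hello World",
        },
    ],
    task=hello_world_task,
    evaluators=[exact_match],
)
print(results)
```

Keeping the task and the evaluators as separate functions is what lets either side be swapped independently, which is the modularity described under Features.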
