Patronus LLM Evaluation library

Patronus is a Python library developed by Patronus AI that provides a framework and utility functions for evaluating Large Language Models (LLMs). It simplifies running and scoring evaluations across different LLMs, so developers can benchmark model performance on a range of tasks.

Note: This library is currently in beta and is not stable. The APIs may change in future releases.

Note: This library requires Python 3.11 or greater.
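The version requirement above can be checked before installing. The following is a minimal sketch using only the standard library; the helper name `python_is_supported` is hypothetical and is not part of Patronus.

```python
import sys

# Minimum version taken from the note above.
MIN_PYTHON = (3, 11)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    status = "supported" if python_is_supported() else "too old for patronus"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```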

Features

  • Modular Evaluation Framework: Easily plug in different models and evaluation/scoring mechanisms.
  • Seamless Integration with Patronus AI Platform: Effortlessly connect with the Patronus AI platform to run evaluations and export results.
  • Custom Evaluators: Use built-in evaluators, create your own based on various scoring methods, or leverage our state-of-the-art remote evaluators.

Documentation

For detailed documentation, including API references and advanced usage, please visit our documentation.

Installation

pip install patronus

Quickstart

import os
from patronus import Client, task, evaluator

client = Client(
    # This is the default and can be omitted
    api_key=os.environ.get("PATRONUSAI_API_KEY"),
)

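# A task receives the input from one dataset row and returns the output to evaluate.
@task
def hello_world_task(evaluated_model_input: str) -> str:
    return f"{evaluated_model_input} World"

# An evaluator scores a task output; this one passes only on an exact match
# with the gold answer.
@evaluator
def exact_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    return evaluated_model_output == evaluated_model_gold_answer

# Run the task on the dataset and score its output with the listed evaluators.
client.experiment(
    "Tutorial Project",
    data=[
        {
            "evaluated_model_input": "Hello",
            "evaluated_model_gold_answer": "Hello World",
        },
    ],
    task=hello_world_task,
    evaluators=[exact_match],
)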
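To make the data flow in the quickstart easier to follow, here is a conceptual sketch in plain Python that does not import Patronus. It is not how `client.experiment` is implemented and it sends nothing to the Patronus AI platform. The function `run_local_experiment` and the shape of its result rows are assumptions made for illustration only.

```python
from typing import Any, Callable

Row = dict[str, Any]


def hello_world_task(evaluated_model_input: str) -> str:
    return f"{evaluated_model_input} World"


def exact_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    return evaluated_model_output == evaluated_model_gold_answer


def run_local_experiment(
    data: list[Row],
    task: Callable[[str], str],
    evaluators: list[Callable[[str, str], bool]],
) -> list[Row]:
    """Hypothetical stand-in: run the task on each row, then score the output."""
    results = []
    for row in data:
        output = task(row["evaluated_model_input"])
        # Each evaluator compares the task output with the row's gold answer.
        scores = {
            ev.__name__: ev(output, row["evaluated_model_gold_answer"])
            for ev in evaluators
        }
        results.append({**row, "evaluated_model_output": output, "scores": scores})
    return results


results = run_local_experiment(
    data=[
        {
            "evaluated_model_input": "Hello",
            "evaluated_model_gold_answer": "Hello World",
        },
    ],
    task=hello_world_task,
    evaluators=[exact_match],
)
print(results[0]["scores"])  # {'exact_match': True}
```

Each dataset row supplies the task input and the gold answer, the task produces the output, and every evaluator turns the output and gold answer into a score.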


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

patronus-0.0.9.tar.gz (22.7 kB)

Uploaded Source

Built Distribution

patronus-0.0.9-py3-none-any.whl (27.0 kB)

Uploaded Python 3

File details

Details for the file patronus-0.0.9.tar.gz.

File metadata

  • Download URL: patronus-0.0.9.tar.gz
  • Size: 22.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.5 Darwin/23.4.0

File hashes

Hashes for patronus-0.0.9.tar.gz

Algorithm    Hash digest
SHA256       3305fd3bf1080ce2781df61df398415297afecfe5c8210f9da53a982753a54b2
MD5          2c5102af0661594c68a9e7c01d5578c2
BLAKE2b-256  dd4195c332067399da946b7456333c8b35b3c87848030414e68ed1e612b53ccd
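A downloaded file can be compared against the published SHA256 digest with the standard library. The sketch below assumes the source distribution has been saved in the current directory under its original file name. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for patronus-0.0.9.tar.gz.
EXPECTED_SDIST_SHA256 = "3305fd3bf1080ce2781df61df398415297afecfe5c8210f9da53a982753a54b2"


def sha256_of_file(path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare the file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    archive = Path("patronus-0.0.9.tar.gz")  # assumed local download location
    if archive.exists():
        ok = matches_published_hash(archive, EXPECTED_SDIST_SHA256)
        print("hash matches" if ok else "hash mismatch")
    else:
        print(f"{archive} not found; download it first")
```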


File details

Details for the file patronus-0.0.9-py3-none-any.whl.

File metadata

  • Download URL: patronus-0.0.9-py3-none-any.whl
  • Size: 27.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.5 Darwin/23.4.0

File hashes

Hashes for patronus-0.0.9-py3-none-any.whl

Algorithm    Hash digest
SHA256       968d1e8bc15b2326b7c4d3916ea7fba187e81ba3a994c4bde360364a503a9469
MD5          d8834cf0ad4742df1b6e7df9d95a0f26
BLAKE2b-256  fc44a60ad9619e8bcd8a5c3a9f78ab82f8ca674d86e1df583ee9c07d65d7be04

