Patronus LLM Evaluation library

Patronus is a Python library from Patronus AI that provides a framework and utility functions for evaluating Large Language Models (LLMs). It simplifies running and scoring evaluations across different LLMs, so developers can benchmark model performance on a range of tasks.

Note: This library is currently in beta and is not stable. The APIs may change in future releases.

Note: This library requires Python 3.11 or greater.
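
To confirm that an interpreter meets this requirement before installing, a small check like the one below can help. This is a minimal sketch: the helper name `meets_minimum_python` is hypothetical and is not part of Patronus.

```python
import sys

# Hypothetical helper, not part of Patronus: compares a version tuple
# against the 3.11 minimum stated in the note above.
MIN_PYTHON = (3, 11)


def meets_minimum_python(version=None) -> bool:
    """Return True if `version` (default: the running interpreter) is at least 3.11."""
    if version is None:
        version = sys.version_info
    # Only the (major, minor) pair matters for this check.
    return tuple(version[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    if not meets_minimum_python():
        found = sys.version.split()[0]
        print(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, found {found}")
```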

Features

  • Modular Evaluation Framework: Easily plug in different models and evaluation/scoring mechanisms.
  • Seamless Integration with Patronus AI Platform: Effortlessly connect with the Patronus AI platform to run evaluations and export results.
  • Custom Evaluators: Use built-in evaluators, create your own based on various scoring methods, or leverage our state-of-the-art remote evaluators.
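
As an illustration of what a custom scoring method might look like, here are two plain-Python scoring functions. This is a sketch under assumptions: the function names are hypothetical, nothing here imports Patronus, and the parameter names simply mirror the evaluator signature used in the Quickstart. Whether a given function can be registered with the library's `evaluator` decorator unchanged should be checked against the documentation.

```python
# Plain-Python scoring functions; nothing here imports Patronus.
# Function names are hypothetical; parameter names mirror the
# Quickstart's evaluator signature.


def normalized_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    """Exact match after lowercasing and collapsing whitespace."""

    def normalize(text: str) -> str:
        return " ".join(text.lower().split())

    return normalize(evaluated_model_output) == normalize(evaluated_model_gold_answer)


def contains_gold(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    """True when the gold answer appears verbatim (case-sensitive) in the output."""
    return evaluated_model_gold_answer in evaluated_model_output
```

Keeping scoring logic in small pure functions like these makes it easy to unit-test them before plugging them into an experiment.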

Documentation

For detailed documentation, including API references and advanced usage, please visit our documentation.

Installation

pip install patronus

Quickstart

import os
from patronus import Client, task, evaluator

client = Client(
    # This is the default and can be omitted
    api_key=os.environ.get("PATRONUSAI_API_KEY"),
)

# The task produces the output to be evaluated from each row's input.
@task
def hello_world_task(evaluated_model_input: str) -> str:
    return f"{evaluated_model_input} World"

# The evaluator scores the task output against the row's gold answer.
@evaluator
def exact_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    return evaluated_model_output == evaluated_model_gold_answer

# Run the task over the dataset and score each output with the evaluators.
client.experiment(
    "Tutorial Project",
    data=[
        {
            "evaluated_model_input": "Hello",
            "evaluated_model_gold_answer": "Hello World",
        },
    ],
    task=hello_world_task,
    evaluators=[exact_match],
)
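
To see the data flow of the Quickstart without an API key, the same task and evaluator logic can be exercised locally in plain Python. This is a conceptual sketch only: `run_locally` is a hypothetical helper, and it does not use or reproduce how Patronus runs experiments internally.

```python
from typing import Callable

# Conceptual sketch only: no Patronus imports, no network calls.
# `run_locally` is a hypothetical helper, not a library function.


def hello_world_task(evaluated_model_input: str) -> str:
    return f"{evaluated_model_input} World"


def exact_match(evaluated_model_output: str, evaluated_model_gold_answer: str) -> bool:
    return evaluated_model_output == evaluated_model_gold_answer


def run_locally(
    data: list[dict[str, str]],
    task: Callable[[str], str],
    evaluators: list[Callable[[str, str], bool]],
) -> list[dict]:
    """Apply the task to each row, then score the output with every evaluator."""
    results = []
    for row in data:
        output = task(row["evaluated_model_input"])
        scores = {
            ev.__name__: ev(output, row["evaluated_model_gold_answer"])
            for ev in evaluators
        }
        results.append({"output": output, "scores": scores})
    return results


results = run_locally(
    data=[
        {
            "evaluated_model_input": "Hello",
            "evaluated_model_gold_answer": "Hello World",
        },
    ],
    task=hello_world_task,
    evaluators=[exact_match],
)
print(results)  # [{'output': 'Hello World', 'scores': {'exact_match': True}}]
```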
