Patronus Python SDK
The Patronus Python SDK is a Python library for systematic evaluation of Large Language Models (LLMs). Build, test, and improve your LLM applications with customizable tasks, evaluators, and comprehensive experiment tracking.
Note: This library is currently in beta and is not stable. The APIs may change in future releases.
Documentation
For detailed documentation, including API references and advanced usage, please visit our documentation.
Installation
pip install patronus
Quickstart
Tracing
import patronus

patronus.init()


# Wrap a function with the @traced() decorator.
@patronus.traced()
def main():
    perform()


def perform():
    # Or use the start_span context manager.
    with patronus.start_span("Performing action"):
        # Do work
        ...
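To make the two tracing styles above concrete, here is a minimal stand-in written with only the standard library. It is not how Patronus implements tracing: `SPANS`, `start_span`, and `traced` below are hypothetical local helpers that only record span names and durations in a list, so the nesting of a decorated function and an inner context manager can be observed without any SDK setup.

```python
import functools
import time
from contextlib import contextmanager

# Hypothetical in-memory sink; a real tracer would export spans elsewhere.
SPANS = []


@contextmanager
def start_span(name):
    """Record the name and duration of the enclosed block."""
    started = time.perf_counter()
    try:
        yield
    finally:
        SPANS.append((name, time.perf_counter() - started))


def traced():
    """Decorator factory: wrap each call in a span named after the function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with start_span(func.__name__):
                return func(*args, **kwargs)
        return wrapper
    return decorator


@traced()
def main():
    perform()


def perform():
    with start_span("Performing action"):
        pass  # Do work


main()
# Inner spans finish first, so they are recorded before the outer one.
print([name for name, _ in SPANS])
```

In this sketch the span opened inside `perform()` closes before the span around `main()`, which is why it appears first in the list.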
Custom evaluations
from patronus import init
from patronus import evaluator

init()


@evaluator
def iexact_match(actual: str, expected: str) -> bool:
    return actual.lower().strip() == expected.lower().strip()


def main():
    iexact_match("bonne nuit", "Bonne nuit")
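The comparison inside `iexact_match` can be tried on its own before wiring it into the SDK. The sketch below repeats the same body in an undecorated function (the name `iexact_match_plain` is made up for this illustration) and makes no claim about what the decorated evaluator returns.

```python
def iexact_match_plain(actual: str, expected: str) -> bool:
    # Same comparison as the evaluator body above, without the SDK decorator:
    # case-insensitive, ignoring leading and trailing whitespace.
    return actual.lower().strip() == expected.lower().strip()


print(iexact_match_plain("bonne nuit", "Bonne nuit"))      # True
print(iexact_match_plain("  Bonne Nuit\n", "bonne nuit"))  # True
# Whitespace inside the string is not normalized, so this does not match.
print(iexact_match_plain("bonne  nuit", "bonne nuit"))     # False
```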
Patronus evaluations
from patronus import init
from patronus import RemoteEvaluator

init()

check_hallucinates = RemoteEvaluator("lynx", "patronus:hallucination")

resp = check_hallucinates.evaluate(
    task_input="What is the car insurance policy?",
    task_context=(
        """
        To qualify for our car insurance policy, you need a way to show competence
        in driving which can be accomplished through a valid driver's license.
        You must have multiple years of experience and cannot be graduating from driving school before or on 2028.
        """
    ),
    task_output="To even qualify for our car insurance policy, you need to have a valid driver's license that expires later than 2028."
)
print(resp.model_dump_json(indent=4))
Experiments
The Patronus Python SDK includes an experimentation framework for evaluating, comparing, and improving AI models. Whether you are working with pre-trained models, fine-tuning your own, or trying new architectures, it provides the tools to set up, run, and analyze experiments.
from patronus.evals import evaluator, RemoteEvaluator
from patronus.experiments import run_experiment, Row, TaskResult, FuncEvaluatorAdapter


def my_task(row: Row, **kwargs):
    return f"{row.task_input} World"


# Reference the remote Judge Patronus Evaluator with the is-concise criteria.
# This evaluator runs remotely on Patronus infrastructure.
is_concise = RemoteEvaluator("judge", "patronus:is-concise")


@evaluator()
def exact_match(row: Row, task_result: TaskResult, **kwargs):
    # Compare against the dataset's gold_answer field, which holds the expected output.
    print(f"{task_result.output=} :: {row.gold_answer=}")
    return task_result.output == row.gold_answer


result = run_experiment(
    project_name="Tutorial Project",
    dataset=[
        {
            "task_input": "Hello",
            "gold_answer": "Hello World",
        },
    ],
    task=my_task,
    evaluators=[is_concise, FuncEvaluatorAdapter(exact_match)],
)
result.to_csv("./experiment.csv")
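As a rough mental model of an experiment (run a task on every dataset row, then score each output with every evaluator), the plain-Python sketch below may help. It is an assumption-laden illustration, not the SDK's implementation: `run_rows`, the dict-based rows, and the two-argument evaluator signature are all hypothetical and do not mirror `run_experiment`, `Row`, or `TaskResult`.

```python
from typing import Any, Callable, Dict, List


def run_rows(
    dataset: List[Dict[str, Any]],
    task: Callable[[Dict[str, Any]], str],
    evaluators: Dict[str, Callable[[Dict[str, Any], str], bool]],
) -> List[Dict[str, Any]]:
    """Hypothetical experiment loop: one task call and N evaluations per row."""
    results = []
    for row in dataset:
        output = task(row)
        # Score this output with every evaluator, keyed by evaluator name.
        scores = {name: fn(row, output) for name, fn in evaluators.items()}
        results.append({"task_input": row["task_input"], "output": output, **scores})
    return results


def my_task(row: Dict[str, Any]) -> str:
    return f"{row['task_input']} World"


def exact_match(row: Dict[str, Any], output: str) -> bool:
    return output == row["gold_answer"]


results = run_rows(
    dataset=[{"task_input": "Hello", "gold_answer": "Hello World"}],
    task=my_task,
    evaluators={"exact_match": exact_match},
)
print(results)
```

The actual framework additionally handles remote evaluators and result tracking, none of which this sketch attempts to model.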
File details
Details for the file patronus-0.1.0rc1.tar.gz.
File metadata
- Download URL: patronus-0.1.0rc1.tar.gz
- Upload date:
- Size: 38.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.0.0 CPython/3.12.8 Darwin/24.3.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6801233cf2ef3e9fe976dea78bf7e49f915e42be31741b100b9881c8db93069f |
| MD5 | 2a17f9c636655c46b307c61b7a232b6f |
| BLAKE2b-256 | 8a2d7624101da2ccff09a015ca49bb83cbbc55a3bcfaeed50e02a9e52fb66688 |
File details
Details for the file patronus-0.1.0rc1-py3-none-any.whl.
File metadata
- Download URL: patronus-0.1.0rc1-py3-none-any.whl
- Upload date:
- Size: 49.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.0.0 CPython/3.12.8 Darwin/24.3.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e197f80be33d788164bfe91e3b323b34177709d0ae8b9d34d34a12e2dc6bf5d7 |
| MD5 | 8c936edfe1add06b3cddbaaf8dae9986 |
| BLAKE2b-256 | f9422d60a34506b29e486cdc9c5a978639902c34aa91414b7cff1df4c4d6b860 |