Patronus Python SDK


SDK Documentation: https://patronus-ai.github.io/patronus-py

Platform Documentation: https://docs.patronus.ai


The Patronus Python SDK is a Python library for systematic evaluation of Large Language Models (LLMs). Build, test, and improve your LLM applications with customizable tasks, evaluators, and comprehensive experiment tracking.

Documentation

For detailed documentation, including API references and advanced usage, see the SDK documentation at https://patronus-ai.github.io/patronus-py.

Installation

pip install patronus

Quickstart

Initialization

import patronus

# Initialize with your Patronus API key
patronus.init(
    project_name="My Agent",  # Optional, defaults to "Global"
    api_key="your-api-key"      # Optional, can also be set via environment variable
)

You can also use a configuration file (patronus.yaml) for initialization:

# patronus.yaml
api_key: "your-api-key"
project_name: "My Agent"

With this configuration file in your working directory, you can simply call:

import patronus
patronus.init()  # Automatically loads config from patronus.yaml
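
The `api_key` comment above mentions an environment variable without naming it. The sketch below assumes the variable is called `PATRONUS_API_KEY` (an assumption; check the platform documentation for the exact name). The `load_api_key` helper is hypothetical and not part of the SDK. It fails early with a clear message when the variable is unset, rather than letting a later request fail.

```python
import os


def load_api_key(env_var: str = "PATRONUS_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is missing.

    The default variable name is an assumption; adjust it to your setup.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before calling patronus.init()")
    return key


# Possible usage, once the variable is exported in your shell:
#   patronus.init(api_key=load_api_key())
```

Keeping the key out of source code and out of a committed `patronus.yaml` avoids leaking it through version control.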

Tracing

import patronus

patronus.init()

# Trace a function with the @traced decorator
@patronus.traced()
def process_input(user_query):
    # Process the input
    return "Processed: " + user_query

# Use a context manager for finer-grained tracing
def complex_operation():
    with patronus.start_span("Data preparation"):
        # Prepare data
        pass

    with patronus.start_span("Model inference"):
        # Run model
        pass

Patronus Evaluations

from patronus import init
from patronus import RemoteEvaluator

init()

check_hallucinates = RemoteEvaluator("lynx", "patronus:hallucination")

resp = check_hallucinates.evaluate(
    task_input="What is the car insurance policy?",
    task_context=(
        """
        To qualify for our car insurance policy, you need a way to show competence
        in driving which can be accomplished through a valid driver's license.
        You must have multiple years of experience and cannot be graduating from driving school before or on 2028.
        """
    ),
    task_output="To even qualify for our car insurance policy, you need to have a valid driver's license that expires later than 2028."
)
print(f"""
Hallucination evaluation:
Passed: {resp.pass_}
Score: {resp.score}
Explanation: {resp.explanation}
""")

User-Defined Evaluators

from patronus import init, evaluator
from patronus.evals import EvaluationResult

init()

# Simple evaluator function
@evaluator()
def exact_match(actual: str, expected: str) -> bool:
    return actual.strip() == expected.strip()

# More complex evaluator with detailed result
@evaluator()
def semantic_match(actual: str, expected: str) -> EvaluationResult:
    similarity = calculate_similarity(actual, expected)  # Your similarity function
    return EvaluationResult(
        score=similarity,
        pass_=similarity > 0.8,
        text_output="High similarity" if similarity > 0.8 else "Low similarity",
        explanation=f"Calculated similarity: {similarity}"
    )

# Use the evaluators
result = exact_match("Hello world", "Hello world")
print(f"Match: {result}")  # Output: Match: True
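
The `semantic_match` evaluator above calls `calculate_similarity`, which is left for you to supply. As a stand-in, here is a minimal standard-library sketch based on `difflib.SequenceMatcher`. It measures character-level overlap, not semantic similarity, so treat it as a placeholder until you plug in an embedding-based or model-based score.

```python
from difflib import SequenceMatcher


def calculate_similarity(actual: str, expected: str) -> float:
    """Return a ratio in [0.0, 1.0]; 1.0 means identical after normalization.

    Illustrative stand-in only: this compares characters, not meaning.
    """
    # Normalize case and collapse whitespace so formatting differences
    # do not lower the score.
    a = " ".join(actual.lower().split())
    b = " ".join(expected.lower().split())
    return SequenceMatcher(None, a, b).ratio()


score = calculate_similarity("Hello world", "hello   WORLD")
print(f"Similarity: {score}")
```

With a helper like this defined before `semantic_match`, the `similarity > 0.8` threshold in the example operates on a value between 0 and 1.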

Running Experiments

The Patronus Python SDK includes an experimentation framework for evaluating, comparing, and improving AI models. You supply a dataset, a task, and one or more evaluators; the framework runs the task over the dataset, applies the evaluators, and collects the results for analysis.

from patronus.evals import evaluator, RemoteEvaluator
from patronus.experiments import run_experiment, Row, TaskResult, FuncEvaluatorAdapter


def my_task(row: Row, **kwargs):
    return f"{row.task_input} World"


# Reference the remote Patronus Judge evaluator with the is-concise criterion.
# This evaluator runs remotely on Patronus infrastructure.
is_concise = RemoteEvaluator("judge", "patronus:is-concise")


@evaluator()
def exact_match(row: Row, task_result: TaskResult, **kwargs):
    # Compare the task's output with the expected answer from the dataset row
    return task_result.output == row.gold_answer


result = run_experiment(
    project_name="Tutorial Project",
    dataset=[
        {
            "task_input": "Hello",
            "gold_answer": "Hello World",
        },
    ],
    task=my_task,
    evaluators=[is_concise, FuncEvaluatorAdapter(exact_match)],
)

result.to_csv("./experiment.csv")
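
The `dataset` argument above is a list of dictionaries keyed by `task_input` and `gold_answer`. For more than a handful of rows it can be convenient to build that list from a file. Below is a minimal sketch, assuming a CSV with hypothetical `question` and `answer` columns. The `load_dataset` helper is illustrative and not part of the SDK.

```python
import csv
import io


def load_dataset(csv_text: str) -> list[dict[str, str]]:
    """Convert CSV text into rows shaped like the experiment dataset above.

    The 'question' and 'answer' column names are assumptions for this example.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "task_input": record["question"].strip(),
            "gold_answer": record["answer"].strip(),
        }
        for record in reader
    ]


sample = "question,answer\nHello,Hello World\nGoodbye,Goodbye World\n"
dataset = load_dataset(sample)
print(dataset)
```

The resulting list has the same shape as the inline `dataset` in the example, so it could be passed to `run_experiment` in its place.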
