Python client for CAT Cafe - Continuous Alignment Testing platform for LLM observability
Project description
CAT Cafe SDK
Python SDK for CAT Cafe - Continuous Alignment Testing platform for LLM observability.
Installation
pip install cat-cafe-client
Quick Start
from cat.cafe.client import CATCafeClient, DatasetImport, DatasetExample
# Initialize the client
client = CATCafeClient(base_url="http://localhost:8000")
# Create a dataset
dataset = DatasetImport(
name="My Test Dataset",
description="Sample dataset for testing",
examples=[
DatasetExample(
input={"messages": [{"role": "user", "content": "What's the weather?"}]},
output={"messages": [{"role": "assistant", "content": "Weather info"}]},
metadata={"tags": ["weather"]},
)
]
)
# Import the dataset
result = client.import_dataset(dataset)
dataset_id = result["dataset"]["id"]
# Define a test function
def my_test_function(example):
# Your AI system logic here
messages = example.input.get("messages", []) if isinstance(example.input, dict) else []
user_question = messages[-1]["content"] if messages else ""
return f"Response to: {user_question}"
# Define evaluators
def accuracy_evaluator(actual_output, reference_output):
# Your evaluation logic here
expected_messages = reference_output.get("messages", []) if isinstance(reference_output, dict) else []
score = 0.8 # Example score
reason = "Good response"
return score, reason
# Run tests on the dataset
experiment_id = client.run_test_on_dataset(
dataset=dataset_id,
test_function=my_test_function,
evaluators=[accuracy_evaluator],
name="My Experiment",
description="Testing my AI system"
)
print(f"Experiment completed: {experiment_id}")
Core Classes
CATCafeClient
The main client for interacting with CAT Cafe:
client = CATCafeClient(base_url="http://localhost:8000")
Dataset Models
- DatasetImport: For creating new datasets with examples
- DatasetExample: Individual examples in a dataset
- Dataset: Structured dataset object returned from API
- Example: Individual example object returned from API
Experiment Models
- Experiment: Experiment configuration
- ExperimentResult: Results from running experiments
Key Methods
Dataset Operations
# Import a complete dataset
result = client.import_dataset(dataset_import)
# Fetch dataset as structured object
dataset = client.fetch_dataset(dataset_id)
# Find dataset by name
dataset = client.fetch_dataset_by_name("My Dataset")
# List all datasets
datasets = client.list_datasets()
Experiment Operations
# Run tests on a dataset (all-in-one)
experiment_id = client.run_test_on_dataset(
dataset=dataset_id,
test_function=my_test_func,
evaluators=[evaluator1, evaluator2],
name="My Test Run"
)
# Manual experiment workflow
experiment_id = client.start_experiment(experiment_config)
client.submit_results(experiment_id, results)
client.complete_experiment(experiment_id)
Test Functions
Test functions receive an Example object and should return a string output:
def my_test_function(example: Example) -> str:
# Access the input messages
messages = example.input.get("messages", []) if isinstance(example.input, dict) else list(example.input)
user_message = messages[-1]["content"] if messages else ""
# Your AI system logic here
response = call_my_ai_system(user_message)
return response
Evaluators
Evaluators receive the actual output and reference output payload, returning a score and reason:
def my_evaluator(actual_output: str, reference_output: list) -> tuple[float, str]:
# Your evaluation logic
if "correct_info" in actual_output:
return 1.0, "Contains correct information"
else:
return 0.0, "Missing correct information"
Advanced Usage
Experiment Runner
Experiment runner orchestration (listeners, tracing, caching, etc.) now lives in the cat-experiments package. Install and import cat.experiments if you need the higher-level runner utilities; this SDK now focuses on the core client and evaluation primitives.
Async Test Functions
async def async_test_function(example: Example) -> str:
# Async AI system call
response = await my_async_ai_system(example.input)
return response
# Note: Async functions work but have limitations in certain contexts
experiment_id = client.run_test_on_dataset(
dataset=dataset_id,
test_function=async_test_function,
name="Async Test"
)
Custom Metadata
def metadata_function(example: Example, output: str) -> dict:
return {
"response_length": len(output),
"example_tags": example.tags
}
experiment_id = client.run_test_on_dataset(
dataset=dataset_id,
test_function=my_test_function,
metadata_function=metadata_function,
name="Test with Metadata"
)
Manual Experiment Control
# Create experiment configuration
experiment_config = Experiment(
name="Manual Experiment",
description="Step-by-step experiment",
dataset_id=dataset_id,
tags=["manual", "testing"]
)
# Start experiment
experiment_id = client.start_experiment(experiment_config)
# Run your tests and collect results
results = []
dataset = client.fetch_dataset(dataset_id)
for example in dataset.examples:
output = my_test_function(example)
result = ExperimentResult(
example_id=example.id,
input_data={"input": example.input},
output=dict(example.output),
actual_output=output,
evaluation_scores={"manual_score": 0.8}
)
results.append(result)
# Submit results
client.submit_results(experiment_id, results)
# Complete experiment
client.complete_experiment(experiment_id, {"total_examples": len(results)})
Requirements
- Python 3.12+
- httpx
- CAT Cafe server running (default: http://localhost:8000)
License
MIT License
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file cat_cafe_client-0.0.1.tar.gz.
File metadata
- Download URL: cat_cafe_client-0.0.1.tar.gz
- Upload date:
- Size: 11.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
236a31214215cc8b0bc003334f559bdfa825a6be84fc950b17a924ba05b1c669
|
|
| MD5 |
a81d4d9303fddd1be7ca78e00412385b
|
|
| BLAKE2b-256 |
fe4a02c7b0ba9368efb7a7cb9edb5649234576a46aa85a3717d7fbb066072d97
|
File details
Details for the file cat_cafe_client-0.0.1-py3-none-any.whl.
File metadata
- Download URL: cat_cafe_client-0.0.1-py3-none-any.whl
- Upload date:
- Size: 12.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
5c81a4033e08bd694c53435d01dd8a8e0c32c6fac02df353b1b98d3d364163fe
|
|
| MD5 |
d2960790564078ac4ee257e7db2f92f0
|
|
| BLAKE2b-256 |
b984602f16b63abc04c94a6f0546243a92990b348e5238e77fe6918d89c1647a
|