Python SDK for the evaluate LLM evaluation framework
llmeval - Python SDK for evaluate
A Python client library for the evaluate LLM evaluation framework. It talks to a running evaluate server, which you can download from https://github.com/RGGH/evaluate.
Installation
Install from PyPI:
pip install llmeval-sdk
Or install in editable mode from a local checkout of the source:
pip install -e .
For development with all extras:
pip install -e ".[dev]"
Quick Start
from llmeval import EvalClient
# Initialize the client
client = EvalClient(base_url="http://127.0.0.1:8080")
# Check server health
status = client.health_check()
print(status)
# Get available models
models = client.get_models()
print(f"Available models: {models}")
# Run a single evaluation
result = client.run_eval(
    model="gemini:gemini-2.5-pro",
    prompt="What is the capital of France?",
    expected="Paris",
    judge_model="gemini:gemini-2.5-pro",
)
print(f"Model output: {result.model_output}")
print(f"Judge verdict: {result.judge_verdict}")
print(f"Passed: {result.passed}")
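To evaluate several prompts with the same call, a plain loop is enough. The sketch below assumes only the `run_eval` keyword arguments and the `model_output` and `passed` attributes shown above. `run_cases`, `FakeClient`, and `FakeResult` are hypothetical names used so the example runs without a server, and the SDK's own batch support may work differently.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


def run_cases(client: Any, model: str, judge_model: str,
              cases: List[Dict[str, str]]) -> List[Dict[str, Any]]:
    """Call client.run_eval once per case and collect one summary row each."""
    rows = []
    for case in cases:
        result = client.run_eval(
            model=model,
            prompt=case["prompt"],
            expected=case["expected"],
            judge_model=judge_model,
        )
        rows.append({
            "prompt": case["prompt"],
            "expected": case["expected"],
            "model_output": result.model_output,
            "passed": result.passed,
        })
    return rows


# Hypothetical stand-ins so the sketch runs offline; use EvalClient for real runs.
@dataclass
class FakeResult:
    model_output: str
    judge_verdict: str
    passed: bool


class FakeClient:
    def run_eval(self, model, prompt, expected, judge_model):
        output = "Paris" if "France" in prompt else "unknown"
        passed = output == expected
        return FakeResult(output, "Pass" if passed else "Fail", passed)


cases = [
    {"prompt": "What is the capital of France?", "expected": "Paris"},
    {"prompt": "What is the capital of Japan?", "expected": "Tokyo"},
]
rows = run_cases(FakeClient(), "gemini:gemini-2.5-pro", "gemini:gemini-2.5-pro", cases)
pass_rate = sum(row["passed"] for row in rows) / len(rows)
print(f"Pass rate: {pass_rate:.0%}")
```

Because `run_cases` only relies on duck typing, passing a real `EvalClient` instance in place of `FakeClient()` should work as long as the result object exposes the same attributes.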
Features
- ✅ Simple, intuitive API
- ✅ Type-safe with Pydantic models
- ✅ Batch evaluation support
- ✅ Real-time WebSocket streaming
- ✅ Jupyter notebook integration
- ✅ pandas DataFrame utilities
- ✅ Comprehensive error handling
- ✅ Context manager support
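As one illustration of handling errors around the client, the sketch below retries `health_check()` until the server responds. It is not the SDK's own error-handling API: the SDK's exception classes are not listed here, so the code catches the broad `Exception`, and `wait_until_healthy`, `FlakyClient`, and the `{"status": "ok"}` payload are hypothetical.

```python
import time
from typing import Any


def wait_until_healthy(client: Any, attempts: int = 5, delay: float = 1.0) -> Any:
    """Retry client.health_check() until it succeeds or attempts run out."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return client.health_check()
        except Exception as exc:  # assumption: SDK-specific exception types unknown
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)
    raise RuntimeError(f"server not healthy after {attempts} attempts") from last_error


class FlakyClient:
    """Hypothetical stand-in that fails twice before reporting healthy."""

    def __init__(self):
        self.calls = 0

    def health_check(self):
        self.calls += 1
        if self.calls < 3:
            raise ConnectionError("server not ready")
        return {"status": "ok"}  # hypothetical payload shape


flaky = FlakyClient()
status = wait_until_healthy(flaky, attempts=5, delay=0.01)
print(status, "after", flaky.calls, "calls")
```

A helper like this can be useful when the evaluate server is started just before the client, for example in a notebook or a CI job.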
Documentation
See the example notebook: https://github.com/RGGH/llmeval-python-sdk/blob/main/examples/evaluate.ipynb
Requirements
- Python 3.8+
- requests
- pydantic
- websockets
- pandas
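A quick way to confirm that an environment meets these requirements is to check the interpreter version and whether each package is importable. This is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper, and it checks for presence only, not for specific package versions.

```python
import sys
from importlib.util import find_spec

REQUIRED_PYTHON = (3, 8)
REQUIRED_PACKAGES = ["requests", "pydantic", "websockets", "pandas"]


def check_environment():
    """Return (python_ok, missing) for the requirements listed above."""
    python_ok = sys.version_info >= REQUIRED_PYTHON
    # find_spec returns None when a top-level package cannot be located.
    missing = [name for name in REQUIRED_PACKAGES if find_spec(name) is None]
    return python_ok, missing


python_ok, missing = check_environment()
print(f"Python {sys.version_info.major}.{sys.version_info.minor} OK: {python_ok}")
print("Missing packages:", ", ".join(missing) if missing else "none")
```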
License
MIT License
Download files
Source Distribution
llmeval_sdk-0.1.7.tar.gz (7.0 kB)
Built Distribution
llmeval_sdk-0.1.7-py3-none-any.whl (7.2 kB)
File details
Details for the file llmeval_sdk-0.1.7.tar.gz.
File metadata
- Download URL: llmeval_sdk-0.1.7.tar.gz
- Upload date:
- Size: 7.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1e694dc7090688d642bc5b5093bef31420e551175bd3eacf8209e887f1fa298d |
| MD5 | 2ea57504c385068f13f58506b7fccc69 |
| BLAKE2b-256 | 1eaa8c509cd567d56cc19b84324bbb55fddec563d9a4fcd3628cd8c44dbacad2 |
File details
Details for the file llmeval_sdk-0.1.7-py3-none-any.whl.
File metadata
- Download URL: llmeval_sdk-0.1.7-py3-none-any.whl
- Upload date:
- Size: 7.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8c8b41160d4db0c7df945ba37307fda1595c933b1046a6d51ffbd27b99fed7df |
| MD5 | 0a85b1993eb473e6305bf06761d7b342 |
| BLAKE2b-256 | 009f742ac75a6e268624903e02c722e7e6ef6ad99fddfd398e23dfe052181d56 |
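The published digests can be used to check a downloaded file before installing it. The sketch below compares a local file against the SHA256 values listed in the file details; `sha256_of` and `verify` are hypothetical helpers, and it assumes the files sit in the current directory (for example after `pip download llmeval-sdk==0.1.7 --no-deps`).

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "llmeval_sdk-0.1.7.tar.gz":
        "1e694dc7090688d642bc5b5093bef31420e551175bd3eacf8209e887f1fa298d",
    "llmeval_sdk-0.1.7-py3-none-any.whl":
        "8c8b41160d4db0c7df945ba37307fda1595c933b1046a6d51ffbd27b99fed7df",
}


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path) -> bool:
    """Return True only if the file name is known and its digest matches."""
    expected = EXPECTED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected


for name in EXPECTED_SHA256:
    path = Path(name)
    if path.exists():
        print(name, "OK" if verify(path) else "MISMATCH")
    else:
        print(name, "not found in the current directory")
```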