Python SDK for the evaluate LLM evaluation framework
llmeval - Python SDK for evaluate
Download the evaluate server from https://github.com/RGGH/evaluate before using the SDK.
A Python client library for the evaluate LLM evaluation framework.
Installation
From the root of a source checkout, install in editable mode:
pip install -e .
For development with all extras:
pip install -e ".[dev]"
Quick Start
from llmeval import EvalClient
# Initialize the client
client = EvalClient(base_url="http://127.0.0.1:8080")
# Check server health
status = client.health_check()
print(status)
# Get available models
models = client.get_models()
print(f"Available models: {models}")
# Run a single evaluation
result = client.run_eval(
    model="gemini:gemini-2.5-pro",
    prompt="What is the capital of France?",
    expected="Paris",
    judge_model="gemini:gemini-2.5-pro",
)
print(f"Model output: {result.model_output}")
print(f"Judge verdict: {result.judge_verdict}")
print(f"Passed: {result.passed}")
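The single-evaluation call above can be repeated over several prompt/expected pairs to get a rough pass rate. The following is a minimal sketch, not the SDK's batch API: it relies only on the `run_eval` keyword arguments and the `passed` field shown in the Quick Start. `StubClient`, `StubResult`, `run_cases`, and `pass_rate` are hypothetical names introduced here so the example runs without a server; with a live server you would pass an `EvalClient` instance instead.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Tuple


@dataclass
class StubResult:
    """Hypothetical stand-in exposing the three fields printed in the Quick Start."""
    model_output: str
    judge_verdict: str
    passed: bool


class StubClient:
    """Hypothetical offline stand-in for EvalClient; it only mimics run_eval."""

    def run_eval(self, model: str, prompt: str, expected: str, judge_model: str) -> StubResult:
        # Pretend the model always answers "Paris" so the example is deterministic.
        output = "Paris"
        passed = output == expected
        return StubResult(output, "Pass" if passed else "Fail", passed)


def run_cases(client: Any, model: str, judge_model: str,
              cases: Iterable[Tuple[str, str]]) -> List[Any]:
    """Call client.run_eval once per (prompt, expected) pair and collect the results."""
    return [
        client.run_eval(model=model, prompt=prompt, expected=expected, judge_model=judge_model)
        for prompt, expected in cases
    ]


def pass_rate(results: List[Any]) -> float:
    """Fraction of results whose `passed` attribute is truthy (0.0 for an empty list)."""
    if not results:
        return 0.0
    return sum(1 for r in results if r.passed) / len(results)


cases = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Italy?", "Rome"),
]
results = run_cases(StubClient(), "gemini:gemini-2.5-pro", "gemini:gemini-2.5-pro", cases)
print(f"Pass rate: {pass_rate(results):.0%}")
```

Because `run_cases` accepts any object with a compatible `run_eval` method, the same loop can be exercised in unit tests with a stub and pointed at a real client later.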
Features
- ✅ Simple, intuitive API
- ✅ Type-safe with Pydantic models
- ✅ Batch evaluation support
- ✅ Real-time WebSocket streaming
- ✅ Jupyter notebook integration
- ✅ pandas DataFrame utilities
- ✅ Comprehensive error handling
- ✅ Context manager support
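The SDK's own DataFrame helpers are not documented on this page, so the sketch below does not use them. It shows one way result objects could be flattened into plain rows for tabular analysis, assuming only the three fields printed in the Quick Start (`model_output`, `judge_verdict`, `passed`). `results_to_rows` and the sample data are hypothetical.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List

# Field names taken from the Quick Start output; real result objects may carry more.
FIELDS = ("model_output", "judge_verdict", "passed")


def results_to_rows(results: Iterable[Any]) -> List[Dict[str, Any]]:
    """Turn result-like objects into plain dicts, one per evaluation."""
    return [{name: getattr(r, name, None) for name in FIELDS} for r in results]


# Hypothetical sample data standing in for values returned by client.run_eval().
sample = [
    SimpleNamespace(model_output="Paris", judge_verdict="Pass", passed=True),
    SimpleNamespace(model_output="Milan", judge_verdict="Fail", passed=False),
]
rows = results_to_rows(sample)
failed = [row for row in rows if not row["passed"]]
print(f"{len(failed)} of {len(rows)} evaluations failed")
```

A list of dicts like `rows` can be passed directly to `pandas.DataFrame(rows)` when pandas is installed.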
Documentation
See the examples/ directory for more usage examples:
- basic_usage.py - Simple examples
- advanced_usage.py - Advanced patterns
- streaming_example.py - WebSocket streaming
- jupyter_example.ipynb - Jupyter notebook
Requirements
- Python 3.8+
- requests
- pydantic
- websockets
- pandas
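To confirm that an environment meets the requirements listed above, a small standard-library check along these lines may help. It is a sketch, not part of the SDK: `missing_modules` and `python_is_supported` are hypothetical helper names, and the module names are assumed to match the package names in the list.

```python
import sys
from importlib.util import find_spec
from typing import List, Tuple

# Import names assumed to match the packages named in the Requirements list.
REQUIRED_MODULES = ["requests", "pydantic", "websockets", "pandas"]


def missing_modules(names: List[str]) -> List[str]:
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if find_spec(name) is None]


def python_is_supported(minimum: Tuple[int, int] = (3, 8)) -> bool:
    """True when the running interpreter is at least the given version."""
    return sys.version_info >= minimum


print(f"Python 3.8+: {python_is_supported()}")
print(f"Missing packages: {missing_modules(REQUIRED_MODULES) or 'none'}")
```

`find_spec` only locates a module without importing it, so the check stays fast and has no side effects.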
License
MIT License