EvalKit Python SDK — LLM observability and tracing
EvalKit Python SDK
LLM observability and tracing for Python apps. One init() call auto-instruments
your LLM clients, HTTP calls, database queries, and logging — then streams traces to
Syntropy Labs.
Installation
pip install syntropylabs-evalkit
Optional provider extras:
pip install "syntropylabs-evalkit[openai]" # OpenAI
pip install "syntropylabs-evalkit[anthropic]" # Anthropic
pip install "syntropylabs-evalkit[all]" # everything
The PyPI package is syntropylabs-evalkit, but you import it as evalkit.
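Because the distribution name and the import name differ, it can help to check what is actually installed. The sketch below uses only the standard library; the helper name installed_version is illustrative and not part of EvalKit.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Look up by the PyPI distribution name, not by the import name.
print(installed_version("syntropylabs-evalkit"))
```

The lookup takes the name you pass to pip; the name you pass to import stays evalkit.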
Quickstart
import evalkit
evalkit.init(
    subscription_key="sk_...",  # your Syntropy Labs key
    service_name="my-service",
)
# That's it — your OpenAI / Anthropic / HTTP / DB calls are now traced automatically.
from openai import OpenAI
client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
init() sets up auto-instrumentation for you. Context (including trace IDs)
propagates automatically across threads — no manual wiring required.
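For readers wondering what cross-thread propagation involves, here is a minimal standard-library sketch of the general idea. It does not show EvalKit's internals, which this README does not describe. The names current_trace_id and submit_with_context are hypothetical. A worker thread typically starts from a fresh context, so the caller's context is copied and the task runs inside that copy.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a tracer's "current trace ID" slot.
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)


def read_trace_id():
    """Return whatever trace ID is visible in the running context."""
    return current_trace_id.get()


def submit_with_context(executor, fn, *args):
    """Submit fn so that it runs inside a copy of the caller's context."""
    ctx = contextvars.copy_context()
    return executor.submit(ctx.run, fn, *args)


current_trace_id.set("trace-123")
with ThreadPoolExecutor(max_workers=1) as pool:
    with_copy = submit_with_context(pool, read_trace_id).result()

print(with_copy)
```

With EvalKit this wiring is reportedly done for you by init(); the sketch only illustrates the kind of work being saved.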
Web frameworks
# FastAPI / Starlette
from evalkit import EvalKitMiddleware
app.add_middleware(EvalKitMiddleware)
# Flask
import evalkit
evalkit.instrument_flask(app)
# Django: add to MIDDLEWARE in settings.py
MIDDLEWARE = [
    # ... your existing middleware ...
    "evalkit.EvalKitDjangoMiddleware",
]
Manual spans
import evalkit
end, ctx = evalkit.start_span("my-operation", {"key": "value"})
try:
    ...  # your work
finally:
    end("ok")

# Or as a decorator
@evalkit.trace_function()
def do_work(x):
    return x * 2
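If you prefer a with block over try/finally, the start/end pair can be wrapped in a small context manager. This is a sketch and not part of EvalKit: the helper name span is hypothetical, and the "error" status string is an assumption, since only "ok" appears in this README. The start function is passed in as an argument so the helper works with any callable that returns an (end, ctx) pair.

```python
from contextlib import contextmanager


@contextmanager
def span(start_span, name, attributes=None):
    """Run a block inside a span, closing the span even if the block raises.

    start_span is any callable that returns (end, ctx), where end(status)
    closes the span.
    """
    end, ctx = start_span(name, attributes or {})
    try:
        yield ctx
    except Exception:
        end("error")  # assumed status string; only "ok" is shown in this README
        raise
    else:
        end("ok")
```

Usage would look like: with span(evalkit.start_span, "my-operation", {"key": "value"}): followed by the indented work.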
SQLAlchemy
import evalkit
evalkit.patch_sqlalchemy_engine(engine)
Evaluation
Score agent outputs locally — no judge-model cost, results appear as eval_result spans:
import evalkit
scores = evalkit.evaluate(
    output="Your return window is 30 days.",
    input="What is the return policy?",
    expected_tools=["search_knowledge_base"],
    tool_calls=[{"name": "search_knowledge_base"}],
    constraints={"required_terms": ["return", "30"]},
)
# → {"tool_trajectory_f1": 1.0, "required_terms": 1.0, ...}
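This README does not give the formulas behind these scores. As a rough intuition only, the sketch below shows one plausible way such metrics could be computed: a set-based F1 over tool names (ignoring call order, which the real "trajectory" metric may not) and the fraction of required terms found in the output. The function names are hypothetical and are not EvalKit APIs.

```python
def tool_f1(expected_tools, tool_calls):
    """Set-based F1 between expected tool names and the tools actually called."""
    expected = set(expected_tools)
    called = {call["name"] for call in tool_calls}
    if not expected and not called:
        return 1.0  # nothing expected, nothing called
    overlap = len(expected & called)
    if overlap == 0:
        return 0.0
    precision = overlap / len(called)
    recall = overlap / len(expected)
    return 2 * precision * recall / (precision + recall)


def required_terms_score(output, terms):
    """Fraction of required terms found in the output (case-insensitive)."""
    if not terms:
        return 1.0
    text = output.lower()
    return sum(term.lower() in text for term in terms) / len(terms)
```

Applied to the inputs of the evaluate() call above, both functions return 1.0, which matches the values in the sample result.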
Scenario simulation
Generate realistic synthetic-user scenarios from your agent's system prompt and tool list, then run each scenario against your real agent and score the results automatically:
import evalkit
evalkit.init(subscription_key="tk_live_...", service_name="my-agent")
# Step 1 — generate scenarios server-side (BYOK: your own key for the generation call)
scenarios = evalkit.generate_scenarios(
    agent_instructions=SYSTEM_PROMPT,
    tools=["search_kb", "lookup_order", "create_ticket"],
    count=5,
    provider="anthropic",  # "openai" or "google" also supported
    api_key="sk-ant-...",  # BYOK key for generation model
    model="claude-haiku-4-5-20251001",
)
# Step 2 — simulate each scenario against your real agent and score it
def entrypoint(ctx: evalkit.SimContext) -> evalkit.AgentTurnResult:
    # ctx.message: the synthetic user's message for this turn
    # ctx.session_id: stable per scenario; use it to keep multi-turn context
    reply, tools_used = run_my_agent(ctx.session_id, ctx.message)
    return evalkit.AgentTurnResult(
        text=reply,
        tool_calls=[{"name": t} for t in tools_used],
    )
report = evalkit.simulate_user(entrypoint, scenarios, tags=["ci"])
# Results appear in Dashboard → Simulations
print("Simulation ID:", report["simulation_id"])
Out-of-process agents (Claude Agent SDK)
The Claude Agent SDK runs the Anthropic call in a subprocess, so the normal in-process patch can't observe it. EvalKit wraps claude_agent_sdk.query() and ClaudeSDKClient.receive_response() instead, reading token/cost/latency from the ResultMessage the SDK already returns. This happens automatically via init() when claude_agent_sdk is installed. To call it explicitly:
evalkit.patch_claude_agent_sdk()
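The wrapping approach described above can be pictured with a generic sketch: wrap an async generator, pass every message through unchanged, and hand the final result message to a callback. This is not EvalKit's code. ResultStub, fake_query, and with_result_hook are hypothetical stand-ins, and the field names on ResultStub are illustrative.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ResultStub:
    """Hypothetical stand-in for the SDK's final result message."""
    total_cost_usd: float
    duration_ms: int


async def fake_query(prompt):
    """Hypothetical stand-in for an SDK call that streams messages."""
    yield f"echo: {prompt}"
    yield ResultStub(total_cost_usd=0.002, duration_ms=150)


def with_result_hook(query_fn, on_result):
    """Wrap an async-generator function and pass result messages to on_result."""
    async def wrapper(*args, **kwargs):
        async for message in query_fn(*args, **kwargs):
            if isinstance(message, ResultStub):
                on_result(message)  # a tracer would record a span here
            yield message  # callers still receive every message unchanged
    return wrapper


async def main():
    seen = []
    traced_query = with_result_hook(fake_query, seen.append)
    messages = [m async for m in traced_query("hi")]
    return messages, seen


messages, seen = asyncio.run(main())
```

Because the wrapper only reads what the generator already yields, it needs no visibility into the subprocess that makes the model call.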
Flushing
Traces are batched and exported in the background. Flush before exit if needed:
evalkit.flush()
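One way to make sure the flush happens in short-lived scripts or CI jobs is to run your entry point inside try/finally. The helper below is a sketch with a hypothetical name, run_with_flush. It takes the flush function as an argument, so it assumes only that flush can be called with no arguments, as shown above.

```python
def run_with_flush(main, flush):
    """Call main(), then flush(), whether main returned or raised."""
    try:
        return main()
    finally:
        flush()
```

Usage would look like run_with_flush(main, evalkit.flush), where main is your script's entry point.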
Links
- Website: https://syntropylabs.ai
- Documentation: https://syntropylabs.ai/docs
License
Proprietary — © 2026 Syntropy Labs. All rights reserved. See LICENSE.