HB-Eval SDK for reliable agent evaluation, semantic memory, and LangChain integration
HB-Eval SDK
The official Python SDK for HB-Eval OS — the reliability operating system for agentic AI. Evaluate any agent trajectory against five reliability metrics and receive a tier certification, in a few lines of code.
Install
pip install hb-eval-sdk
For the LangChain integration:
pip install hb-eval-sdk[langchain]
Quick start
from hb_eval_sdk import HBEvalClient

client = HBEvalClient(
    api_key="...",         # identifies your project
    aes_key="...",         # encrypts your payload (base64, 32 bytes)
    signing_secret="...",  # signs your request (base64; never transmitted)
)

result = client.evaluate({
    "trajectory": [
        {"step": 1, "action": "chain_start"},
        {"step": 2, "action": "tool_call", "tool": "search"},
        {"step": 3, "action": "chain_end"},
    ],
    "sub_tasks": 3,
    "constraint_violations": 0,
    "recovery_episodes": [],
    "agent_id": "my-agent",
})

print(result.verdict, result.tier)
print(result.metrics)  # pei, irs, frr, ti, csi
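Hard-coding the three credentials is fine for a first try, but you will probably want to load them from the environment. The sketch below is not part of the SDK: the environment variable names and both helper functions are hypothetical. It only relies on what the comments above state, namely that the AES key is base64 and decodes to 32 bytes.

```python
import base64
import binascii
import os


def decoded_key_length(b64_key: str) -> int:
    """Return how many bytes a base64 string decodes to, or -1 if it is invalid."""
    try:
        return len(base64.b64decode(b64_key, validate=True))
    except (binascii.Error, ValueError):
        return -1


def load_credentials(env=os.environ) -> dict:
    """Read the three credentials from environment variables.

    The variable names are an assumption made for this example;
    use whatever names fit your deployment.
    """
    creds = {
        "api_key": env.get("HB_EVAL_API_KEY", ""),
        "aes_key": env.get("HB_EVAL_AES_KEY", ""),
        "signing_secret": env.get("HB_EVAL_SIGNING_SECRET", ""),
    }
    missing = [name for name, value in creds.items() if not value]
    if missing:
        raise ValueError(f"missing credentials: {', '.join(missing)}")
    # The quick start describes the AES key as base64 for 32 bytes.
    if decoded_key_length(creds["aes_key"]) != 32:
        raise ValueError("aes_key must be base64 that decodes to 32 bytes")
    return creds
```

Because the dictionary keys match the keyword arguments shown in the quick start, the result can be passed straight through as `HBEvalClient(**load_credentials())`.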
The five metrics
Every evaluation returns five reliability metrics. Any of them may be None when it is genuinely undefined for a given run. None always means "not measured", never "scored zero".
- PEI — Planning Efficiency Index
- IRS — Intentional Recovery Score (None when the run had no faults)
- FRR — Failure Resilience Rate
- TI — Traceability Index (None when no judge evaluation was made)
- CSI — Consistency Stability Index (None without enough history)
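Since None means "not measured", take care not to coerce it to zero when you report or compare metrics. The sketch below assumes the metrics can be read as a mapping keyed by the lowercase names pei, irs, frr, ti, and csi. The README does not specify the exact type of `result.metrics`, so a plain dict with made-up values stands in for it here, and both helpers are hypothetical.

```python
from typing import Dict, Mapping, Optional

METRIC_NAMES = ("pei", "irs", "frr", "ti", "csi")


def describe_metrics(metrics: Mapping[str, Optional[float]]) -> Dict[str, str]:
    """Render each metric as text, keeping None distinct from 0.0."""
    report = {}
    for name in METRIC_NAMES:
        value = metrics.get(name)
        # `value or 0.0` would be wrong here: it turns "not measured" into a score.
        report[name] = "not measured" if value is None else f"{value:.2f}"
    return report


def measured_only(metrics: Mapping[str, Optional[float]]) -> Dict[str, float]:
    """Return only the metrics that were actually measured."""
    return {
        name: metrics[name]
        for name in METRIC_NAMES
        if metrics.get(name) is not None
    }


# Illustrative values only, not real evaluation output.
sample = {"pei": 0.5, "irs": None, "frr": 0.0, "ti": None, "csi": 0.25}
print(describe_metrics(sample))
print(measured_only(sample))
```

Note that `frr` in the sample is a genuine 0.0 and stays in the measured set, while `irs` and `ti` are dropped because they were never measured.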
LangChain
from hb_eval_sdk import HBEvalCallback

callback = HBEvalCallback(api_key="...", aes_key="...", signing_secret="...")

# `agent` and `task` are your existing LangChain agent and its input.
agent.run(task, callbacks=[callback])

print(callback.last_result.verdict)
The callback observes the real run — counting genuine tool errors and detecting actual fault-and-recovery patterns — rather than assuming a clean execution.
Credentials
Your project has three credentials, issued together when the project is created. The API key is sent on each request to identify you. The AES key encrypts your payload locally. The signing secret signs your request and is never transmitted, so the signature proves the request came from you: someone who has only seen your API key cannot forge it.
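To see why a secret that never leaves your machine can still prove who sent a request, here is a generic illustration using HMAC-SHA256 from the Python standard library. This is an assumption made for explanation only: the README does not say which signing algorithm or wire format the SDK uses, and the `sign` and `verify` helpers below are hypothetical, not SDK functions.

```python
import base64
import hashlib
import hmac


def sign(body: bytes, signing_secret_b64: str) -> str:
    """Compute a hex HMAC-SHA256 signature over the request body."""
    secret = base64.b64decode(signing_secret_b64)
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(body: bytes, signature: str, signing_secret_b64: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign(body, signing_secret_b64)
    return hmac.compare_digest(expected, signature)


# Throwaway demo secrets; real ones are issued with your project.
secret = base64.b64encode(b"s" * 32).decode()
other_secret = base64.b64encode(b"x" * 32).decode()

signature = sign(b'{"agent_id": "my-agent"}', secret)

print(verify(b'{"agent_id": "my-agent"}', signature, secret))        # same body, same secret
print(verify(b'{"agent_id": "other"}', signature, secret))           # body was altered
print(verify(b'{"agent_id": "my-agent"}', signature, other_secret))  # wrong secret
```

In a scheme like this, only the signature travels with the request. A party holding the same secret can recompute it, while anyone who knows just the API key cannot produce a matching value.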
Links
- Documentation: https://github.com/hb-evalSystem/HB-System/blob/main/docs
- Repository: https://github.com/hb-evalSystem/HB-System
- Platform: https://hbeval.com
License
MIT
Download files
File details
Details for the file hb_eval_sdk-2.1.0.tar.gz.
File metadata
- Download URL: hb_eval_sdk-2.1.0.tar.gz
- Upload date:
- Size: 16.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9c8ead5bf5fe1d269bfe31e7aa3ab76ac2be5d828794afb99aa2e1a89855aecb |
| MD5 | 3ccdde366cd45fb43eb0cf63e2720f5c |
| BLAKE2b-256 | 5a01e1af48bca5e2da64c75a4ff98d73e5dafae604fe9bb2bff5b7ad9a9cff9d |
File details
Details for the file hb_eval_sdk-2.1.0-py3-none-any.whl.
File metadata
- Download URL: hb_eval_sdk-2.1.0-py3-none-any.whl
- Upload date:
- Size: 16.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 646b8b3828e89b13c7031cd62c8312770fd95e87c759351ba5a1d015cda30fe0 |
| MD5 | 9425ef617160b0a1ae9aa58d252d116f |
| BLAKE2b-256 | 96b1b5331f777d3361edfdc2904602655f3e8fce63b4a741f849df7b75210e1e |