Offline-first, framework-agnostic LLM evaluation primitives.
Project description
llmcalibre
llmcalibre is an offline-first, framework-agnostic library for evaluating LLM outputs. It starts with deterministic local checks, then lets you opt into heavier NLP metrics or OpenAI-compatible judge models only when you need them.
Installation
Base install, with no runtime dependencies:
pip install llmcalibre
Install from a local checkout for development:
pip install -e ".[dev]"
Optional extras:
pip install "llmcalibre[schema]"
pip install "llmcalibre[nlp]"
pip install "llmcalibre[judge]"
Extras can be combined:
pip install "llmcalibre[schema,nlp,judge]"
Features
- Normalized EvalResult objects with score, pass/fail status, rationale, and metadata.
- EvalPipeline for running multiple evaluators against one output.
- Heuristic checks for JSON format, length, required/forbidden terms, regex patterns, and JSON Schema.
- Optional offline NLP metrics for semantic similarity and ROUGE.
- Optional OpenAI-compatible LLM judge evaluator with graceful failure handling.
- Stdlib CLI for simple terminal checks.
- Lightweight pytest assertion helper.
- No base runtime dependencies.
Python API
from llmcalibre import (
    ContainsChecker,
    EvalPipeline,
    FormatChecker,
    LengthConstraint,
    RegexChecker,
)

pipeline = EvalPipeline(
    [
        FormatChecker(format="json"),
        LengthConstraint(min_chars=2, max_chars=100),
        ContainsChecker(required=["answer"], forbidden=["TODO"]),
        RegexChecker(required_patterns=[r'"answer"\s*:']),
    ]
)

results = pipeline.run('{"answer": "Paris"}')
summary = pipeline.summary(results)

print(results)
print(summary)
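To make the evaluator-and-pipeline pattern concrete, here is a minimal standalone sketch using only the standard library. It is not llmcalibre's implementation: `SketchResult`, `check_json`, `check_length`, `run_checks`, and `summarize` are hypothetical names invented for illustration, and only the result fields (score, pass/fail status, rationale, metadata) are taken from the feature list above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SketchResult:
    """Hypothetical stand-in for a normalized result (not llmcalibre's EvalResult)."""
    name: str
    score: float
    passed: bool
    rationale: str
    metadata: dict = field(default_factory=dict)


def check_json(output: str) -> SketchResult:
    # Passes when the output parses as JSON.
    try:
        json.loads(output)
    except json.JSONDecodeError as exc:
        return SketchResult("json", 0.0, False, f"invalid JSON: {exc.msg}")
    return SketchResult("json", 1.0, True, "output parses as JSON")


def check_length(output: str, min_chars: int = 2, max_chars: int = 100) -> SketchResult:
    # Passes when the character count falls inside the inclusive bounds.
    n = len(output)
    ok = min_chars <= n <= max_chars
    return SketchResult("length", 1.0 if ok else 0.0, ok,
                        f"{n} characters (allowed {min_chars}-{max_chars})",
                        {"length": n})


def run_checks(output: str, checks) -> list:
    # Run every check against the same output and collect the results in order.
    return [check(output) for check in checks]


def summarize(results) -> dict:
    # Aggregate pass/fail counts across all results.
    passed = sum(1 for r in results if r.passed)
    return {"total": len(results), "passed": passed,
            "failed": len(results) - passed,
            "all_passed": passed == len(results)}


sketch_results = run_checks('{"answer": "Paris"}', [check_json, check_length])
sketch_summary = summarize(sketch_results)
print(sketch_summary)
```

Returning a result object from every check, rather than raising on failure, lets a pipeline report all failures for one output in a single pass.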
Optional Evaluators
JSON Schema validation:
from llmcalibre import JsonSchemaChecker
schema = {
    "type": "object",
    "required": ["answer"],
    "properties": {"answer": {"type": "string"}},
}

result = JsonSchemaChecker(schema=schema).evaluate('{"answer": "Paris"}')
print(result)
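To show what the schema above actually constrains, the following sketch hand-rolls a tiny validator covering only the `type`, `required`, and `properties` keywords. It is an illustration under that assumption, not how `JsonSchemaChecker` works internally; `validate_subset` is a hypothetical helper, and real validation should rely on a complete JSON Schema validator.

```python
import json

# Only the JSON types needed for this illustration are mapped.
_TYPES = {"object": dict, "string": str, "array": list}


def validate_subset(instance, schema):
    """Return a list of error strings; an empty list means the instance is valid."""
    expected = schema.get("type")
    if expected in _TYPES and not isinstance(instance, _TYPES[expected]):
        return [f"expected {expected}, got {type(instance).__name__}"]

    errors = []
    if isinstance(instance, dict):
        # Every key listed under "required" must be present.
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"missing required key: {key}")
        # Present keys are validated recursively against their sub-schema.
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(f"{key}: {e}"
                              for e in validate_subset(instance[key], subschema))
    return errors


example_schema = {
    "type": "object",
    "required": ["answer"],
    "properties": {"answer": {"type": "string"}},
}

print(validate_subset(json.loads('{"answer": "Paris"}'), example_schema))
print(validate_subset(json.loads('{"answer": 3}'), example_schema))
```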
Offline NLP metrics:
from llmcalibre import RougeScore, SemanticSimilarity
similarity = SemanticSimilarity(threshold=0.7)
semantic_result = similarity.evaluate(
    "Paris is the capital of France.",
    reference="France's capital city is Paris.",
)

rouge = RougeScore(rouge_type="rougeL", threshold=0.5)
rouge_result = rouge.evaluate(
    "Paris is the capital of France.",
    reference="The capital of France is Paris.",
)
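For intuition about what a ROUGE-L score measures, here is a minimal pure-Python sketch based on the longest common subsequence (LCS) of tokens. It assumes lowercased alphanumeric tokenization and the F1 form of the score; the tokenizer, stemming, and exact value produced by `RougeScore` may differ, and `rouge_l_f1` is a hypothetical helper, not part of llmcalibre.

```python
import re


def _tokens(text):
    # Assumption: lowercase alphanumeric tokens, punctuation dropped.
    return re.findall(r"[a-z0-9]+", text.lower())


def _lcs_length(a, b):
    # Classic LCS dynamic programme, keeping only the previous row.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    cand, ref = _tokens(candidate), _tokens(reference)
    if not cand or not ref:
        return 0.0
    lcs = _lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


sketch_score = rouge_l_f1(
    "Paris is the capital of France.",
    "The capital of France is Paris.",
)
print(round(sketch_score, 3))
```

Under these assumptions the longest common subsequence of the two sentences is "the capital of france" (4 of 6 tokens on each side), so the sketch scores about 0.667, which would clear a threshold of 0.5.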
OpenAI-compatible judge:
from llmcalibre import OpenAIJudge
judge = OpenAIJudge(model="gpt-4o-mini")
result = judge.evaluate(
    "Paris is the capital of France.",
    prompt="What is the capital of France?",
    reference="Paris",
    criteria="Reward factual correctness and concise answers.",
)

print(result)
CLI Usage
llmcalibre check --output '{"name":"Emon"}' --format json
llmcalibre check --output "Paris is in France" --contains Paris --contains France
llmcalibre check --output-file response.txt --min-chars 50 --max-chars 500
llmcalibre check --output "Date: 2026-06-16" --regex "\\d{4}-\\d{2}-\\d{2}"
The CLI exits with:
- 0 when all checks pass.
- 1 when at least one check fails.
- 2 for usage or configuration errors.
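The exit codes above make the CLI easy to drive from a script. The sketch below shows one possible wrapper that runs a command and maps its exit code to a status string. `run_and_classify` is a hypothetical helper, and the llmcalibre command is only constructed here, not executed, so the sketch runs even where the package is not installed.

```python
import subprocess
import sys


def run_and_classify(command):
    """Run a command and map its exit code to 'pass', 'fail', or 'error'."""
    try:
        completed = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        # The executable is not installed or not on PATH.
        return "error"
    if completed.returncode == 0:
        return "pass"
    if completed.returncode == 1:
        return "fail"
    # 2 signals a usage or configuration error; treat anything else the same way.
    return "error"


# Mirrors one of the CLI examples above; built as an argument list, not run here.
LLMCALIBRE_CHECK = [
    "llmcalibre", "check",
    "--output-file", "response.txt",
    "--min-chars", "50",
    "--max-chars", "500",
]

# Demonstration with the current Python interpreter standing in for the CLI.
print(run_and_classify([sys.executable, "-c", "import sys; sys.exit(1)"]))
```

With llmcalibre installed, `run_and_classify(LLMCALIBRE_CHECK)` would classify a real check in the same way.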
Pytest Helper
from llmcalibre import ContainsChecker, FormatChecker
from llmcalibre.pytest import assert_eval


def test_llm_response():
    output = '{"city": "Paris", "country": "France"}'
    assert_eval(
        output,
        evaluators=[
            FormatChecker(format="json"),
            ContainsChecker(required=["Paris", "France"]),
        ],
    )
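As a rough picture of what an assertion helper of this kind might do, the sketch below runs several checks and raises a single `AssertionError` that lists every failure. This is an assumption about the general pattern, not the source of `assert_eval`; `assert_all_pass`, `contains_paris`, and `is_short` are hypothetical names, and checks are modeled as plain callables returning `(passed, rationale)`.

```python
def assert_all_pass(output, checks):
    """Raise AssertionError naming every failed check; return None otherwise."""
    failures = []
    for check in checks:
        passed, rationale = check(output)
        if not passed:
            name = getattr(check, "__name__", "check")
            failures.append(f"{name}: {rationale}")
    if failures:
        # One combined message, so a test report shows all failures at once.
        raise AssertionError("evaluation failed:\n" + "\n".join(failures))


def contains_paris(output):
    return ("Paris" in output, "expected the term 'Paris'")


def is_short(output):
    return (len(output) <= 10, f"expected at most 10 characters, got {len(output)}")


def test_sketch_response():
    assert_all_pass('{"city": "Paris"}', [contains_paris])
```

Raising `AssertionError` explicitly, rather than using a bare `assert` statement, keeps the helper working when Python runs with assertions disabled.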
License
MIT
Release Process
- Update the version in pyproject.toml.
- Update CHANGELOG.md.
- Merge the release changes to main.
- Create a GitHub pre-release with a tag like v0.1.0-alpha.1.
- The release workflow publishes the package to TestPyPI automatically.
- Test install from TestPyPI:
  pip install --index-url https://test.pypi.org/simple/ llmcalibre
- Create a normal GitHub Release for the final version.
- The release workflow publishes the package to PyPI automatically.