Automated Summary Scoring & Evaluation of Retained Text
Project description
ASSERT LLM Tools
⚠️ Deprecated — this package is no longer maintained.
assert_llm_tools has been superseded by focused, independently-versioned packages:

| Capability | New package | Install |
|---|---|---|
| Summary evaluation | assert-eval | pip install assert-eval |
| Compliance note evaluation | assert-review | pip install assert-review |

Version 1.0.0 is the final release. No further updates will be made. Please migrate to the packages above.
ASSERT LLM Tools is a lightweight Python library for LLM-based text evaluation. It provides two main capabilities:
- Summary evaluation — score a summary against source text for coverage, factual accuracy, coherence, and more
- Compliance note evaluation — evaluate adviser meeting notes against regulatory frameworks (FCA, MiFID II) and return a structured gap report
All evaluation is LLM-based. No PyTorch, no BERT, no heavy dependencies.
Installation
```shell
pip install assert-llm-tools
```
Quick Start
Summary Evaluation
```python
from assert_llm_tools import evaluate_summary, LLMConfig

config = LLMConfig(
    provider="bedrock",
    model_id="us.amazon.nova-pro-v1:0",
    region="us-east-1",
)

results = evaluate_summary(
    full_text="Original long text goes here...",
    summary="Summary to evaluate goes here...",
    metrics=["coverage", "factual_consistency", "coherence"],
    llm_config=config,
)

print(results)
# {'coverage': 0.85, 'factual_consistency': 0.92, 'coherence': 0.88}
```
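Because the result is a plain dict of metric names to scores, downstream gating is straightforward. A minimal sketch, using the example output above; the threshold values here are illustrative, not library defaults:

```python
def summary_passes(results: dict, thresholds: dict) -> bool:
    """Return True when every thresholded metric meets its minimum score."""
    return all(results.get(metric, 0.0) >= minimum
               for metric, minimum in thresholds.items())

# The example output shown above:
results = {"coverage": 0.85, "factual_consistency": 0.92, "coherence": 0.88}

print(summary_passes(results, {"coverage": 0.8, "factual_consistency": 0.9}))  # True
print(summary_passes(results, {"coverage": 0.9}))                              # False
```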
Compliance Note Evaluation
```python
from assert_llm_tools import evaluate_note, LLMConfig

config = LLMConfig(
    provider="bedrock",
    model_id="us.amazon.nova-pro-v1:0",
    region="us-east-1",
)

report = evaluate_note(
    note_text="Client meeting note text goes here...",
    framework="fca_suitability_v1",
    llm_config=config,
)

print(report.overall_rating)  # "Compliant" / "Minor Gaps" / "Requires Attention" / "Non-Compliant"
print(report.overall_score)   # 0.0–1.0
print(report.passed)          # True / False

for item in report.items:
    print(f"{item.element_id}: {item.status} (score: {item.score:.2f})")
    if item.suggestions:
        for s in item.suggestions:
            print(f"  → {s}")
```
Summary Evaluation
Available Metrics
| Metric | Description |
|---|---|
| coverage | How completely the summary captures claims from the source text |
| factual_consistency | Whether claims in the summary are supported by the source |
| factual_alignment | Combined coverage + consistency score |
| topic_preservation | How well the summary preserves the main topics |
| conciseness | Information density — does the summary avoid padding? |
| redundancy | Detects repetitive content within the summary |
| coherence | Logical flow and readability of the summary |
Deprecated names (still accepted for backwards compatibility): faithfulness → use coverage; hallucination → use factual_consistency.
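For illustration, the alias handling described above amounts to a simple rename before dispatch. This mapping is a sketch of the behavior, not the library's internal code:

```python
# Deprecated metric names and their replacements, per the table above.
DEPRECATED_ALIASES = {
    "faithfulness": "coverage",
    "hallucination": "factual_consistency",
}

def normalize_metrics(metrics: list[str]) -> list[str]:
    """Map deprecated metric names onto their current equivalents."""
    return [DEPRECATED_ALIASES.get(m, m) for m in metrics]

print(normalize_metrics(["faithfulness", "coherence"]))
# ['coverage', 'coherence']
```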
Custom Evaluation Instructions
Tailor LLM evaluation criteria for your domain:
```python
results = evaluate_summary(
    full_text=text,
    summary=summary,
    metrics=["coverage", "factual_consistency"],
    llm_config=config,
    custom_prompt_instructions={
        "coverage": "Apply strict standards. Only mark a claim as covered if it is clearly and explicitly represented.",
        "factual_consistency": "Flag any claim that adds detail not present in the original text.",
    },
)
```
Verbose Output
Pass verbose=True to include per-claim reasoning in the results:

```python
results = evaluate_summary(..., verbose=True)
```
Compliance Note Evaluation
⚠️ Experimental — do not use in live or production systems.
evaluate_note() is under active development. Outputs are non-deterministic (LLM-based), the API may change between releases, and results have not been validated against real regulatory decisions. This feature is intended for research, prototyping, and internal tooling only. It is not a substitute for qualified compliance review and must not be used to make or support live regulatory or client-facing decisions.
evaluate_note()
```python
from assert_llm_tools import evaluate_note, LLMConfig
from assert_llm_tools.metrics.note.models import PassPolicy

report = evaluate_note(
    note_text=note,
    framework="fca_suitability_v1",  # built-in ID or path to a custom YAML
    llm_config=config,
    mask_pii=False,                  # mask client PII before sending to LLM
    verbose=False,                   # include LLM reasoning in GapItem.notes
    custom_instruction=None,         # additional instruction appended to all element prompts
    pass_policy=None,                # custom PassPolicy (see below)
    metadata={"note_id": "N-001"},   # arbitrary key/value pairs, passed through to GapReport
)
```
GapReport
| Field | Type | Description |
|---|---|---|
| framework_id | str | Framework used for evaluation |
| framework_version | str | Framework version |
| passed | bool | Whether the note passes the framework's policy thresholds |
| overall_score | float | Weighted mean element score, 0.0–1.0 |
| overall_rating | str | Human-readable compliance rating (see below) |
| items | List[GapItem] | Per-element evaluation results |
| summary | str | LLM-generated narrative summary of the evaluation |
| stats | GapReportStats | Counts by status and severity |
| pii_masked | bool | Whether PII masking was applied |
| metadata | dict | Caller-supplied metadata, passed through unchanged |
Overall rating values:
| Rating | Meaning |
|---|---|
| Compliant | Passed — all elements fully present |
| Minor Gaps | Passed — but some elements are partial or optional elements missing |
| Requires Attention | Failed — high/medium gaps, no critical blockers |
| Non-Compliant | Failed — one or more critical required elements missing or below threshold |
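The mapping from pass/fail state and gap severity to these labels can be pictured as follows. This is an illustrative reconstruction from the table above, not the library's actual logic:

```python
def overall_rating(passed: bool, has_critical_gap: bool, has_any_gap: bool) -> str:
    """Illustrative mapping from evaluation state to the rating labels above."""
    if passed:
        # A passing note is Compliant only when nothing is partial or missing.
        return "Minor Gaps" if has_any_gap else "Compliant"
    # A failing note is Non-Compliant when a critical element blocked it.
    return "Non-Compliant" if has_critical_gap else "Requires Attention"

print(overall_rating(passed=True, has_critical_gap=False, has_any_gap=False))
# Compliant
print(overall_rating(passed=False, has_critical_gap=True, has_any_gap=True))
# Non-Compliant
```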
GapItem
| Field | Type | Description |
|---|---|---|
| element_id | str | Element identifier from the framework |
| status | str | "present", "partial", or "missing" |
| score | float | 0.0–1.0 quality score for this element |
| evidence | Optional[str] | Quote or paraphrase from the note supporting the assessment. None when element is missing. |
| severity | str | "critical", "high", "medium", or "low" |
| required | bool | Whether this element is required by the framework |
| suggestions | List[str] | Actionable remediation suggestions for gaps (empty when status == "present") |
| notes | Optional[str] | LLM reasoning (only populated when verbose=True) |
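Given the fields above, a typical consumer walks report.items and collects the remediation work. A sketch using plain dicts in place of GapItem objects (the element IDs and suggestion text here are invented for illustration):

```python
def collect_remediation(items: list[dict]) -> list[str]:
    """Gather actionable suggestions for every element that is not fully present."""
    actions = []
    for item in items:
        if item["status"] != "present":
            for suggestion in item["suggestions"]:
                actions.append(f"{item['element_id']}: {suggestion}")
    return actions

items = [
    {"element_id": "risk_profile", "status": "present", "suggestions": []},
    {"element_id": "costs_disclosure", "status": "partial",
     "suggestions": ["State ongoing charges explicitly."]},
]
print(collect_remediation(items))
# ['costs_disclosure: State ongoing charges explicitly.']
```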
Built-in Frameworks
| Framework ID | Description |
|---|---|
| fca_suitability_v1 | FCA suitability note requirements under COBS 9.2 / PS13/1 (9 elements) |
Custom Frameworks
Pass a path to your own YAML file:
```python
report = evaluate_note(
    note_text=note,
    framework="/path/to/my_framework.yaml",
    llm_config=config,
)
```
The YAML schema mirrors the built-in frameworks. See assert_llm_tools/frameworks/fca_suitability_v1.yaml for a reference example.
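For a sense of shape only, a custom framework might look like the sketch below. The field names here are assumptions inferred from the GapItem fields; consult the bundled fca_suitability_v1.yaml for the authoritative schema.

```yaml
# Hypothetical custom framework — field names are assumptions, not the real schema.
id: my_framework_v1
version: "1.0"
elements:
  - id: client_objectives
    description: The note records the client's stated objectives.
    severity: critical
    required: true
  - id: next_steps
    description: Agreed follow-up actions are documented.
    severity: low
    required: false
```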
Configurable Pass Policy
```python
from assert_llm_tools.metrics.note.models import PassPolicy

policy = PassPolicy(
    critical_partial_threshold=0.5,   # partial critical element treated as blocker if score < this
    required_pass_threshold=0.6,      # required element must score >= this to pass
    score_correction_missing_cutoff=0.2,
    score_correction_present_min=0.5,
    score_correction_present_floor=0.7,
)

report = evaluate_note(
    note_text=note,
    framework="fca_suitability_v1",
    pass_policy=policy,
    llm_config=config,
)
```
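How the first two thresholds might be applied can be sketched with plain dicts standing in for GapItem. This mirrors the parameter comments above but is an illustrative reconstruction, not the library's evaluation code:

```python
def apply_policy(items: list[dict],
                 critical_partial_threshold: float = 0.5,
                 required_pass_threshold: float = 0.6) -> bool:
    """Illustrative pass/fail decision from per-element results."""
    for item in items:
        if (item["severity"] == "critical" and item["status"] == "partial"
                and item["score"] < critical_partial_threshold):
            return False  # partial critical element below threshold blocks the note
        if item["required"] and item["status"] == "missing":
            return False  # required element absent entirely
        if item["required"] and item["score"] < required_pass_threshold:
            return False  # required element scored too low
    return True

items = [
    {"severity": "critical", "status": "present", "score": 0.9, "required": True},
    {"severity": "high", "status": "partial", "score": 0.65, "required": True},
]
print(apply_policy(items))  # True
```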
LLM Configuration
```python
from assert_llm_tools import LLMConfig

# AWS Bedrock
config = LLMConfig(
    provider="bedrock",
    model_id="us.amazon.nova-pro-v1:0",
    region="us-east-1",
    # api_key / api_secret / aws_session_token for explicit credentials (optional — uses ~/.aws by default)
)

# OpenAI
config = LLMConfig(
    provider="openai",
    model_id="gpt-4o",
    api_key="your-openai-api-key",
)
```
Supported Bedrock Model Families
| Model Family | Example Model IDs |
|---|---|
| Amazon Nova | us.amazon.nova-pro-v1:0, amazon.nova-lite-v1:0 |
| Anthropic Claude | anthropic.claude-3-sonnet-20240229-v1:0 |
| Meta Llama | meta.llama3-70b-instruct-v1:0 |
| Mistral AI | mistral.mistral-large-2402-v1:0 |
| Cohere Command | cohere.command-r-plus-v1:0 |
| AI21 Labs | ai21.jamba-1-5-large-v1:0 |
Proxy Configuration
```python
# Single proxy
config = LLMConfig(
    provider="bedrock",
    model_id="...",
    region="us-east-1",
    proxy_url="http://proxy.example.com:8080",
)

# Protocol-specific
config = LLMConfig(
    provider="bedrock",
    model_id="...",
    region="us-east-1",
    http_proxy="http://proxy.example.com:8080",
    https_proxy="http://proxy.example.com:8443",
)

# Authenticated proxy
config = LLMConfig(
    provider="bedrock",
    model_id="...",
    region="us-east-1",
    proxy_url="http://username:password@proxy.example.com:8080",
)
Standard HTTP_PROXY / HTTPS_PROXY environment variables are also respected.
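Setting those variables from Python works too, provided it happens before the LLM client is created (the proxy host here is a placeholder):

```python
import os

# Equivalent to exporting the variables in the shell before launching Python.
os.environ["HTTP_PROXY"] = "http://proxy.example.com:8080"
os.environ["HTTPS_PROXY"] = "http://proxy.example.com:8443"
```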
PII Masking
Apply PII detection and masking before any text is sent to the LLM:
```python
# Summary evaluation
results = evaluate_summary(
    full_text=text,
    summary=summary,
    metrics=["coverage"],
    llm_config=config,
    mask_pii=True,
)

# Note evaluation
report = evaluate_note(
    note_text=note,
    framework="fca_suitability_v1",
    llm_config=config,
    mask_pii=True,
)
```
Note: mask_pii=False is the default. For production use with real client data, set mask_pii=True. Output files (e.g. --output report.json) may contain verbatim evidence quotes — treat them accordingly.
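To illustrate the effect of masking, here is a crude regex-based sketch. This is not the library's detector, which may use different techniques and cover far more PII categories:

```python
import re

def mask_basic_pii(text: str) -> str:
    """Crude illustrative masking of email addresses and long phone numbers."""
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)
    text = re.sub(r"\+?\d[\d \-]{8,}\d", "[PHONE]", text)
    return text

print(mask_basic_pii("Contact jane.doe@example.com or +44 20 7946 0958."))
# Contact [EMAIL] or [PHONE].
```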
License
MIT
File details
Details for the file assert_llm_tools-1.0.0.tar.gz.

File metadata
- Download URL: assert_llm_tools-1.0.0.tar.gz
- Size: 49.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7cb9e39528d7e3a2402bbfde0efd3f8f3ed39c68fe6fdc377f950a5a24825437 |
| MD5 | bdabd838af13f981efa5de9545fc8ab9 |
| BLAKE2b-256 | c920a3db94ea0993c236b9148719bec353e34a60fd21ce3659b2f8940517a909 |
Provenance
The following attestation bundle was made for assert_llm_tools-1.0.0.tar.gz:

Publisher: python-publish.yml on charliedouglas/assert_llm_tools
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: assert_llm_tools-1.0.0.tar.gz
- Subject digest: 7cb9e39528d7e3a2402bbfde0efd3f8f3ed39c68fe6fdc377f950a5a24825437
- Sigstore transparency entry: 973229352
- Permalink: charliedouglas/assert_llm_tools@377761c436018ad2cf56b075375db38e34dc809d
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/charliedouglas
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@377761c436018ad2cf56b075375db38e34dc809d
- Trigger Event: release
File details
Details for the file assert_llm_tools-1.0.0-py3-none-any.whl.

File metadata
- Download URL: assert_llm_tools-1.0.0-py3-none-any.whl
- Size: 60.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b11687e39a3700ce5819bd6cb19d03c0fab78e639fa94146976d18fa1b6e06fb |
| MD5 | a5ed967058805cc7cd7a4e10480f8814 |
| BLAKE2b-256 | 62b0c8a02102ed506c2991dd5c63f0013fcd78293517cc6bb6fbaf6494778fde |
Provenance
The following attestation bundle was made for assert_llm_tools-1.0.0-py3-none-any.whl:

Publisher: python-publish.yml on charliedouglas/assert_llm_tools
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: assert_llm_tools-1.0.0-py3-none-any.whl
- Subject digest: b11687e39a3700ce5819bd6cb19d03c0fab78e639fa94146976d18fa1b6e06fb
- Sigstore transparency entry: 973229360
- Permalink: charliedouglas/assert_llm_tools@377761c436018ad2cf56b075375db38e34dc809d
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/charliedouglas
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@377761c436018ad2cf56b075375db38e34dc809d
- Trigger Event: release