semanticheck

pytest-native semantic assertions for LLM and generative AI applications.

No servers. No SaaS. No config. Works with OpenAI, Anthropic, LiteLLM, or any LLM client.


Installation

pip install semanticheck

With local embeddings (recommended, no API cost):

pip install "semanticheck[local]"

With OpenAI embeddings:

pip install "semanticheck[openai]"

Quick Start

from semanticheck import assert_intent, assert_tone, assert_no_hallucination, assert_no_pii

def test_customer_support_reply():
    # my_llm is a placeholder for your own function that calls the model.
    response = my_llm("Help me reset my password")

    assert_intent(response, "instructions for resetting a password")
    assert_tone(response, "friendly")
    assert_no_pii(response)
    assert_no_hallucination(response, known_facts=["Password reset links expire after 24 hours"])

Assertions

Function                                        What it checks
assert_intent(response, expected_intent)        Semantic match to the expected meaning
assert_tone(response, tone)                     Tone: professional, casual, friendly, formal, etc.
assert_no_hallucination(response, known_facts)  No contradiction of the known facts
assert_similar_to(response, reference)          Cosine similarity to a reference text
assert_token_budget(response, max_tokens)       Response stays within the token limit
assert_schema(response, schema)                 Response validates against a JSON Schema or Pydantic model
assert_language(response, language)             Response is written in the expected language
assert_no_pii(response)                         No emails, SSNs, credit cards, phone numbers, etc.
assert_readability(response, min_score=60)      Flesch Reading Ease score of at least min_score
assert_sentiment(response, "positive")          Sentiment polarity
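
Several of these checks come down to comparing a similarity score against a threshold. The sketch below illustrates that idea with the standard library only. It is not semanticheck's implementation: `embed_bag_of_words` is a toy stand-in for a real embedding model, and `check_similar` is a hypothetical helper name.

```python
import math
import re
from collections import Counter


def embed_bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def check_similar(response: str, reference: str, threshold: float = 0.8) -> float:
    """Hypothetical helper: raise AssertionError when the score is below threshold."""
    score = cosine_similarity(embed_bag_of_words(response), embed_bag_of_words(reference))
    if score < threshold:
        raise AssertionError(f"similarity {score:.3f} is below threshold {threshold:.3f}")
    return score
```

Word counts only reward shared vocabulary, so paraphrases score low here. A real embedding backend is what makes the comparison semantic.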

Baseline Regression Testing

import pytest
from semanticheck import record_baseline, compare_baseline

@pytest.mark.llm
def test_summarizer_regression(llm_record):
    # my_summarizer and article are placeholders for your own code and input.
    response = my_summarizer(article)
    if llm_record:
        record_baseline("summarizer_v1", response)
    else:
        compare_baseline("summarizer_v1", response, threshold=0.85)

Run with:

pytest --record-baselines                 # record golden baselines
pytest                                    # compare against them
pytest --baseline-dir ./.baselines/llm    # choose baseline directory (recommended for monorepos)

Recommended shorthand: the llm_baseline fixture names each baseline automatically from the test's node ID, so no explicit name is needed:

import pytest

@pytest.mark.llm
def test_summarizer_regression(llm_baseline):
    response = my_summarizer(article)
    llm_baseline.check(response, threshold=0.85)
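
The record-then-compare flow above can be pictured with a small standard-library sketch. Everything in it is an assumption made for illustration: the function names, the JSON file layout, and the way a node ID becomes a file name are hypothetical, and `difflib.SequenceMatcher` is a character-level stand-in for the embedding similarity a real run would use.

```python
import json
import re
from difflib import SequenceMatcher
from pathlib import Path


def baseline_path(baseline_dir: Path, node_id: str) -> Path:
    """Turn a pytest node ID into a safe file name inside baseline_dir."""
    slug = re.sub(r"[^A-Za-z0-9_.-]+", "_", node_id).strip("_")
    return baseline_dir / f"{slug}.json"


def record(baseline_dir: Path, node_id: str, response: str) -> Path:
    """Write the response as the golden baseline for this test."""
    path = baseline_path(baseline_dir, node_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"response": response}), encoding="utf-8")
    return path


def compare(baseline_dir: Path, node_id: str, response: str, threshold: float = 0.85) -> float:
    """Fail when the response drifts too far from the stored baseline."""
    path = baseline_path(baseline_dir, node_id)
    golden = json.loads(path.read_text(encoding="utf-8"))["response"]
    # SequenceMatcher is a character-level stand-in for embedding similarity.
    score = SequenceMatcher(None, golden, response).ratio()
    if score < threshold:
        raise AssertionError(f"{node_id}: similarity {score:.2f} < {threshold:.2f}")
    return score
```

Keeping one file per test makes a changed baseline show up as a reviewable diff in version control, which is one reason to point the baseline directory at a path inside the repository.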

LocalJudge

Use a local model as a judge, with zero API cost in CI:

from semanticheck import LocalJudge

judge = LocalJudge()  # uses Qwen2.5-0.5B by default
result = judge.evaluate(
    response="Paris is the capital of France.",
    criterion="Correctly answers a geography question about European capitals.",
)
assert result.passed
print(result.score, result.reasoning)
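
The general judge pattern (prompt a model for a score and a rationale, parse the reply, apply a pass threshold) can be sketched as follows. This is not how LocalJudge is implemented: the prompt wording, the reply format, the 0.5 threshold, and the names `JudgeResult`, `evaluate_with_judge`, and `fake_judge` are all assumptions. Only the three result attributes mirror the usage shown above.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class JudgeResult:
    passed: bool
    score: float
    reasoning: str


def evaluate_with_judge(
    generate: Callable[[str], str],
    response: str,
    criterion: str,
    pass_threshold: float = 0.5,
) -> JudgeResult:
    """Ask a judge model for a score and reasoning, then parse its reply."""
    prompt = (
        f"Criterion: {criterion}\n"
        f"Response: {response}\n"
        "Reply with 'SCORE: <0 to 1>' on one line and 'REASONING: <text>' on the next."
    )
    reply = generate(prompt)
    score_match = re.search(r"SCORE:\s*([0-9]*\.?[0-9]+)", reply)
    reason_match = re.search(r"REASONING:\s*(.+)", reply, flags=re.DOTALL)
    if score_match is None:
        # An unparseable reply counts as a failure rather than a silent pass.
        return JudgeResult(False, 0.0, f"unparseable judge reply: {reply!r}")
    score = min(1.0, max(0.0, float(score_match.group(1))))
    reasoning = reason_match.group(1).strip() if reason_match else ""
    return JudgeResult(score >= pass_threshold, score, reasoning)


def fake_judge(prompt: str) -> str:
    """Stub that stands in for a local model call."""
    return "SCORE: 0.9\nREASONING: Paris is indeed the capital of France."
```

Passing the model call in as `generate` keeps the parsing logic testable without loading any weights.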

pytest Plugin

semanticheck auto-registers as a pytest plugin. Available flags:

pytest --skip-llm                         # skip all @pytest.mark.llm tests
pytest --llm-threshold 0.8                # override similarity threshold globally
pytest --record-baselines                 # record golden baselines
pytest --baseline-dir ./.baselines/llm    # choose baseline directory

Available markers:

@pytest.mark.llm        # LLM semantic test
@pytest.mark.llm_slow   # slow test using local judge inference

Embedding Backends

Backend                 How to activate                        Cost
sentence-transformers   pip install "semanticheck[local]"      Free
OpenAI                  Set OPENAI_API_KEY                     Per token
Hash fallback           SEMANTICHECK_EMBED_BACKEND=fallback    Free (smoke tests only)
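
To see why a hash-based backend suits smoke tests only, consider the hashing trick sketched below. The source does not describe how the hash fallback actually works, so this is an assumed illustration of the general technique, and `hash_embed` and `cosine` are hypothetical names.

```python
import hashlib
import math


def hash_embed(text: str, dim: int = 64) -> list[float]:
    """Feature-hashing embedding: each token increments one bucket chosen by its hash."""
    vector = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vector[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))
```

Vectors built this way are deterministic and need no model download, but they carry no meaning: two synonyms land in unrelated buckets unless their hashes happen to collide. That is enough to check that a test suite runs end to end, and not enough to judge semantic quality.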

Override via env vars:

SEMANTICHECK_EMBED_BACKEND=openai
SEMANTICHECK_EMBED_MODEL=text-embedding-3-large
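
One way to pin a backend for a whole test run is to set the variable from a conftest.py before any tests execute. The snippet below is a sketch under one assumption the source does not confirm: that semanticheck reads the variable after conftest.py has been loaded. If it reads it earlier, export the variable in the shell or CI job instead.

```python
import os

# setdefault keeps any value already exported by the shell or CI job,
# so a developer can still override the backend locally.
os.environ.setdefault("SEMANTICHECK_EMBED_BACKEND", "fallback")
```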

License

MIT
