semanticheck

pytest-native semantic assertions for LLM and generative AI applications.

No servers. No SaaS. No config. Works with OpenAI, Anthropic, LiteLLM, or any LLM client.


Installation

pip install semanticheck

With local embeddings (recommended, no API cost):

pip install "semanticheck[local]"

With OpenAI embeddings:

pip install "semanticheck[openai]"

Quick Start

from semanticheck import assert_intent, assert_tone, assert_no_hallucination, assert_no_pii

def test_customer_support_reply():
    # my_llm is a placeholder for your own function that calls the model and returns its text.
    response = my_llm("Help me reset my password")

    assert_intent(response, "instructions for resetting a password")
    assert_tone(response, "friendly")
    assert_no_pii(response)
    assert_no_hallucination(response, known_facts=["Password reset links expire after 24 hours"])

Assertions

Function                                        What it checks
assert_intent(response, expected_intent)        Semantic match to expected meaning
assert_tone(response, tone)                     Tone: professional, casual, friendly, formal, etc.
assert_no_hallucination(response, known_facts)  No contradiction of known facts
assert_similar_to(response, reference)          Cosine similarity to a reference text
assert_token_budget(response, max_tokens)       Response stays within token limit
assert_schema(response, schema)                 Valid JSON Schema or Pydantic model
assert_language(response, language)             Written in expected language
assert_no_pii(response)                         No emails, SSNs, credit cards, phones, etc.
assert_readability(response, min_score=60)      Flesch Reading Ease score
assert_sentiment(response, "positive")          Sentiment polarity
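
For readers unfamiliar with the metric behind assert_similar_to, here is a minimal sketch of cosine similarity in plain Python. It is an illustration of the metric only, not semanticheck's implementation; the three-dimensional vectors are made-up stand-ins for real embeddings, and the zero-vector convention is an assumption of this sketch.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch; a zero vector has no direction
    return dot / (norm_a * norm_b)


# Toy "embeddings", invented for illustration only.
response_vec = [0.9, 0.1, 0.3]
reference_vec = [0.8, 0.2, 0.4]
unrelated_vec = [-0.1, 0.9, -0.5]

print(cosine_similarity(response_vec, reference_vec))  # close to 1: similar direction
print(cosine_similarity(response_vec, unrelated_vec))  # much lower: different direction
```

A threshold such as 0.85 then reads as "the response embedding must point in nearly the same direction as the reference embedding".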

Baseline Regression Testing

import pytest
from semanticheck import record_baseline, compare_baseline

@pytest.mark.llm
def test_summarizer_regression(llm_record):
    # my_summarizer and article are placeholders for your own code and input.
    response = my_summarizer(article)
    if llm_record:
        record_baseline("summarizer_v1", response)
    else:
        compare_baseline("summarizer_v1", response, threshold=0.85)

Run with:

pytest --record-baselines                 # record golden baselines
pytest                                    # compare against them
pytest --baseline-dir ./.baselines/llm    # choose baseline directory (recommended for monorepos)

A more ergonomic variant (recommended) uses the llm_baseline fixture, which names the baseline automatically from the test's node ID:

import pytest

@pytest.mark.llm
def test_summarizer_regression(llm_baseline):
    response = my_summarizer(article)
    llm_baseline.check(response, threshold=0.85)
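
To make the record-then-compare workflow concrete, the following is a rough, standard-library-only sketch of the idea. It is not how semanticheck stores or scores baselines: the helper names (baseline_path, check_baseline), the file naming scheme, and the use of difflib's character-level ratio in place of an embedding similarity are all assumptions made for illustration.

```python
import re
import tempfile
from difflib import SequenceMatcher
from pathlib import Path


def baseline_path(baseline_dir: Path, nodeid: str) -> Path:
    """Map a pytest node ID to a filesystem-safe file name (hypothetical scheme)."""
    safe = re.sub(r"[^A-Za-z0-9_.-]+", "_", nodeid).strip("_")
    return baseline_dir / f"{safe}.txt"


def similarity(a: str, b: str) -> float:
    """Stand-in metric: character-level ratio, NOT an embedding similarity."""
    return SequenceMatcher(None, a, b).ratio()


def check_baseline(baseline_dir: Path, nodeid: str, response: str,
                   threshold: float = 0.85, record: bool = False) -> float:
    """Record the response as the baseline, or compare it against the stored one."""
    path = baseline_path(baseline_dir, nodeid)
    if record:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(response, encoding="utf-8")
        return 1.0
    baseline = path.read_text(encoding="utf-8")  # raises if nothing was recorded
    score = similarity(baseline, response)
    if score < threshold:
        raise AssertionError(
            f"similarity {score:.2f} is below threshold {threshold:.2f} for {nodeid}"
        )
    return score


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    nodeid = "tests/test_summary.py::test_summarizer_regression"
    # First run: record the golden output.
    check_baseline(root, nodeid, "The report covers Q3 revenue growth.", record=True)
    # Later run: a near-identical response passes the threshold.
    score = check_baseline(root, nodeid, "The report covers Q3 revenue growth!")
    print(f"similarity to baseline: {score:.2f}")
```

Keying the file on the node ID is what removes the need for a hand-written baseline name such as "summarizer_v1".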

LocalJudge

Use a local model as a judge, with zero API cost in CI:

from semanticheck import LocalJudge

judge = LocalJudge()  # uses Qwen2.5-0.5B by default
result = judge.evaluate(
    response="Paris is the capital of France.",
    criterion="Correctly answers a geography question about European capitals.",
)
assert result.passed
print(result.score, result.reasoning)

pytest Plugin

semanticheck auto-registers as a pytest plugin. Available flags:

pytest --skip-llm           # skip all @pytest.mark.llm tests
pytest --llm-threshold 0.8  # override similarity threshold globally
pytest --record-baselines   # record golden baselines

Available markers:

@pytest.mark.llm        # LLM semantic test
@pytest.mark.llm_slow   # slow test using local judge inference

Embedding Backends

Backend                 How to activate                         Cost
sentence-transformers   pip install semanticheck[local]         Free
OpenAI                  Set OPENAI_API_KEY                      Per token
Hash fallback           SEMANTICHECK_EMBED_BACKEND=fallback     Free (smoke tests only)

Override the backend and model with environment variables:

SEMANTICHECK_EMBED_BACKEND=openai
SEMANTICHECK_EMBED_MODEL=text-embedding-3-large
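
One way to pin a backend for a whole test run is to set the variable from Python before any test executes, for example at the top of a conftest.py. This is a sketch under an assumption the README does not confirm: that semanticheck reads the variable after this code has run rather than earlier at import time. The variable name and the "fallback" value come from the table above; choosing it as the default is only an example.

```python
import os

# Use the free hash fallback unless the caller has already chosen a backend.
# setdefault leaves an existing value (e.g. exported in the shell or CI) untouched.
os.environ.setdefault("SEMANTICHECK_EMBED_BACKEND", "fallback")

# Show which backend the test run will request.
print("embedding backend:", os.environ["SEMANTICHECK_EMBED_BACKEND"])
```

Because setdefault does not overwrite, a CI job can still export SEMANTICHECK_EMBED_BACKEND=openai to take precedence over this default.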

License

MIT
