semanticheck

pytest-native semantic assertions for LLM and generative AI applications.

No servers. No SaaS. No config. Works with OpenAI, Anthropic, LiteLLM, or any LLM client.


Installation

pip install semanticheck

With local embeddings (recommended, no API cost):

pip install "semanticheck[local]"

With OpenAI embeddings:

pip install "semanticheck[openai]"

Quick Start

from semanticheck import assert_intent, assert_tone, assert_no_hallucination, assert_no_pii

def test_customer_support_reply():
    # my_llm stands in for your own function that calls the model and returns its text.
    response = my_llm("Help me reset my password")

    assert_intent(response, "instructions for resetting a password")
    assert_tone(response, "friendly")
    assert_no_pii(response)
    assert_no_hallucination(response, known_facts=["Password reset links expire after 24 hours"])
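
The test above relies on a `my_llm` callable that this README does not define. As a minimal sketch, assuming it is simply a function from a prompt string to a response string, a canned stand-in like the hypothetical one below keeps the test deterministic while you wire things up. It is not part of semanticheck and should be swapped for a real client call afterwards.

```python
def my_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM client call (not part of semanticheck)."""
    canned_replies = {
        "Help me reset my password": (
            "Happy to help! Open Settings, choose 'Reset password', and follow "
            "the link we email you. The link expires after 24 hours."
        ),
    }
    # Fall back to a fixed reply so unknown prompts still return a string.
    return canned_replies.get(prompt, "Sorry, I can't help with that yet.")


reply = my_llm("Help me reset my password")
```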

Assertions

| Function | What it checks |
| --- | --- |
| assert_intent(response, expected_intent) | Semantic match to the expected meaning |
| assert_tone(response, tone) | Tone: professional, casual, friendly, formal, etc. |
| assert_no_hallucination(response, known_facts) | No contradiction of known facts |
| assert_similar_to(response, reference) | Cosine similarity to a reference text |
| assert_token_budget(response, max_tokens) | Response stays within the token limit |
| assert_schema(response, schema) | Response validates against a JSON Schema or Pydantic model |
| assert_language(response, language) | Written in the expected language |
| assert_no_pii(response) | No emails, SSNs, credit cards, phones, etc. |
| assert_readability(response, min_score=60) | Flesch Reading Ease score of at least min_score |
| assert_sentiment(response, "positive") | Sentiment polarity |
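
To give a feel for what a `min_score` of 60 means, here is a rough, self-contained sketch of the standard Flesch Reading Ease formula. The syllable counter is a crude vowel-group heuristic and the helper names are hypothetical; how `assert_readability` computes its score internally is not documented here, so treat this as an illustration of the metric rather than the library's implementation.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    count = len(groups)
    # Treat a trailing "e" as silent when there is another vowel group ("make" -> 1).
    if word.lower().endswith("e") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


easy = flesch_reading_ease("Click the link. Pick a new password. You are done.")
hard = flesch_reading_ease(
    "Authentication credential regeneration necessitates verification of "
    "identity through supplementary communication channels."
)
```

Short sentences built from short words land well above 60, while the long, polysyllabic sentence falls far below it.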

Baseline Regression Testing

from semanticheck import record_baseline, compare_baseline

def test_summarizer_regression(llm_record):
    # my_summarizer and article are placeholders for your own function and input.
    response = my_summarizer(article)
    if llm_record:
        record_baseline("summarizer_v1", response)
    else:
        compare_baseline("summarizer_v1", response, threshold=0.85)

Run with:

pytest --record-baselines   # record golden baselines
pytest                      # compare against them
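
The record-then-compare workflow can be pictured with a small standalone sketch. It assumes baselines are stored as JSON files, and it uses `difflib.SequenceMatcher` as a cheap character-level stand-in for embedding similarity. The helper names `save_baseline` and `check_against_baseline` are hypothetical, and none of this is claimed to mirror how `record_baseline` and `compare_baseline` work internally.

```python
import json
import tempfile
from difflib import SequenceMatcher
from pathlib import Path


def save_baseline(directory: Path, name: str, response: str) -> Path:
    """Write the golden response to <directory>/<name>.json."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{name}.json"
    path.write_text(json.dumps({"response": response}), encoding="utf-8")
    return path


def check_against_baseline(
    directory: Path, name: str, response: str, threshold: float = 0.85
) -> float:
    """Return the similarity to the stored baseline; raise if below threshold."""
    stored = json.loads((directory / f"{name}.json").read_text(encoding="utf-8"))
    # Character-level ratio: an assumed stand-in for embedding similarity.
    score = SequenceMatcher(None, stored["response"], response).ratio()
    if score < threshold:
        raise AssertionError(f"{name}: similarity {score:.2f} < {threshold}")
    return score


with tempfile.TemporaryDirectory() as tmp:
    baselines = Path(tmp) / "baselines"
    save_baseline(baselines, "summarizer_v1", "The report covers Q3 revenue growth.")
    same = check_against_baseline(
        baselines, "summarizer_v1", "The report covers Q3 revenue growth."
    )
    try:
        check_against_baseline(baselines, "summarizer_v1", "Bananas are yellow.")
        drifted = False
    except AssertionError:
        drifted = True
```

An unchanged response passes, while an unrelated one raises `AssertionError`, which is the behaviour a regression test wants from a threshold check.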

LocalJudge

Use a local model as a judge, with zero API cost in CI:

from semanticheck import LocalJudge

judge = LocalJudge()  # uses Qwen2.5-0.5B by default
result = judge.evaluate(
    response="Paris is the capital of France.",
    criterion="Correctly answers a geography question about European capitals.",
)
assert result.passed
print(result.score, result.reasoning)

pytest Plugin

semanticheck auto-registers as a pytest plugin. Available flags:

pytest --skip-llm           # skip all @pytest.mark.llm tests
pytest --llm-threshold 0.8  # override similarity threshold globally
pytest --record-baselines   # record golden baselines

Available markers:

@pytest.mark.llm        # LLM semantic test
@pytest.mark.llm_slow   # slow test using local judge inference
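
Markers also make it possible to layer project-specific selection on top of the built-in flags. The sketch below is a `conftest.py` fragment that uses standard pytest hooks to add a hypothetical `--skip-llm-slow` flag, which skips only tests marked `llm_slow`. The flag name is an assumption made for illustration and is not one of semanticheck's own options.

```python
import pytest


def pytest_addoption(parser):
    # Hypothetical project-level flag; not one of semanticheck's own options.
    parser.addoption(
        "--skip-llm-slow",
        action="store_true",
        default=False,
        help="skip tests marked llm_slow",
    )


def pytest_collection_modifyitems(config, items):
    if not config.getoption("--skip-llm-slow"):
        return
    skip_slow = pytest.mark.skip(reason="--skip-llm-slow was given")
    for item in items:
        # item.keywords contains the names of markers applied to the test.
        if "llm_slow" in item.keywords:
            item.add_marker(skip_slow)
```

With this in place, `pytest --skip-llm-slow` would still run the faster `llm` tests while leaving out the ones that need local judge inference.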

Embedding Backends

| Backend | How to activate | Cost |
| --- | --- | --- |
| sentence-transformers | pip install semanticheck[local] | Free |
| OpenAI | Set OPENAI_API_KEY | Per token |
| Hash fallback | LLMASSERT_EMBED_BACKEND=fallback | Free (smoke tests only) |
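
The "smoke tests only" caveat on the hash fallback is easier to see with a toy example. The sketch below is an assumed, generic hash-bucket embedding, not the library's actual fallback. It maps each token to a bucket with a stable hash, so texts that share words score high while a paraphrase with different words scores low no matter how close its meaning is. The names `hash_embed` and `cosine` are hypothetical.

```python
import hashlib
import math


def hash_embed(text: str, dim: int = 64) -> list[float]:
    """Map each token to a bucket via a stable hash, then L2-normalise."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


# Same words in a different order: the bag of tokens is identical.
same_words = cosine(hash_embed("reset your password"), hash_embed("password reset your"))
# Similar meaning, no shared words: the hash has no notion of semantics.
paraphrase = cosine(hash_embed("reset your password"), hash_embed("change login credentials"))
```

Because similarity here tracks word overlap rather than meaning, an embedding of this kind can confirm that a test pipeline runs end to end but cannot stand in for a real model when judging intent or paraphrase.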

Override via env vars:

LLMASSERT_EMBED_BACKEND=openai
LLMASSERT_EMBED_MODEL=text-embedding-3-large

License

MIT
