pytest plugin for semantic LLM output testing — validate meaning, not just shape.

pytest-semantix

Semantic LLM output testing for pytest. Validate that your LLM outputs mean the right thing — not just that they match a string.

pip install pytest-semantix

Usage

The assert_semantic fixture

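def test_chatbot_is_polite(assert_semantic):
    # my_chatbot is a placeholder for your own function that calls the LLM.
    response = my_chatbot("handle angry customer")
    # Passes when the response semantically matches the intent.
    assert_semantic(response, "polite and professional")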

Each check runs locally on CPU in about 15 ms. It needs no API key and works on the text output of any LLM.

On failure:

AssertionError: Semantic check failed (score=0.12)
  Intent:  polite and professional
  Output:  "You're an idiot for asking that."
  Reason:  Text contains aggressive language

Markers

Use @pytest.mark.semantic to attach an intent to a test:

import pytest

@pytest.mark.semantic("polite and professional")
def test_with_marker(assert_semantic):
    response = my_chatbot("handle angry customer")
    assert_semantic(response)  # intent comes from the marker
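
One way a fixture could pick up the intent when assert_semantic(response) is called without one is to look up the closest semantic marker on the running test. The sketch below is an assumption about the mechanism, not the plugin's actual code: resolve_intent is a hypothetical helper, and the rule that an explicit intent overrides the marker is a guess. It relies only on pytest's documented get_closest_marker method, which a fixture would reach through request.node.

```python
from typing import Any, Optional


def resolve_intent(node: Any, explicit: Optional[str] = None) -> str:
    """Return the intent for a semantic check (hypothetical helper).

    `node` is expected to behave like a pytest item, i.e. to offer
    `get_closest_marker(name)`; inside a fixture that is `request.node`.
    """
    if explicit is not None:
        # Assumption: an intent passed directly to the call wins over the marker.
        return explicit
    marker = node.get_closest_marker("semantic")
    if marker is None or not marker.args:
        raise ValueError("no intent given and no @pytest.mark.semantic marker found")
    # @pytest.mark.semantic("polite and professional") stores the text in args[0].
    return marker.args[0]
```

Keeping the lookup in a plain function makes it easy to exercise with a fake node, without starting a pytest session.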

Intent classes

Reuse intents across tests:

from semantix import Intent

class Polite(Intent):
    """The text must be polite and professional."""

def test_polite(assert_semantic):
    assert_semantic(my_chatbot("hello"), Polite)

Negation

Test that outputs do NOT match an intent:

from semantix import Intent

class MedicalAdvice(Intent):
    """The text provides medical diagnoses or treatment recommendations."""

def test_no_medical_advice(assert_semantic):
    assert_semantic(my_chatbot("my head hurts"), ~MedicalAdvice)
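
Because ~ is applied to the class itself rather than to an instance, the operator has to be defined on a metaclass. The following is a minimal sketch of how that could work, not semantix's real implementation; SketchIntentMeta, SketchIntent, and Negated are hypothetical names used only for illustration.

```python
class SketchIntentMeta(type):
    """Metaclass so that `~SomeIntent` works on the class itself."""

    def __invert__(cls) -> "Negated":
        return Negated(cls)


class SketchIntent(metaclass=SketchIntentMeta):
    """Stand-in base class; the description lives in the subclass docstring."""

    @classmethod
    def description(cls) -> str:
        return (cls.__doc__ or "").strip()


class Negated:
    """Wraps an intent whose match should make the check fail."""

    def __init__(self, intent: type) -> None:
        self.intent = intent

    def __invert__(self) -> type:
        # Double negation returns the original intent class.
        return self.intent


class MedicalAdvice(SketchIntent):
    """The text provides medical diagnoses or treatment recommendations."""


negated = ~MedicalAdvice
print(type(negated).__name__)        # Negated
print(negated.intent.description())  # the docstring of MedicalAdvice
```

A checker that receives a Negated wrapper can then score the wrapped intent as usual and invert the pass/fail decision.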

CLI Options

--semantic-report             Print a summary of all semantic assertions
--semantic-report-json=PATH   Write results to a JSON file
--semantic-threshold=FLOAT    Global default threshold (0.0-1.0)

Report example

$ pytest --semantic-report

======================== semantic assertion report =========================
  Total: 5  |  Passed: 4  |  Failed: 1

  [PASS] tests/test_bot.py::test_polite  [12ms]
  [PASS] tests/test_bot.py::test_helpful  [14ms]
  [FAIL] tests/test_bot.py::test_no_pii  (score=0.67)  Contains email address  [11ms]
  [PASS] tests/test_bot.py::test_on_topic  [13ms]
  [PASS] tests/test_bot.py::test_concise  [15ms]

============================================================================

JSON report

$ pytest --semantic-report-json=semantic-results.json

The command writes semantic-results.json, shown here with only the first result:

{
  "summary": { "total": 5, "passed": 4, "failed": 1 },
  "results": [
    {
      "nodeid": "tests/test_bot.py::test_polite",
      "intent": "polite and professional",
      "passed": true,
      "score": null,
      "reason": "",
      "duration_ms": 12.3
    }
  ]
}
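
The JSON file is convenient for CI steps that need to act on failures. Here is a small sketch of reading it with the standard library, assuming the field names shown in the example report; failed_results is a hypothetical helper, and the second sample entry is illustrative (modeled on the text report, with an invented intent string).

```python
import json
from typing import Any, Dict, List


def failed_results(report: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the entries of a semantic report whose check did not pass."""
    return [entry for entry in report.get("results", []) if not entry["passed"]]


# In CI, load the real file instead, e.g. with json.load() on semantic-results.json.
# The data below is sample input only; the test_no_pii intent text is made up.
sample = json.loads("""
{
  "summary": {"total": 2, "passed": 1, "failed": 1},
  "results": [
    {"nodeid": "tests/test_bot.py::test_polite", "intent": "polite and professional",
     "passed": true, "score": null, "reason": "", "duration_ms": 12.3},
    {"nodeid": "tests/test_bot.py::test_no_pii", "intent": "contains no PII",
     "passed": false, "score": 0.67, "reason": "Contains email address", "duration_ms": 11.0}
  ]
}
""")

for entry in failed_results(sample):
    print(f'{entry["nodeid"]}: {entry["reason"]} (score={entry["score"]})')
```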

How it works

pytest-semantix wraps semantix-ai's assert_semantic() function as a pytest fixture. Under the hood, it uses a local NLI (Natural Language Inference) model to check whether your LLM output entails the given intent. No network calls, no API keys, no tokens burned.
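
To make the entailment check concrete, here is a rough sketch of how a pass/fail decision could be derived from a three-way NLI classifier's output. Everything in it is an assumption for illustration: the label order, the 0.5 default threshold, the handling of negation, and the logits themselves are hypothetical, and the real model and scoring inside semantix-ai may differ.

```python
import math
from typing import List, Sequence

# Assumed label order for a three-way NLI classifier; real models differ.
LABELS = ("contradiction", "neutral", "entailment")


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entailment_score(logits: Sequence[float]) -> float:
    """Probability that the premise (LLM output) entails the hypothesis (intent)."""
    return softmax(logits)[LABELS.index("entailment")]


def passes(logits: Sequence[float], threshold: float = 0.5, negate: bool = False) -> bool:
    """Pass when the score reaches the threshold, or stays below it if negated."""
    score = entailment_score(logits)
    return score < threshold if negate else score >= threshold


# Hypothetical logits, as if a model had scored a rude reply
# against the intent "polite and professional".
rude_logits = [3.0, 0.5, -1.0]
print(passes(rude_logits))               # False
print(passes(rude_logits, negate=True))  # True
```

In this sketch the threshold parameter plays the role that --semantic-threshold plays on the command line.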

License

MIT

Release files

pytest_semantix-0.1.0.tar.gz (source distribution, 6.5 kB)
pytest_semantix-0.1.0-py3-none-any.whl (built distribution, Python 3, 6.4 kB)

