pytest plugin for semantic LLM output testing — validate meaning, not just shape.
pytest-semantix
Semantic LLM output testing for pytest. Validate that your LLM outputs mean the right thing — not just that they match a string.
pip install pytest-semantix
Usage
The assert_semantic fixture
def test_chatbot_is_polite(assert_semantic):
    response = my_chatbot("handle angry customer")
    assert_semantic(response, "polite and professional")
Runs locally on CPU in ~15ms. No API key. Works with any LLM.
On failure:
AssertionError: Semantic check failed (score=0.12)
Intent: polite and professional
Output: "You're an idiot for asking that."
Reason: Text contains aggressive language
Markers
Use @pytest.mark.semantic to attach an intent to a test:
import pytest

@pytest.mark.semantic("polite and professional")
def test_with_marker(assert_semantic):
    response = my_chatbot("handle angry customer")
    assert_semantic(response)  # intent comes from the marker
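For readers curious how a fixture can pick up the marker's intent, here is a minimal sketch built around pytest's `get_closest_marker` method. It is an assumption about the mechanism, not pytest-semantix's actual code: `resolve_intent` is a hypothetical helper, and the fake node below only stands in for `request.node` so the snippet runs without pytest.

```python
from types import SimpleNamespace


def resolve_intent(node, explicit_intent=None):
    """Return the intent for a test: an explicit argument wins, else the marker.

    `node` is anything exposing pytest's `get_closest_marker(name)` method,
    for example `request.node` inside a fixture. Hypothetical helper.
    """
    if explicit_intent is not None:
        return explicit_intent
    marker = node.get_closest_marker("semantic")
    if marker is None or not marker.args:
        raise ValueError("no intent given and no @pytest.mark.semantic marker found")
    return marker.args[0]


# Stand-in for a test item decorated with
# @pytest.mark.semantic("polite and professional").
fake_marker = SimpleNamespace(args=("polite and professional",), kwargs={})
fake_node = SimpleNamespace(
    get_closest_marker=lambda name: fake_marker if name == "semantic" else None
)

print(resolve_intent(fake_node))             # intent taken from the marker
print(resolve_intent(fake_node, "concise"))  # explicit intent takes precedence
```

Letting an explicit argument override the marker is a design choice made for this sketch; the plugin may resolve conflicts differently.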
Intent classes
Reuse intents across tests:
from semantix import Intent

class Polite(Intent):
    """The text must be polite and professional."""

def test_polite(assert_semantic):
    assert_semantic(my_chatbot("hello"), Polite)
Negation
Test that outputs do NOT match an intent:
from semantix import Intent

class MedicalAdvice(Intent):
    """The text provides medical diagnoses or treatment recommendations."""

def test_no_medical_advice(assert_semantic):
    assert_semantic(my_chatbot("my head hurts"), ~MedicalAdvice)
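Applying `~` to a class rather than an instance works in Python when the class's metaclass defines `__invert__`. The sketch below shows one way that could be wired up. It does not use semantix: `IntentMeta`, `Negated`, `description()` and this stand-in `Intent` are all hypothetical names chosen for illustration, and the real library may implement negation differently.

```python
class IntentMeta(type):
    """Metaclass so that `~SomeIntent` works on the class itself."""

    def __invert__(cls):
        return Negated(cls)


class Intent(metaclass=IntentMeta):
    """Stand-in base class; a subclass's docstring is its description."""

    @classmethod
    def description(cls):
        return (cls.__doc__ or "").strip()


class Negated:
    """Wrapper marking that the output must NOT match the wrapped intent."""

    def __init__(self, intent):
        self.intent = intent

    def __invert__(self):
        # Double negation returns the original intent class.
        return self.intent


class MedicalAdvice(Intent):
    """The text provides medical diagnoses or treatment recommendations."""


negated = ~MedicalAdvice
print(type(negated).__name__, "->", negated.intent.description())
```

An assertion helper could then check `isinstance(intent, Negated)` and flip the pass condition, so the same intent class serves both the positive and the negative test.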
CLI Options
--semantic-report              Print a summary of all semantic assertions
--semantic-report-json=PATH    Write results to a JSON file
--semantic-threshold=FLOAT     Global default threshold (0.0-1.0)
Report example
$ pytest --semantic-report
======================== semantic assertion report =========================
Total: 5 | Passed: 4 | Failed: 1
[PASS] tests/test_bot.py::test_polite [12ms]
[PASS] tests/test_bot.py::test_helpful [14ms]
[FAIL] tests/test_bot.py::test_no_pii (score=0.67) Contains email address [11ms]
[PASS] tests/test_bot.py::test_on_topic [13ms]
[PASS] tests/test_bot.py::test_concise [15ms]
============================================================================
JSON report
$ pytest --semantic-report-json=semantic-results.json
{
  "summary": { "total": 5, "passed": 4, "failed": 1 },
  "results": [
    {
      "nodeid": "tests/test_bot.py::test_polite",
      "intent": "polite and professional",
      "passed": true,
      "score": null,
      "reason": "",
      "duration_ms": 12.3
    }
  ]
}
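The JSON file is convenient for CI post-processing. As a rough sketch, the script below lists the failed assertions using only the fields shown above. `failed_results` is a hypothetical helper, and the sample data is illustrative: the second entry is modeled on the failing line in the text report, its intent string is made up, and it assumes failed entries share the shape of passing ones.

```python
import json


def failed_results(report):
    """Return the entries of report["results"] whose "passed" flag is false."""
    return [r for r in report["results"] if not r["passed"]]


# Inline sample with the same shape as the report above. In CI you would
# load the real file instead, e.g. json.load(open("semantic-results.json")).
sample = json.loads("""
{
  "summary": {"total": 2, "passed": 1, "failed": 1},
  "results": [
    {"nodeid": "tests/test_bot.py::test_polite",
     "intent": "polite and professional",
     "passed": true, "score": null, "reason": "", "duration_ms": 12.3},
    {"nodeid": "tests/test_bot.py::test_no_pii",
     "intent": "contains personal data",
     "passed": false, "score": 0.67,
     "reason": "Contains email address", "duration_ms": 11.0}
  ]
}
""")

for r in failed_results(sample):
    print(f'{r["nodeid"]}: {r["reason"]} (score={r["score"]})')
```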
How it works
pytest-semantix wraps semantix-ai's assert_semantic() function as a pytest fixture. Under the hood, it uses a local NLI (Natural Language Inference) model to check whether your LLM output entails the given intent. No network calls, no API keys, no tokens burned.
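To make the entailment idea concrete, here is a minimal sketch of the check with the scoring function injected: the LLM output is the premise, the intent is the hypothesis, and the assertion fails when the score falls below a threshold. Everything here is an assumption for illustration. `check_semantic` is not the plugin's function, the 0.5 default threshold is arbitrary, and `toy_score` is a keyword stub standing in for a real NLI model.

```python
def check_semantic(output, intent, entailment_score, threshold=0.5):
    """Raise AssertionError unless `output` entails `intent` strongly enough.

    `entailment_score(premise, hypothesis)` must return a float in [0, 1];
    in a real setup it would be backed by an NLI model. The default
    threshold of 0.5 is an assumption made for this sketch.
    """
    score = entailment_score(output, intent)
    if score < threshold:
        raise AssertionError(
            f"Semantic check failed (score={score:.2f})\n"
            f"Intent: {intent}\n"
            f'Output: "{output}"'
        )
    return score


def toy_score(premise, hypothesis):
    """Toy stand-in for an NLI model: flags a few rude words. NOT real inference."""
    rude = {"idiot", "stupid", "shut"}
    words = {w.strip(".,!?'\"").lower() for w in premise.split()}
    return 0.1 if words & rude else 0.9


print(check_semantic("Happy to help with that.", "polite and professional", toy_score))
```

Passing the scorer as an argument keeps the pass/fail logic separate from the model, so the same check can be exercised in unit tests without loading any weights.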
License
MIT