Semantic Unit Testing
What's semantic unit testing?
Semantic unit testing is a testing approach that evaluates whether a function's implementation aligns with its documented behavior. The code is analyzed using LLMs to assess whether the implementation matches the expected behavior described in the docstring.
Here's an example of how to use it:

from suite import suite

tester = suite(model_name="openai/o3-mini")

def multiply(x: int, y: int):
    """Multiplies x by y

    Args:
        x (int): value
        y (int): value
    """
    return x + y

result = tester(multiply)
print(result)
# {'reasoning': "The function's docstring states that it should multiply x by y.
# However, the implementation returns x + y, which is addition instead of multiplication.
# Therefore, the implementation does not correctly fulfill what is described in the docstring.",
# 'passed': False}
In this example, the implementation of multiply contains an error (it uses addition instead of multiplication). When the tester is called with the multiply function, it evaluates the implementation against the docstring, providing feedback on any discrepancies. This process helps ensure that the function behaves as expected and adheres to its documentation.
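The printed result carries two fields, reasoning and passed. As a minimal sketch of how such a result can be used directly in an assert, the snippet below defines a stand-in type whose truthiness follows the verdict. The class name SemanticResult is hypothetical and is not the library's actual return type; only the two field names come from the output shown above.

```python
from dataclasses import dataclass


@dataclass
class SemanticResult:
    """Hypothetical stand-in for the tester's output (not the library's real type)."""

    reasoning: str
    passed: bool

    def __bool__(self) -> bool:
        # Truthiness follows the verdict, so `assert result` fails when passed is False.
        return self.passed


result = SemanticResult(
    reasoning="The implementation returns x + y instead of x * y.",
    passed=False,
)

if not result:
    # The reasoning text doubles as a readable failure message.
    print(f"Semantic check failed: {result.reasoning}")
```

With a plain dict, `assert result` would succeed for any non-empty result, which is why a truthiness hook of this kind matters when the verdict is asserted directly.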
Why?
- Comprehensive Coverage: Traditional unit testing focuses on specific inputs and outputs, covering only a small surface of the code. suite, on the other hand, evaluates the semantic correctness of functions by analyzing their implementation against their documentation.
- No need to write tests by hand: Writing tests by hand can be tiring and non-exhaustive. By using LLMs, we can avoid having to write specific examples one by one. This not only saves time but also ensures that a wider range of scenarios and edge cases are considered, leading to more robust testing outcomes.
- Enhanced Reasoning with LLMs: By passing code and context to LLMs, suite enables a deeper level of reasoning about the function's behavior. This capability allows for more nuanced evaluations.
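To make the coverage point concrete, here is a small illustration (not taken from the library) of an example-based test that passes even though the function is wrong. The test function name and the chosen inputs are made up for this sketch.

```python
def multiply(x: int, y: int) -> int:
    """Multiplies x by y."""
    return x + y  # bug: addition instead of multiplication


def test_multiply_example_based() -> None:
    # For these inputs, addition and multiplication give the same answer,
    # so the test passes despite the bug.
    assert multiply(2, 2) == 4
    assert multiply(0, 0) == 0


test_multiply_example_based()

# A different input exposes the mismatch between docstring and implementation.
print(multiply(2, 3), "vs expected", 2 * 3)
```

A check that reads the implementation against the docstring does not depend on which inputs the test author happened to pick.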
How?
This library uses the llm package by Simon Willison. When testing a function, its source code, docstring, and information about its dependencies (any other function called by the code under test) are retrieved and passed to an LLM for evaluation. The LLM then decides whether the implementation is correct.
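The context-gathering step can be sketched with the standard inspect module. This is only an illustration of the idea, not the library's actual code: the helper name build_prompt, the prompt wording, and the choice to pass dependencies explicitly are all assumptions. A standard-library function is used as the subject so the snippet runs anywhere.

```python
import inspect
import textwrap
from typing import Callable, Sequence


def build_prompt(func: Callable, dependencies: Sequence[Callable] = ()) -> str:
    """Assemble the text an LLM would need to judge `func` against its docstring."""
    parts = [
        "Decide whether the implementation matches its docstring.",
        f"Docstring:\n{inspect.getdoc(func) or '(no docstring)'}",
        f"Source:\n{textwrap.dedent(inspect.getsource(func))}",
    ]
    for dep in dependencies:
        # Include the source of helpers the function relies on, for extra context.
        dep_source = textwrap.dedent(inspect.getsource(dep))
        parts.append(f"Dependency {dep.__qualname__}:\n{dep_source}")
    return "\n\n".join(parts)


# textwrap.fill delegates to TextWrapper.fill, so the latter is passed as a dependency.
prompt = build_prompt(textwrap.fill, dependencies=[textwrap.TextWrapper.fill])
print(prompt)
```

The resulting string would then be sent to the chosen model, which returns its reasoning and a pass/fail verdict.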
Since we're using the llm library, we can use any model it supports. In my experience, reasoning models that support structured outputs work best (e.g. o3-mini).
Usage
To use the suite module, you can create an instance of the suite or async_suite class, depending on your needs. You will then pass the function you want to test, and suite will evaluate its implementation against its docstring, providing feedback on any discrepancies.
You have a couple of examples in the examples folder.
The intended usage of this package is for testing, so you could do something like:

# tests/test_multiply.py
from package import multiply
from suite import suite

tester = suite(model_name="openai/o3-mini")

def test_multiply():
    assert tester(multiply)
Since suite also supports async operations, you can use pytest-asyncio to speed up your tests: they don't need to run sequentially, because the bottleneck is the LLM provider rather than your laptop.
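The benefit of running checks concurrently can be illustrated without calling a real model. In the sketch below, fake_async_tester is a hypothetical stand-in that only sleeps to imitate network latency; it is not part of suite, and the functions being "checked" are arbitrary built-ins.

```python
import asyncio
import time


async def fake_async_tester(func) -> bool:
    """Hypothetical stand-in for an async tester: waits like a network call, then passes."""
    await asyncio.sleep(0.1)  # simulates the latency of an LLM request
    return True


async def check_all(funcs):
    # Launch every check at once; total time is roughly one call's latency,
    # not the sum of all of them.
    return await asyncio.gather(*(fake_async_tester(f) for f in funcs))


start = time.perf_counter()
results = asyncio.run(check_all([abs, len, sum, min, max]))
elapsed = time.perf_counter() - start

print(results, f"{elapsed:.2f}s")
```

Five sequential calls would take about half a second here, while the concurrent version finishes in roughly the time of a single call.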