pytest-texts-score

Texts content similarity scoring plugin

These details have been verified by PyPI

Project links

Repository

GitHub Statistics

Maintainers

vodilpat

These details have not been verified by PyPI

Project description

A pytest plugin for semantic text similarity scoring using Large Language Models (LLMs).

It enables robust assertions over meaning, not surface text, making it ideal for validating LLM outputs, RAG systems, summaries, and other generated content.

The plugin evaluates similarity by prompting an LLM to extract and answer factual questions, producing Precision (Completeness), Recall (Correctness), and F1 scores.

Features

✔ Semantic comparison beyond keyword matching
✔ Standard IR metrics: F1, Precision, Recall
✔ Azure OpenAI support via pytest configuration
✔ Readable aliases: completeness ↔ precision, correctness ↔ recall
✔ CI-friendly aggregation to reduce LLM variance

Requirements

Python >=3.10,<4.0
pytest >=8.4.2
Azure OpenAI subscription with a deployed model (e.g., GPT-4)

Installation

Install from PyPI:

pip install pytest-texts-score

Configuration

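Configuration is provided via pytest.ini, and any value can be overridden with CLI arguments. The settings below are listed with hyphens, while the example pytest.ini spells the same names with underscores.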

Required settings

llm-api-key — Azure OpenAI API key
llm-endpoint — Azure OpenAI resource endpoint
llm-api-version — API version (e.g. 2024-05-01)
llm-deployment — Deployment name
llm-model — Model identifier (e.g. gpt-4)

Optional settings

llm-max-tokens — Maximum response tokens (default: 8192)

Example pytest.ini

[pytest]
llm_api_key = YOUR_API_KEY
llm_endpoint = https://your-resource.openai.azure.com/
llm_api_version = 2024-05-01
llm_deployment = your-deployment
llm_model = gpt-4
llm_max_tokens = 8192

Override any value via CLI:

pytest --llm-temperature=0.5
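
To keep the API key out of version control, one option is to generate pytest.ini from environment variables before the test run. The sketch below is not part of the plugin: the render_pytest_ini helper and the environment variable names (LLM_API_KEY and so on) are hypothetical, and only the option names are taken from the example pytest.ini above.

```python
import configparser
import io

# Option names as they appear in the example pytest.ini.
OPTIONS = [
    "llm_api_key",
    "llm_endpoint",
    "llm_api_version",
    "llm_deployment",
    "llm_model",
    "llm_max_tokens",
]


def render_pytest_ini(env):
    """Build pytest.ini text from a mapping such as os.environ.

    Assumption: each option is read from the upper-cased variable of the
    same name (llm_api_key -> LLM_API_KEY). Unset variables are skipped.
    """
    # interpolation=None keeps '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser["pytest"] = {
        name: env[name.upper()] for name in OPTIONS if name.upper() in env
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Example with placeholder values instead of real credentials.
example_env = {"LLM_API_KEY": "YOUR_API_KEY", "LLM_MODEL": "gpt-4"}
print(render_pytest_ini(example_env))
```

In a CI job, the returned text would be written to pytest.ini with os.environ as the input mapping, so the key lives only in the job's secret store.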

Usage

You can use the plugin either through direct imports or through the texts_score fixture.

Direct import

from pytest_texts_score import texts_expect_f1_equal

def test_similarity():
    expected = "The quick brown fox jumps over a dog."
    actual = "A fast brown fox leaps over a dog."

    # Assert that the F1 score for the two texts is 1.0.
    texts_expect_f1_equal(expected, actual, 1.0)

Fixture-based usage

The texts_score fixture exposes all assertion helpers in a dictionary.

def test_similarity(texts_score):
    expected = "The quick brown fox jumps over a dog."
    actual = "A fast brown fox leaps over a dog."

    texts_score["expect_f1_equal"](expected, actual, 1.0)

Documentation

Documentation is available on the project's documentation site.

Available Assertions

Metrics overview

Recall (Correctness): measures how much information from the expected text is present in the given text.
Precision (Completeness): measures how much information in the given text is supported by the expected text.
F1 score: harmonic mean of precision and recall.
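
As a worked illustration of the definitions above, the sketch below computes the three scores from simple counts. It assumes a counting formulation (facts found or supported divided by facts total); the function name and its parameters are hypothetical, and the plugin's internal computation may differ.

```python
def precision_recall_f1(expected_found, expected_total,
                        given_supported, given_total):
    """Compute precision, recall, and F1 from fact counts.

    expected_found / expected_total: facts from the expected text that
        are present in the given text (recall).
    given_supported / given_total: facts from the given text that are
        supported by the expected text (precision).
    """
    recall = expected_found / expected_total if expected_total else 0.0
    precision = given_supported / given_total if given_total else 0.0
    if precision + recall == 0:
        f1 = 0.0  # avoid division by zero when both scores are zero
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# 3 of 4 expected facts are present; all 3 given facts are supported.
precision, recall, f1 = precision_recall_f1(3, 4, 3, 3)
print(precision, recall, round(f1, 3))  # 1.0 0.75 0.857
```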

Single-run assertions

These execute one LLM evaluation. *_equal variants are convenience wrappers around *_range.

▶ F1 score

texts_expect_f1_equal
texts_expect_f1_range

▶ Precision / Completeness

texts_expect_precision_equal
texts_expect_precision_range
texts_expect_completeness_equal (alias)
texts_expect_completeness_range (alias)

▶ Recall / Correctness

texts_expect_recall_equal
texts_expect_recall_range
texts_expect_correctness_equal (alias)
texts_expect_correctness_range (alias)
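
The relationship between the *_equal and *_range variants can be pictured with the sketch below. It works on an already computed score rather than on texts, and the function names, parameters, and the tolerance argument are hypothetical; the plugin's actual signatures are not shown here beyond the usage examples above.

```python
def expect_range(score, min_score, max_score):
    """Assert that a score lies within [min_score, max_score]."""
    assert min_score <= score <= max_score, (
        f"score {score:.3f} outside [{min_score:.3f}, {max_score:.3f}]"
    )


def expect_equal(score, target, tolerance=0.0):
    """Assert that a score matches target by delegating to expect_range.

    The tolerance parameter is an illustrative assumption; with the
    default of 0.0 the range collapses to the single value target.
    """
    expect_range(score, target - tolerance, target + tolerance)


expect_equal(1.0, 1.0)          # passes: exact match
expect_equal(0.78, 0.8, 0.05)   # passes: within 0.05 of the target
```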

Aggregated assertions

These perform multiple evaluations and aggregate the result. Recommended for CI/CD pipelines to reduce LLM nondeterminism.

Supported aggregations: min, max, median, mean / average.

▶ F1 score

texts_agg_f1_min
texts_agg_f1_max
texts_agg_f1_median
texts_agg_f1_mean
texts_agg_f1_average

▶ Precision / Completeness

texts_agg_precision_min
texts_agg_precision_max
texts_agg_precision_median
texts_agg_precision_mean
texts_agg_precision_average
texts_agg_completeness_min
texts_agg_completeness_max
texts_agg_completeness_median
texts_agg_completeness_mean
texts_agg_completeness_average

▶ Recall / Correctness

texts_agg_recall_min
texts_agg_recall_max
texts_agg_recall_median
texts_agg_recall_mean
texts_agg_recall_average
texts_agg_correctness_min
texts_agg_correctness_max
texts_agg_correctness_median
texts_agg_correctness_mean
texts_agg_correctness_average
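
The idea behind the aggregated assertions can be sketched with the standard library. The aggregate_score helper and the replayed scores below are hypothetical stand-ins for repeated LLM evaluations, and treating average as an alias of mean is an assumption based on the "mean / average" wording above.

```python
import statistics

AGGREGATORS = {
    "min": min,
    "max": max,
    "median": statistics.median,
    "mean": statistics.mean,
    "average": statistics.mean,  # assumed to be an alias of mean
}


def aggregate_score(score_once, runs, how):
    """Call score_once() `runs` times and combine the results."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    scores = [score_once() for _ in range(runs)]
    return AGGREGATORS[how](scores)


# Stand-in for a noisy LLM evaluation: replays fixed scores in order.
fake_scores = iter([0.9, 0.7, 0.8])
print(aggregate_score(lambda: next(fake_scores), runs=3, how="median"))  # 0.8
```

Asserting on min is the strictest choice, since every run must clear the threshold, while median tolerates an occasional outlier run.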

License

Distributed under the terms of the MIT license.

Issues & Support

Please report bugs or feature requests via the GitHub issue tracker.

Release history

1.0.1 (Dec 17, 2025)
1.0.0 (Dec 13, 2025, this version)
0.2.1 (Nov 26, 2025)
0.2.0 (Nov 24, 2025)

Download files

Source Distribution

pytest_texts_score-1.0.0.tar.gz (27.5 kB)

Uploaded Dec 13, 2025 Source

Built Distribution

pytest_texts_score-1.0.0-py3-none-any.whl (23.5 kB)

Uploaded Dec 13, 2025 Python 3

File details

Details for the file pytest_texts_score-1.0.0.tar.gz.

File metadata

Download URL: pytest_texts_score-1.0.0.tar.gz
Upload date: Dec 13, 2025
Size: 27.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pytest_texts_score-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`51fc7a9f2d8fcc25be53fbfbf6f2f1345695d2c5a1a19ed0356f0baa9cda4baa`
MD5	`bc96006cb311fcd2f8ad4361c5ee6035`
BLAKE2b-256	`c04586db9e1f2bfdb41f7d33342e1cb843079ed765d63538690ec94803d0da81`
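
A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch; the helper names are hypothetical, and the path passed to verify_download is whatever location the file was saved to.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest published above for pytest_texts_score-1.0.0.tar.gz.
EXPECTED_SHA256 = (
    "51fc7a9f2d8fcc25be53fbfbf6f2f1345695d2c5a1a19ed0356f0baa9cda4baa"
)


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True when the file at path matches the expected digest."""
    return sha256_of_file(path) == expected
```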

Provenance

The following attestation bundles were made for pytest_texts_score-1.0.0.tar.gz:

Publisher: pypi.yml on VodilaPat/pytest-texts-score

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pytest_texts_score-1.0.0.tar.gz
- Subject digest: 51fc7a9f2d8fcc25be53fbfbf6f2f1345695d2c5a1a19ed0356f0baa9cda4baa
- Sigstore transparency entry: 763529654
- Sigstore integration time: Dec 13, 2025
Source repository:
- Permalink: VodilaPat/pytest-texts-score@971593a83d70c57adc05b7ecdd8df2655a6fedbb
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/VodilaPat
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi.yml@971593a83d70c57adc05b7ecdd8df2655a6fedbb
- Trigger Event: push

File details

Details for the file pytest_texts_score-1.0.0-py3-none-any.whl.

File metadata

Download URL: pytest_texts_score-1.0.0-py3-none-any.whl
Upload date: Dec 13, 2025
Size: 23.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pytest_texts_score-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b7bc40bcc48611d0fce00330d0480beefa2adc9e7fb7dc504807968c75d684b9`
MD5	`ee0a44470b6bf1efce0dc1afdabebaa1`
BLAKE2b-256	`0b3316f0637c1a7c788ca2f900989ebeaf51378a05abec077f66997332ad09b3`

Provenance

The following attestation bundles were made for pytest_texts_score-1.0.0-py3-none-any.whl:

Publisher: pypi.yml on VodilaPat/pytest-texts-score

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pytest_texts_score-1.0.0-py3-none-any.whl
- Subject digest: b7bc40bcc48611d0fce00330d0480beefa2adc9e7fb7dc504807968c75d684b9
- Sigstore transparency entry: 763529657
- Sigstore integration time: Dec 13, 2025
Source repository:
- Permalink: VodilaPat/pytest-texts-score@971593a83d70c57adc05b7ecdd8df2655a6fedbb
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/VodilaPat
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi.yml@971593a83d70c57adc05b7ecdd8df2655a6fedbb
- Trigger Event: push

pytest-texts-score 1.0.0
