Evaluation library for span-level entity extraction

spaneval

spaneval is a pure evaluation library for span-level entity extraction. It takes ground-truth and predicted (entity_type, start, end) tuples and returns precision, recall, and F1. It works with any pipeline that produces character-span predictions: LLM extractors, fine-tuned models, rule-based systems. It provides configurable overlap strategies, per-type precision/recall targets, and a scalar optimization score for automated prompt engineering.

Inspired by nervaluate, which this library generalizes.

Installation

pip install spaneval

Relationship to seqeval and nervaluate

seqeval works on IOB/BIO token sequences. LLM extractors output character spans, and converting between the two representations is lossy and inconvenient.
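One way the conversion loses information can be shown without either library. The sketch below assumes whitespace tokenization and end-exclusive character offsets; `token_spans` and `snap_to_tokens` are hypothetical helpers written for this illustration, not part of seqeval or spaneval. A character span that starts inside a token cannot be tagged at token level without widening it.

```python
import re

def token_spans(text):
    """Character (start, end) offsets of whitespace-delimited tokens, end exclusive."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]

def snap_to_tokens(span, tokens):
    """Widen a character span so it covers every token it overlaps."""
    start, end = span
    touched = [(s, e) for s, e in tokens if s < end and start < e]
    if not touched:
        return None
    return (touched[0][0], touched[-1][1])

text = "Dr.Smith arrived"
tokens = token_spans(text)

# A character-level prediction covering only "Smith".
char_span = (3, 8)

# Token-level tagging can only mark whole tokens, so the span grows.
round_tripped = snap_to_tokens(char_span, tokens)

print(text[char_span[0]:char_span[1]], "->", text[round_tripped[0]:round_tripped[1]])
# Smith -> Dr.Smith
```

After the round trip the boundary information is gone, so a strict comparison against the original span would fail even though the prediction was exact.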

nervaluate is the closest prior work and covers the SemEval 2013 strategies well. This library extends the same foundation with configurable overlap thresholds, per-type strategy assignment, and score(), a scalar optimization target for automated prompt engineering and hyperparameter search.

Quickstart

from spaneval import evaluate, to_entities

true = to_entities([
    {"entity_type": "PERSON", "start":  0, "end": 10},
    {"entity_type": "ORG",    "start": 34, "end": 44},
])
pred = to_entities([
    {"entity_type": "PERSON", "start":  0, "end":  9},  # slightly off boundary
    {"entity_type": "ORG",    "start": 34, "end": 44},  # exact
])

results = evaluate(true, pred)
results.report()

Called with no arguments, report() shows a ± range across two strategies (Strict and AnyOverlap), which gives an immediate picture of how much boundary precision affects your numbers.
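To see where the two ends of that range come from, the Quickstart data can be scored by hand. This is a minimal pure-Python sketch, not spaneval's implementation: the helpers `overlaps` and `prf`, the greedy one-to-one matching, and the end-exclusive offsets are all assumptions made for illustration, and the library may match entities differently.

```python
def overlaps(true_ent, pred_ent):
    """True if two (type, start, end) tuples share at least one character."""
    return true_ent[1] < pred_ent[2] and pred_ent[1] < true_ent[2]

def prf(true, pred, match):
    """Precision, recall, F1 with greedy one-to-one matching (an assumption)."""
    unmatched = list(true)
    tp = 0
    for p in pred:
        for t in unmatched:
            if match(t, p):
                unmatched.remove(t)  # each true entity can be matched once
                tp += 1
                break
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(true) if true else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

true = [("PERSON", 0, 10), ("ORG", 34, 44)]
pred = [("PERSON", 0, 9), ("ORG", 34, 44)]

# Strict: type and both boundaries must be identical.
strict = prf(true, pred, lambda t, p: t == p)

# Any overlap: a single shared character counts, type is ignored.
any_overlap = prf(true, pred, overlaps)

print("strict     ", strict)       # (0.5, 0.5, 0.5)
print("any overlap", any_overlap)  # (1.0, 1.0, 1.0)
```

The one-character boundary error on PERSON costs half the score under exact matching and nothing under overlap matching, which is the gap the ± range is meant to expose.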

For a guided walkthrough, see the examples:

  • examples/quickstart.py — five steps from zero-config to per-type strategy assignment
  • examples/goals.py — two steps from goal definition to automated prompt optimization

Strategies

Strategies fall into two branches, both covered in full in docs/:

Entity-count strategies score each entity as correct, incorrect, missed, or spurious:

| Strategy | Span matching | Type matching |
| --- | --- | --- |
| Strict | Exact boundaries | Required |
| Exact | Exact boundaries | Ignored |
| EntType | Any overlap | Required |
| Partial | Any overlap; 0.5 credit if boundaries differ | Ignored |
| ProportionalCoverage | Fraction of true-entity characters covered | Ignored |
| AnyOverlap | Any overlap = full credit | Ignored |
| Contains | Prediction must fully contain the true span | Ignored |
| MinimumOverlap | Configurable threshold and overlap metric | Configurable |
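The span rules in the table differ mainly in how much credit a partially overlapping prediction earns. The sketch below works through three of them for a single true/predicted pair. The function names are hypothetical and the offsets are assumed to be end-exclusive; it follows the table's descriptions rather than the library's source.

```python
def overlap_len(true_span, pred_span):
    """Number of characters shared by two (start, end) spans."""
    return max(0, min(true_span[1], pred_span[1]) - max(true_span[0], pred_span[0]))

def partial_credit(true_span, pred_span):
    """1.0 for identical boundaries, 0.5 for any other overlap, else 0.0."""
    if true_span == pred_span:
        return 1.0
    return 0.5 if overlap_len(true_span, pred_span) > 0 else 0.0

def proportional_coverage(true_span, pred_span):
    """Fraction of the true entity's characters that the prediction covers."""
    true_len = true_span[1] - true_span[0]
    return overlap_len(true_span, pred_span) / true_len if true_len else 0.0

def contains(true_span, pred_span):
    """True if the prediction fully contains the true span."""
    return pred_span[0] <= true_span[0] and true_span[1] <= pred_span[1]

true_span = (0, 10)
pred_span = (0, 9)   # one character short on the right

print(partial_credit(true_span, pred_span))         # 0.5
print(proportional_coverage(true_span, pred_span))  # 0.9
print(contains(true_span, pred_span))               # False
```

A prediction that is too short fails the containment rule but still earns most of the credit under proportional coverage, so the choice of strategy should reflect which kind of boundary error matters for the task.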

TextCoverage counts characters rather than entities — useful when the goal is text redaction rather than entity classification.
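Character counting can be sketched in a few lines. The version below is an illustration only, assuming end-exclusive offsets and ignoring entity types; `char_set` and `text_coverage` are hypothetical names, not spaneval functions.

```python
def char_set(spans):
    """Set of character offsets covered by a list of (start, end) spans."""
    chars = set()
    for start, end in spans:
        chars.update(range(start, end))
    return chars

def text_coverage(true_spans, pred_spans):
    """Character-level (precision, recall) between two span lists."""
    true_chars = char_set(true_spans)
    pred_chars = char_set(pred_spans)
    hit = len(true_chars & pred_chars)
    precision = hit / len(pred_chars) if pred_chars else 0.0
    recall = hit / len(true_chars) if true_chars else 0.0
    return precision, recall

true_spans = [(0, 10), (34, 44)]
pred_spans = [(0, 9), (34, 44)]

precision, recall = text_coverage(true_spans, pred_spans)
print(precision, recall)  # 1.0 0.95
```

For redaction, recall here reads directly as the share of sensitive characters that were covered: one missed character out of twenty leaves 5% of the sensitive text exposed.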

Per-type strategies and goals

Different entity types can be evaluated under different strategies and held to different targets:

from spaneval import Goal
from spaneval.strategies import Strict, ProportionalCoverage

# `results` is the object returned by evaluate() in the Quickstart

# different strictness per type
results.report(strategy={"PERSON": Strict(), "DATE": ProportionalCoverage()})

# precision/recall targets per type
goals = {
    "PERSON": Goal(strategy=Strict(),               recall=0.90, precision=0.80),
    "DATE":   Goal(strategy=ProportionalCoverage(),  recall=0.80, precision=0.70),
}
results.report_goals(goals)
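Conceptually, a per-type goal is a pair of thresholds that the measured precision and recall must reach. The following sketch shows that comparison in plain Python. `Target` and `unmet` are hypothetical stand-ins written for this example, not the library's Goal class or report_goals(), and the measured numbers are made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Target:
    """Hypothetical stand-in for a per-type goal: minimum recall and precision."""
    recall: float
    precision: float

def unmet(targets, measured):
    """Map each entity type to the metrics that fall short of its target."""
    shortfalls = {}
    for entity_type, target in targets.items():
        precision, recall = measured.get(entity_type, (0.0, 0.0))
        missing = []
        if precision < target.precision:
            missing.append("precision")
        if recall < target.recall:
            missing.append("recall")
        if missing:
            shortfalls[entity_type] = missing
    return shortfalls

targets = {
    "PERSON": Target(recall=0.90, precision=0.80),
    "DATE": Target(recall=0.80, precision=0.70),
}

# Made-up (precision, recall) values, for illustration only.
measured = {"PERSON": (0.85, 0.92), "DATE": (0.75, 0.60)}

print(unmet(targets, measured))  # {'DATE': ['recall']}
```

Keeping the thresholds per type makes it possible to demand high recall where a miss is costly while tolerating looser numbers elsewhere.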


Download files

Source Distribution

spaneval-0.2.1.tar.gz (127.7 kB)

Uploaded Source

Built Distribution

spaneval-0.2.1-py3-none-any.whl (16.3 kB)

Uploaded Python 3

File details

Details for the file spaneval-0.2.1.tar.gz.

File metadata

  • Download URL: spaneval-0.2.1.tar.gz
  • Size: 127.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for spaneval-0.2.1.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | d912e6945b6533aeb621ef50f18ad40bd6737c9ef997f01ec8c6021de3ac029c |
| MD5 | 9e56f61cb9801c0ef5ca29c0868fc149 |
| BLAKE2b-256 | 3e04d98aa1c1d99842326989084e11d6046ca36f806bc1115a24e70d5617e93b |

Provenance

The following attestation bundles were made for spaneval-0.2.1.tar.gz:

Publisher: publish.yml on jamblejoe/spaneval

File details

Details for the file spaneval-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: spaneval-0.2.1-py3-none-any.whl
  • Size: 16.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for spaneval-0.2.1-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | ef52e4770f4d65e6197d3ea8362500f27d6af44916ed9d7e65f96bc96fd98fa4 |
| MD5 | 7cc96e0c05f7143adb798905bfeea6e9 |
| BLAKE2b-256 | fdf07782c5123787700663299ff6f3429085694686b874496acb6435ec63fb1f |

Provenance

The following attestation bundles were made for spaneval-0.2.1-py3-none-any.whl:

Publisher: publish.yml on jamblejoe/spaneval
