Evaluation library for span-level entity extraction

spaneval

spaneval is a pure evaluation library for span-level entity extraction. It takes ground-truth and predicted (entity_type, start, end) tuples and returns precision, recall, and F1. It works with any pipeline that produces character-span predictions: LLM extractors, fine-tuned models, or rule-based systems. It provides configurable overlap strategies, per-type precision/recall targets, and a scalar optimization score for automated prompt engineering.

Inspired by nervaluate, which it generalizes.

Installation

pip install spaneval

Relationship to seqeval and nervaluate

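seqeval works on IOB/BIO token sequences. LLMs output character spans, and converting between the two is lossy and inconvenient: span boundaries have to be snapped to token boundaries, so the exact character offsets cannot always be recovered.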
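To make the lossiness concrete, here is a minimal sketch that round-trips one character span through BIO tags. It does not use spaneval or seqeval; the whitespace tokenizer and the helper names `char_spans_to_bio` and `bio_to_char_spans` are assumptions made for illustration only.

```python
import re

def char_spans_to_bio(text, spans):
    """Tag whitespace-delimited tokens with BIO labels from (type, start, end) spans."""
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    tags = []
    prev_end = 0
    for t_start, t_end in tokens:
        tag = "O"
        for etype, s, e in spans:
            if t_start < e and s < t_end:  # token overlaps the span
                # first overlapping token gets B-, later ones get I-
                tag = ("B-" if prev_end <= s else "I-") + etype
                break
        tags.append(tag)
        prev_end = t_end
    return tokens, tags

def bio_to_char_spans(tokens, tags):
    """Rebuild (type, start, end) spans from token offsets and BIO tags."""
    spans, current = [], None
    for (t_start, t_end), tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append(tuple(current))
            current = [tag[2:], t_start, t_end]
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[2] = t_end
        else:
            if current:
                spans.append(tuple(current))
            current = None
    if current:
        spans.append(tuple(current))
    return spans

text = "Dr. Ada Lovelace, mathematician"
original = [("PERSON", 4, 16)]            # text[4:16] == "Ada Lovelace"
tokens, tags = char_spans_to_bio(text, original)
round_trip = bio_to_char_spans(tokens, tags)
print(tags)        # ['O', 'B-PERSON', 'I-PERSON', 'O']
print(round_trip)  # [('PERSON', 4, 17)], the trailing comma is now inside the span
```

The recovered span ends one character later than the original because the comma shares a token with "Lovelace". A better tokenizer reduces such cases but cannot rule them out in general.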

nervaluate is the closest prior work and covers the SemEval 2013 strategies well. This library builds on the same foundation and adds configurable overlap thresholds, per-type strategy assignment, and score(), a scalar optimization target for automated prompt engineering and hyperparameter search.

Quickstart

from spaneval import evaluate, to_entities

true = to_entities([
    {"entity_type": "PERSON", "start":  0, "end": 10},
    {"entity_type": "ORG",    "start": 34, "end": 44},
])
pred = to_entities([
    {"entity_type": "PERSON", "start":  0, "end":  9},  # slightly off boundary
    {"entity_type": "ORG",    "start": 34, "end": 44},  # exact
])

results = evaluate(true, pred)
results.report()

Called with no arguments, report() shows a ± range across two strategies (Strict and AnyOverlap), which gives a quick picture of how much boundary precision affects your numbers.
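As a rough illustration of why the two strategies bracket the result, the sketch below scores the Quickstart data by hand under an exact-match rule and an any-overlap rule. It is plain Python, not spaneval's implementation: the greedy one-to-one matching, the half-open [start, end) convention, and the helper names `prf`, `strict`, and `any_overlap` are all assumptions.

```python
def prf(true, pred, match):
    """Greedy one-to-one matching; returns (precision, recall, f1)."""
    unmatched = list(true)
    tp = 0
    for p in pred:
        for t in unmatched:
            if match(t, p):
                unmatched.remove(t)  # each true entity can be matched once
                tp += 1
                break
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(true) if true else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def strict(t, p):
    # same type and identical boundaries
    return t == p

def any_overlap(t, p):
    # type ignored; spans share at least one character (end treated as exclusive)
    return t[1] < p[2] and p[1] < t[2]

true = [("PERSON", 0, 10), ("ORG", 34, 44)]
pred = [("PERSON", 0, 9), ("ORG", 34, 44)]

strict_scores = prf(true, pred, strict)
overlap_scores = prf(true, pred, any_overlap)
print(strict_scores)   # (0.5, 0.5, 0.5)
print(overlap_scores)  # (1.0, 1.0, 1.0)
```

The one-character boundary error on PERSON halves every metric under exact matching and costs nothing under any-overlap matching, so the gap between the two numbers measures how much of the error is boundary-related.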

For a guided walkthrough, see the examples:

  • examples/quickstart.py — five steps from zero-config to per-type strategy assignment
  • examples/goals.py — two steps from goal definition to automated prompt optimization

Strategies

Strategies fall into two branches, both covered in full in docs/:

Entity-count strategies score each entity as correct, incorrect, missed, or spurious:

Strategy              Span                                          Type
--------------------  --------------------------------------------  ------------
Strict                Exact boundaries                              Required
Exact                 Exact boundaries                              Ignored
EntType               Any overlap                                   Required
Partial               Any overlap; 0.5 credit if boundaries differ  Ignored
ProportionalCoverage  Fraction of true-entity characters covered    Ignored
AnyOverlap            Any overlap = full credit                     Ignored
Contains              Prediction must fully contain the true span   Ignored
MinimumOverlap        Configurable threshold and overlap metric     Configurable
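The span rules in the table can be read as per-pair credit functions. The sketch below is one possible reading of three rows (Partial, ProportionalCoverage, Contains) for a single true/predicted pair. The function names and the half-open [start, end) convention are assumptions, and the library's own scoring may differ in detail.

```python
def overlap_chars(t, p):
    """Number of characters shared by two (type, start, end) spans."""
    return max(0, min(t[2], p[2]) - max(t[1], p[1]))

def partial_credit(t, p):
    """1.0 for identical boundaries, 0.5 for any other overlap, else 0.0."""
    if overlap_chars(t, p) == 0:
        return 0.0
    return 1.0 if (t[1], t[2]) == (p[1], p[2]) else 0.5

def proportional_coverage(t, p):
    """Fraction of the true entity's characters covered by the prediction."""
    return overlap_chars(t, p) / (t[2] - t[1])

def contains(t, p):
    """1.0 only if the prediction fully contains the true span."""
    return 1.0 if p[1] <= t[1] and t[2] <= p[2] else 0.0

t = ("PERSON", 0, 10)
p = ("PERSON", 0, 9)   # one character short at the right boundary

print(partial_credit(t, p))         # 0.5
print(proportional_coverage(t, p))  # 0.9
print(contains(t, p))               # 0.0
```

The same near-miss earns 0.5, 0.9, or 0.0 depending on the rule, which is why choosing a strategy per entity type can matter.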

TextCoverage counts characters rather than entities, which is useful when the goal is text redaction rather than entity classification.
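A character-level score of this kind can be sketched by turning each set of spans into a set of character positions. This is an illustration of the idea, not the TextCoverage implementation; `covered_chars` and `text_coverage` are hypothetical helpers, and entity types are ignored on the assumption that only redacted characters matter.

```python
def covered_chars(spans):
    """Set of character offsets covered by (type, start, end) spans (end exclusive)."""
    chars = set()
    for _etype, start, end in spans:
        chars.update(range(start, end))
    return chars

def text_coverage(true, pred):
    """Character-level (precision, recall)."""
    t, p = covered_chars(true), covered_chars(pred)
    hit = len(t & p)
    precision = hit / len(p) if p else 0.0
    recall = hit / len(t) if t else 0.0
    return precision, recall

true = [("PERSON", 0, 10), ("ORG", 34, 44)]
pred = [("PERSON", 0, 9), ("ORG", 34, 44)]

precision, recall = text_coverage(true, pred)
print(precision, recall)  # 1.0 0.95
```

Here 19 of the 20 sensitive characters are covered, so recall is 0.95 even though an entity-level exact match would count the PERSON prediction as wrong.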

Per-type strategies and goals

Different entity types can be evaluated under different strategies and held to different targets:

from spaneval import Goal
from spaneval.strategies import Strict, ProportionalCoverage

# different strictness per type (results comes from evaluate() in the Quickstart)
results.report(strategy={"PERSON": Strict(), "DATE": ProportionalCoverage()})

# precision/recall targets per type
goals = {
    "PERSON": Goal(strategy=Strict(), recall=0.90, precision=0.80),
    "DATE":   Goal(strategy=ProportionalCoverage(), recall=0.80, precision=0.70),
}
results.report_goals(goals)
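Conceptually, checking goals means comparing measured per-type metrics against their targets. The sketch below shows that comparison with plain dictionaries. It does not call spaneval, the measured numbers are made up for illustration, and summing the shortfalls into one scalar is only one plausible way to build an optimization target, not necessarily how score() is defined.

```python
# Targets taken from the goals above; measured values are invented for illustration.
targets = {
    "PERSON": {"recall": 0.90, "precision": 0.80},
    "DATE":   {"recall": 0.80, "precision": 0.70},
}
measured = {
    "PERSON": {"recall": 0.85, "precision": 0.90},
    "DATE":   {"recall": 0.80, "precision": 0.75},
}

def shortfalls(targets, measured):
    """Map (entity_type, metric) to how far the measured value falls below target."""
    gaps = {}
    for etype, goal in targets.items():
        for metric, target in goal.items():
            gap = target - measured[etype][metric]
            if gap > 0:
                gaps[(etype, metric)] = gap
    return gaps

gaps = shortfalls(targets, measured)
total_shortfall = sum(gaps.values())  # hypothetical scalar: 0 means all goals met

for (etype, metric), gap in gaps.items():
    print(f"{etype} {metric} is {gap:.2f} below target")
```

A prompt-optimization loop could minimize a scalar like `total_shortfall`, since it reaches zero exactly when every per-type target is met.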
