A Hugging Face-native text scoring package for multi-dimensional quality evaluation.

🎯 omniscore

omniscore is a lightweight Python package for evaluating the quality of generated text (Natural Language Generation, NLG). It covers tasks such as Question Answering (QA), text summarization, explanations, and LLM chat interactions, and is designed to work with OmniScore models.

✨ Key Features

  • Hugging Face Native: Includes custom OmniScoreConfig and OmniScoreModel classes that fit right into your existing Hugging Face workflows.
  • Easy-to-Use API: High-level OmniScorer and score(...) API for processing single examples or large batches with minimal boilerplate.
  • Command-Line Interface (CLI): Built-in CLI for quickly scoring local text files or one-off strings directly from your terminal.
  • Standard Serialization: Full checkpoint compatibility using standard save_pretrained(...) and from_pretrained(...) methods.
  • Secure Loading: Load native omniscore checkpoints safely without needing trust_remote_code=True.
  • Backward Compatibility: Built-in support for legacy score_predictor checkpoints (such as QCRI/OmniScore-deberta-v3).

📑 Table of Contents

  1. Installation
  2. Quickstart: Python API
  3. Pre-trained Models
  4. Command Line Interface (CLI)
  5. Advanced: Hosting Custom Models
  6. Resources & Tutorials

🚀 Installation

You can install omniscore directly via pip. The runtime dependencies include sentencepiece to ensure models like DeBERTa-v3 load cleanly.

pip install omniscore

(Note: If installing from source for development, run python3 -m pip install -e . in the repository root.)


💻 Quickstart: Python API

Method 1: The score() Function

For one-off evaluations, you can use the high-level score function. It loads the model, processes the inputs, and returns the results.

from omniscore import score

result = score(
    predictions=["A generated summary."],
    references=["A gold summary."],
    sources=["The source document."],
    tasks=["summarization"],
    model_name_or_path="your-org/omniscore-base",
)

print("Individual Scores:", result.to_list())
print("Mean Score:", result.mean())
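
The call above passes parallel lists, one entry per example. This README does not say how score(...) reacts when those lists have different lengths, so a small pre-check can catch the mistake before a model is loaded. The sketch below is illustrative only: check_aligned is a hypothetical helper, not part of omniscore.

```python
from typing import Optional, Sequence


def check_aligned(
    predictions: Sequence[str],
    references: Optional[Sequence[str]] = None,
    sources: Optional[Sequence[str]] = None,
    tasks: Optional[Sequence[str]] = None,
) -> int:
    """Return the number of examples, or raise if a provided field is misaligned.

    Hypothetical helper (not an omniscore API). Fields left as None are skipped.
    """
    n = len(predictions)
    named_fields = (("references", references), ("sources", sources), ("tasks", tasks))
    for name, field in named_fields:
        if field is not None and len(field) != n:
            raise ValueError(f"{name} has {len(field)} items, expected {n}")
    return n


n_examples = check_aligned(
    predictions=["A generated summary."],
    references=["A gold summary."],
    sources=["The source document."],
    tasks=["summarization"],
)
print("Examples to score:", n_examples)
```

Running a check like this first means a length mismatch is reported by field name rather than surfacing later during scoring.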

Method 2: The OmniScorer Class

If you are evaluating multiple batches or running a server, use OmniScorer to keep the model loaded in memory, so repeated calls do not pay the model-loading cost again.

from omniscore import OmniScorer

scorer = OmniScorer("your-org/omniscore-base")

result = scorer.score(
    predictions=["Candidate 1", "Candidate 2"],
    references=["Reference 1", "Reference 2"],
    sources=["Source 1", "Source 2"],
    tasks=["summarization", "summarization"],
)

print(result.to_list())
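
For large datasets it can help to feed a long-lived scorer in fixed-size chunks. The sketch below shows one way to do that under stated assumptions: score_in_chunks and iter_chunks are hypothetical helpers, the scoring function is passed in as a plain callable, and fake_score stands in for a real model so the example runs without downloading anything. Whether omniscore already batches internally is not stated here.

```python
from typing import Any, Callable, Dict, Iterator, List, Sequence


def iter_chunks(
    fields: Dict[str, Sequence[str]], chunk_size: int
) -> Iterator[Dict[str, List[str]]]:
    """Yield aligned slices of every field, chunk_size examples at a time."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    if not fields:
        return
    total = len(next(iter(fields.values())))
    for start in range(0, total, chunk_size):
        yield {
            name: list(values[start:start + chunk_size])
            for name, values in fields.items()
        }


def score_in_chunks(
    score_fn: Callable[..., List[Any]],
    fields: Dict[str, Sequence[str]],
    chunk_size: int = 32,
) -> List[Any]:
    """Call score_fn once per chunk and concatenate the per-example results."""
    scores: List[Any] = []
    for chunk in iter_chunks(fields, chunk_size):
        scores.extend(score_fn(**chunk))
    return scores


def fake_score(predictions, references, sources, tasks):
    """Stand-in scorer (assumption): returns the prediction length as a score."""
    return [float(len(p)) for p in predictions]


fields = {
    "predictions": ["Candidate 1", "Candidate 2", "Candidate 3"],
    "references": ["Reference 1", "Reference 2", "Reference 3"],
    "sources": ["Source 1", "Source 2", "Source 3"],
    "tasks": ["summarization"] * 3,
}
print(score_in_chunks(fake_score, fields, chunk_size=2))
```

With a real scorer, the callable could be something like `lambda **kw: scorer.score(**kw).to_list()`, assuming to_list() returns one entry per example as the examples above suggest.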

🧠 Pre-trained Models

omniscore includes metadata for known hosted models and handles their remote-code loading paths internally.

Using QCRI/OmniScore-deberta-v3

Here is an example evaluating a headline using the legacy score_predictor checkpoint:

from omniscore import OmniScorer, get_example

# Load built-in examples for the model
example = get_example("QCRI/OmniScore-deberta-v3")

# Initialize the scorer
scorer = OmniScorer("QCRI/OmniScore-deberta-v3")

result = scorer.score(
    predictions=example.prediction,
    sources=example.source,
    references=example.reference,
    tasks=example.task,
)

print(result.to_list())

The same call written out explicitly, matching the Hugging Face model card (this reuses the scorer created above):

result = scorer.score(
    predictions="Microsoft releases detailed model documentation.",
    sources="Full article text goes here.",
    tasks="headline_evaluation",
)

⌨️ Command Line Interface (CLI)

Score text from the terminal without writing a Python script.

Score a single example:

omniscore \
  --model QCRI/OmniScore-deberta-v3 \
  --prediction "Microsoft releases detailed model documentation." \
  --source "Full article text goes here." \
  --task headline_evaluation \
  --pretty

Score batches from files:

omniscore \
  --model QCRI/OmniScore-deberta-v3 \
  --predictions-file predictions.txt \
  --references-file references.txt \
  --sources-file sources.txt \
  --tasks-file tasks.txt \
  --pretty
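
The batch command above reads four text files. Their exact format is not documented in this README; the sketch below assumes one example per line, aligned by line number across files, and prepares such files from Python lists. write_batch_files is a hypothetical helper, and the one-per-line layout should be confirmed against the CLI help before relying on it.

```python
import tempfile
from pathlib import Path
from typing import Dict, Sequence


def write_batch_files(
    directory: Path, fields: Dict[str, Sequence[str]]
) -> Dict[str, Path]:
    """Write each field to <directory>/<name>.txt, one example per line.

    Assumption: the CLI expects line-aligned files (not confirmed by the README).
    """
    lengths = {len(values) for values in fields.values()}
    if len(lengths) > 1:
        raise ValueError("all fields must contain the same number of examples")
    paths: Dict[str, Path] = {}
    for name, values in fields.items():
        # Collapse internal whitespace so each example stays on a single line.
        lines = [" ".join(value.split()) for value in values]
        path = directory / f"{name}.txt"
        path.write_text("\n".join(lines) + "\n", encoding="utf-8")
        paths[name] = path
    return paths


with tempfile.TemporaryDirectory() as tmp:
    paths = write_batch_files(
        Path(tmp),
        {
            "predictions": ["Candidate 1", "Candidate 2"],
            "references": ["Reference 1", "Reference 2"],
            "sources": ["Source 1", "Source 2"],
            "tasks": ["summarization", "summarization"],
        },
    )
    for name, path in paths.items():
        line_count = len(path.read_text(encoding="utf-8").splitlines())
        print(f"{path.name}: {line_count} lines")
```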

🛠 Advanced: Hosting Custom Models

omniscore is built around the standard Transformers save/load flow. You can easily adapt a backbone model and push it to the Hugging Face Hub.

1. Build and Save Locally

from transformers import AutoTokenizer
from omniscore import OmniScoreModel

# Initialize an OmniScore model from a standard backbone
model = OmniScoreModel.from_backbone(
    "distilroberta-base",
    score_names=["quality", "faithfulness"],
    task_prefix="Task:",
    source_prefix="Document:",
    reference_prefix="Reference:",
    prediction_prefix="Summary:",
)

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")

# Save standard checkpoints
model.save_pretrained("omniscore-checkpoint")
tokenizer.save_pretrained("omniscore-checkpoint")
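
The prefix arguments passed to from_backbone(...) label each field of an example. How omniscore actually joins the fields into a model input is not described in this README; the sketch below shows one plausible layout purely for intuition. build_input, DEFAULT_PREFIXES, and the newline separator are assumptions, not the library's implementation.

```python
from typing import Dict, Optional

# Prefix values copied from the from_backbone(...) call above.
DEFAULT_PREFIXES: Dict[str, str] = {
    "task": "Task:",
    "source": "Document:",
    "reference": "Reference:",
    "prediction": "Summary:",
}


def build_input(
    prediction: str,
    task: Optional[str] = None,
    source: Optional[str] = None,
    reference: Optional[str] = None,
    prefixes: Dict[str, str] = DEFAULT_PREFIXES,
) -> str:
    """Join the provided fields into one prefixed string (hypothetical layout)."""
    ordered = (
        ("task", task),
        ("source", source),
        ("reference", reference),
        ("prediction", prediction),
    )
    # Fields left as None are omitted, e.g. reference-free evaluation.
    parts = [f"{prefixes[key]} {value}" for key, value in ordered if value is not None]
    return "\n".join(parts)


print(build_input(
    "A generated summary.",
    task="summarization",
    source="The source document.",
))
```

Seeing the fields laid out this way makes it clearer why the prefixes should match the task: "Summary:" suits summarization, while a headline or QA model would likely want different labels.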

2. Push to Hugging Face

Push the model and tokenizer to the Hugging Face Hub using the standard push_to_hub(...) methods:

model.push_to_hub("your-org/omniscore-base")
tokenizer.push_to_hub("your-org/omniscore-base")

Users can then load your model with OmniScorer("your-org/omniscore-base").


📚 Resources & Tutorials

A Colab-oriented example notebook is available at: 📁 examples/omniscore_qcri_deberta_v3_colab.ipynb

This notebook walks you through:

  • Verifying your GPU runtime.
  • Loading the QCRI/OmniScore-deberta-v3 model.
  • Running documented examples via OmniScorer.
  • Inspecting the returned score data structure.

Supported Families

  • Native omniscore checkpoints saved from OmniScoreModel.
  • Legacy score_predictor checkpoints (e.g., QCRI/OmniScore-deberta-v3).
