A Hugging Face-native text scoring package for multi-dimensional quality evaluation.
🎯 omniscore
omniscore is a lightweight Python package for evaluating the quality of generated text (Natural Language Generation, NLG). It covers tasks such as Question Answering (QA), text summarization, explanations, and LLM chat interactions, and is designed to work with OmniScore models.
✨ Key Features
- Hugging Face Native: Includes custom OmniScoreConfig and OmniScoreModel classes that fit right into your existing Hugging Face workflows.
- Easy-to-Use API: High-level OmniScorer and score(...) API for processing single examples or large batches with minimal boilerplate.
- Command-Line Interface (CLI): Built-in CLI for quickly scoring local text files or one-off strings directly from your terminal.
- Standard Serialization: Full checkpoint compatibility using standard save_pretrained(...) and from_pretrained(...) methods.
- Secure Loading: Load native omniscore checkpoints safely without needing trust_remote_code=True.
- Backward Compatibility: Built-in support for legacy score_predictor checkpoints (such as QCRI/OmniScore-deberta-v3).
📑 Table of Contents
- Installation
- Quickstart: Python API
- Pre-trained Models
- Command Line Interface (CLI)
- Advanced: Hosting Custom Models
- Resources & Tutorials
🚀 Installation
You can install omniscore directly via pip. The runtime dependencies include sentencepiece to ensure models like DeBERTa-v3 load cleanly.
pip install omniscore
(Note: If installing from source for development, run python3 -m pip install -e . in the repository root.)
💻 Quickstart: Python API
Method 1: The score() Function
For one-off evaluations, you can use the high-level score function. It loads the model, processes the inputs, and returns the results.
from omniscore import score

result = score(
    predictions=["A generated summary."],
    references=["A gold summary."],
    sources=["The source document."],
    tasks=["summarization"],
    model_name_or_path="your-org/omniscore-base",
)
print("Individual Scores:", result.to_list())
print("Mean Score:", result.mean())
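The call above passes several parallel lists, one entry per example. A minimal sketch of a pre-flight check, assuming every list you pass must have the same length as predictions. check_parallel is a hypothetical helper written for this README, not part of omniscore, and the package may perform its own validation:

```python
from typing import Optional, Sequence


def check_parallel(
    predictions: Sequence[str],
    references: Optional[Sequence[str]] = None,
    sources: Optional[Sequence[str]] = None,
    tasks: Optional[Sequence[str]] = None,
) -> int:
    """Return the number of examples, or raise if any list disagrees in length.

    Hypothetical helper: omniscore itself may validate inputs differently.
    """
    n = len(predictions)
    optional_columns = (
        ("references", references),
        ("sources", sources),
        ("tasks", tasks),
    )
    for name, values in optional_columns:
        # Columns that were not supplied are skipped entirely.
        if values is not None and len(values) != n:
            raise ValueError(f"{name} has {len(values)} items, expected {n}")
    return n
```

Running a check like this before calling score(...) surfaces a misaligned dataset before any model is loaded.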
Method 2: The OmniScorer Class
If you are evaluating multiple batches or running a server, use OmniScorer to keep the model loaded in memory, so repeated calls skip the model-loading step.
from omniscore import OmniScorer

scorer = OmniScorer("your-org/omniscore-base")
result = scorer.score(
    predictions=["Candidate 1", "Candidate 2"],
    references=["Reference 1", "Reference 2"],
    sources=["Source 1", "Source 2"],
    tasks=["summarization", "summarization"],
)
print(result.to_list())
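For datasets too large to pass in one call, one option is to feed a long-lived scorer in fixed-size chunks. The sketch below is independent of omniscore: iter_chunks and score_in_chunks are hypothetical helpers, and score_fn stands for any callable that takes keyword lists and returns one result per example. Wrapping scorer.score(...).to_list() that way assumes to_list() yields one entry per input, which this README suggests but does not spell out.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def iter_chunks(
    columns: Dict[str, Sequence[str]], chunk_size: int
) -> Iterator[Dict[str, List[str]]]:
    """Yield aligned slices of every column, chunk_size examples at a time."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    if not columns:
        return
    total = len(next(iter(columns.values())))
    for start in range(0, total, chunk_size):
        yield {
            name: list(values[start:start + chunk_size])
            for name, values in columns.items()
        }


def score_in_chunks(
    score_fn: Callable[..., list],
    columns: Dict[str, Sequence[str]],
    chunk_size: int = 32,
) -> list:
    """Call score_fn once per chunk and concatenate the per-example results."""
    results: list = []
    for chunk in iter_chunks(columns, chunk_size):
        results.extend(score_fn(**chunk))
    return results


# Possible use with a loaded scorer (assumption, not verified against the package):
# score_in_chunks(lambda **kw: scorer.score(**kw).to_list(),
#                 {"predictions": preds, "sources": srcs, "tasks": tasks})
```

The chunk size of 32 is an arbitrary default; a value that fits your GPU memory is a better choice.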
🧠 Pre-trained Models
omniscore includes metadata for known hosted models and handles their remote-code loading paths internally.
Using QCRI/OmniScore-deberta-v3
Here is an example evaluating a headline using the legacy score_predictor checkpoint:
from omniscore import OmniScorer, get_example

# Load built-in examples for the model
example = get_example("QCRI/OmniScore-deberta-v3")

# Initialize the scorer
scorer = OmniScorer("QCRI/OmniScore-deberta-v3")
result = scorer.score(
    predictions=example.prediction,
    sources=example.source,
    references=example.reference,
    tasks=example.task,
)
print(result.to_list())
Equivalent explicit example, matching the Hugging Face model card:
result = scorer.score(
    predictions="Microsoft releases detailed model documentation.",
    sources="Full article text goes here.",
    tasks="headline_evaluation",
)
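The examples in this README pass both lists and bare strings to score(...). If your own code wraps the scorer, it can help to normalize inputs to lists first. A small sketch of that idea follows. as_list is a hypothetical helper, and how omniscore handles this internally is not documented here:

```python
from typing import List, Optional, Sequence, Union

TextInput = Union[str, Sequence[str]]


def as_list(value: Optional[TextInput]) -> Optional[List[str]]:
    """Wrap a single string in a list; copy sequences; pass None through."""
    if value is None:
        return None
    if isinstance(value, str):
        # A bare string is one example, not a sequence of characters.
        return [value]
    return list(value)
```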
⌨️ Command Line Interface (CLI)
Evaluate text rapidly without writing Python scripts.
Score a single example:
omniscore \
  --model QCRI/OmniScore-deberta-v3 \
  --prediction "Microsoft releases detailed model documentation." \
  --source "Full article text goes here." \
  --task headline_evaluation \
  --pretty
Score batches from files:
omniscore \
  --model QCRI/OmniScore-deberta-v3 \
  --predictions-file predictions.txt \
  --references-file references.txt \
  --sources-file sources.txt \
  --tasks-file tasks.txt \
  --pretty
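The batch command reads its inputs from plain text files. The sketch below prepares such files from Python under an assumption this README does not confirm: that each file holds one example per line and that line i of every file belongs to the same example. write_line_files is a hypothetical helper. Check the CLI help output for the actual file format before relying on it.

```python
from pathlib import Path
from typing import Dict, Sequence


def write_line_files(columns: Dict[str, Sequence[str]], out_dir: str) -> Dict[str, Path]:
    """Write each column to <out_dir>/<name>.txt, one example per line.

    Assumes a line-aligned file format (not confirmed by the omniscore docs).
    """
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must contain the same number of examples")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    paths: Dict[str, Path] = {}
    for name, values in columns.items():
        # Collapse internal whitespace so embedded newlines cannot split an example.
        lines = [" ".join(text.split()) for text in values]
        path = out / f"{name}.txt"
        path.write_text("\n".join(lines) + "\n", encoding="utf-8")
        paths[name] = path
    return paths
```

Calling write_line_files({"predictions": [...], "sources": [...], "tasks": [...]}, "inputs") would produce inputs/predictions.txt, inputs/sources.txt, and inputs/tasks.txt, which can then be passed to the matching CLI flags.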
🛠 Advanced: Hosting Custom Models
omniscore is built around the standard Transformers save/load flow. You can easily adapt a backbone model and push it to the Hugging Face Hub.
1. Build and Save Locally
from transformers import AutoTokenizer
from omniscore import OmniScoreModel

# Initialize an OmniScore model from a standard backbone
model = OmniScoreModel.from_backbone(
    "distilroberta-base",
    score_names=["quality", "faithfulness"],
    task_prefix="Task:",
    source_prefix="Document:",
    reference_prefix="Reference:",
    prediction_prefix="Summary:",
)
tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")

# Save standard checkpoints
model.save_pretrained("omniscore-checkpoint")
tokenizer.save_pretrained("omniscore-checkpoint")
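The four prefix arguments suggest that each example is flattened into a single labeled text before tokenization. The sketch below illustrates one way that could look. It is an assumption made for illustration only: build_input is a hypothetical function, and the field order, separator, and handling of missing fields in omniscore may all differ.

```python
from typing import Optional


def build_input(
    prediction: str,
    task: Optional[str] = None,
    source: Optional[str] = None,
    reference: Optional[str] = None,
    *,
    task_prefix: str = "Task:",
    source_prefix: str = "Document:",
    reference_prefix: str = "Reference:",
    prediction_prefix: str = "Summary:",
) -> str:
    """Join the available fields into one prompt, one labeled field per line.

    Illustrative only: not the template omniscore actually uses.
    """
    fields = (
        (task_prefix, task),
        (source_prefix, source),
        (reference_prefix, reference),
        (prediction_prefix, prediction),
    )
    # Fields that are None or empty are left out of the prompt.
    return "\n".join(f"{prefix} {text}" for prefix, text in fields if text)
```

Under these assumptions, a summarization example without a reference would be rendered as three lines: the task, the document, and the summary.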
2. Push to Hugging Face
Upload the model and tokenizer to the Hugging Face Hub using the standard transformers methods:
model.push_to_hub("your-org/omniscore-base")
tokenizer.push_to_hub("your-org/omniscore-base")
Users can now load your model with OmniScorer("your-org/omniscore-base").
📚 Resources & Tutorials
Check out our clean, Colab-oriented example notebook located at:
📁 examples/omniscore_qcri_deberta_v3_colab.ipynb
This notebook walks you through:
- Verifying your GPU runtime.
- Loading the QCRI/OmniScore-deberta-v3 model.
- Running documented examples via OmniScorer.
- Inspecting the returned score data structure.
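The notebook itself is not reproduced here. As a rough sketch of its first step, a GPU check could look like the following, assuming PyTorch is the backend in use. pick_device is a hypothetical helper and falls back to "cpu" when PyTorch is not installed:

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # No PyTorch available, so there is nothing to run on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print("Selected device:", pick_device())
```

In Colab, a result of "cpu" usually means the runtime type has not been switched to a GPU runtime.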
Supported Families
- Native omniscore checkpoints saved from OmniScoreModel.
- Legacy score_predictor checkpoints (e.g., QCRI/OmniScore-deberta-v3).
File details
Details for the file omniscore-0.1.0.tar.gz.
File metadata
- Download URL: omniscore-0.1.0.tar.gz
- Upload date:
- Size: 17.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5813f80e6d7ff40123ff0985282cce1b75633babb304f18a33ddebb0a8105755 |
| MD5 | a2c3f55b182135dc8c0e6c3aa0f4f4ff |
| BLAKE2b-256 | 16548d296b974c0f6a67149282e29dfbd5b2e9bd806f3f32a6e05dbaac5ca390 |
File details
Details for the file omniscore-0.1.0-py3-none-any.whl.
File metadata
- Download URL: omniscore-0.1.0-py3-none-any.whl
- Upload date:
- Size: 15.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 946cdb7d97f37baa5a71df6e7c07c0b9ed33977bf8c6a69ef16e55b0d38d3c98 |
| MD5 | 673de3b8ac23ecb7d98bc746324be5e3 |
| BLAKE2b-256 | f95fc63b78b8302ba91fc1b1cc3f0b94a5b54b80017d3f513043665509d0d535 |