State-of-the-art uncertainty quantification methods for large language models.
omniuq
omniuq brings together rigorous, paper-faithful implementations of methods that measure when an LLM is unsure and why.
Install
pip install omniuq
For low-VRAM setups (e.g. Phi-4 14B on a 24 GB card), enable quantization:
pip install "omniuq[quantize]"
If you use OpenAI models as the clarifier or judge, set an OpenAI API key (the fully local setup does not need one):
export OPENAI_API_KEY=sk-...
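The demos read the key with `os.environ["OPENAI_API_KEY"]`, which raises a bare `KeyError` when the variable is unset. A minimal sketch of an earlier, clearer check follows; `require_openai_key` is a hypothetical helper written for illustration and is not part of omniuq.

```python
import os
from typing import Mapping, Optional


def require_openai_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI API key, or fail with a readable message.

    Hypothetical helper (not an omniuq function). `env` defaults to
    os.environ and exists only so the check can be exercised in isolation.
    """
    if env is None:
        env = os.environ
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before creating "
            "an OpenAI clarifier or judge."
        )
    return key
```

The returned string can be passed as the `api_key` argument shown in the demos.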
AU / EU
Every uncertain LLM answer has two possible causes.
omniuq separates them.
Aleatoric Uncertainty (AU) — uncertainty from the input itself. The question is ambiguous, underspecified, or has multiple valid interpretations. Cannot be reduced by a stronger model.
"Who won the World Series?" — high AU. Depends on year, league, team vs. player.
Epistemic Uncertainty (EU) — uncertainty from the model's lack of knowledge. The question is clear; the model just doesn't know. Can be reduced with retrieval, fine-tuning, or a stronger model.
"What is the capital of Wakanda?" — high EU. Question is clear; the model has no real answer.
Total = AU + EU.
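To make the additive split concrete, here is a small sketch of the classic Shannon-entropy version of this decomposition over clarified questions. It is an illustrative analogue only: it is not omniuq's implementation, and it is not the Spectral Uncertainty method. The assumptions are that each ambiguous question is expanded into several clarifications, that answers are sampled per clarification, and that answers are compared as exact strings. Under those assumptions, EU is the average entropy within each clarification and AU is the remainder of the total entropy, so the two sum to the total by construction. All function names are hypothetical.

```python
import math
from collections import Counter


def answer_distribution(answers):
    """Empirical distribution over the sampled answer strings."""
    counts = Counter(answers)
    n = len(answers)
    return {answer: count / n for answer, count in counts.items()}


def entropy(dist):
    """Shannon entropy in nats of a {answer: probability} mapping."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def decompose(answers_per_clarification):
    """Split total entropy into AU and EU (illustrative, not omniuq's code).

    answers_per_clarification: one list of sampled answers per clarification.
    """
    dists = [answer_distribution(a) for a in answers_per_clarification]
    k = len(dists)

    # Mixture of the per-clarification distributions (uniform weights).
    mixture = Counter()
    for dist in dists:
        for answer, p in dist.items():
            mixture[answer] += p / k

    total = entropy(mixture)
    eu = sum(entropy(d) for d in dists) / k  # doubt that remains once the question is clear
    au = total - eu                          # disagreement across interpretations
    return {"total": total, "au": au, "eu": eu}


# Ambiguous question: each reading has one confident answer, but readings disagree.
ambiguous = decompose([["Dodgers"] * 4, ["Yankees"] * 4])

# Clear question: a single reading, but the samples disagree with each other.
unknown = decompose([["Birnin Zana", "Wakanda City", "Birnin Zana", "Wakanda City"]])
```

In the first case all of the uncertainty lands in `au`; in the second it lands in `eu`.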
Methods
| Method | Decomposes | Paper | Code | Reproduced | Status |
|---|---|---|---|---|---|
| Spectral Uncertainty (Walha et al., AAAI 2026) | AU + EU | arXiv | GitHub | TriviaQA: AUROC 89.66% vs. paper 91.92% — Colab | ✅ Available |
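The Reproduced column reports AUROC. As a sketch of what that number measures, assuming the common evaluation setup in which each question gets an uncertainty score and a label saying whether the model's answer was judged incorrect, AUROC is the probability that a randomly chosen incorrect answer receives a higher uncertainty score than a randomly chosen correct one. The function below is a plain pairwise implementation for illustration; it is not taken from omniuq or from the paper's evaluation code.

```python
def auroc(scores, labels):
    """Pairwise AUROC.

    scores: uncertainty score per question (higher means less certain).
    labels: 1 if the answer was judged incorrect, 0 if judged correct.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one example of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # incorrect answer ranked as more uncertain
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))
```

A value of 1.0 means the score ranks every incorrect answer above every correct one, and 0.5 is chance level. The pairwise loop is quadratic, so for large evaluation sets a rank-based implementation is the better choice.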
Demo 1 — Spectral Uncertainty
Three ways to run it.
Paper-faithful
Phi-4 14B as target, GPT-4o as clarifier, GPT-4.1 as judge — exactly the paper's setup.
import os

from omniuq import (
    SpectralUncertainty,
    load_llm_model,
    load_openai_client,
)

# Target model: Phi-4 14B, loaded locally.
tokenizer, model = load_llm_model("microsoft/phi-4")

# Clarifier (GPT-4o) and judge (GPT-4.1) are called through the OpenAI API.
clarifier = load_openai_client(
    api_key=os.environ["OPENAI_API_KEY"],
    model="gpt-4o",
)
judge = load_openai_client(
    api_key=os.environ["OPENAI_API_KEY"],
    model="gpt-4.1",
)

uq = SpectralUncertainty(
    tokenizer, model,
    clarifier=clarifier,
    judge=judge,
)
print(uq.score("What is the capital of France?"))
Mixed: local target + OpenAI clarifier/judge
A smaller local model handles sampling, GPT-4o provides high-quality clarifications, and GPT-4.1 acts as judge.
import os

from omniuq import (
    SpectralUncertainty,
    load_llm_model,
    load_openai_client,
)

# Target model: a smaller open model, loaded locally.
tokenizer, model = load_llm_model(
    "Qwen/Qwen2.5-7B-Instruct"
)

# Clarifier and judge stay on the OpenAI API, as in the paper-faithful setup.
clarifier = load_openai_client(
    api_key=os.environ["OPENAI_API_KEY"],
    model="gpt-4o",
)
judge = load_openai_client(
    api_key=os.environ["OPENAI_API_KEY"],
    model="gpt-4.1",
)

uq = SpectralUncertainty(
    tokenizer, model,
    clarifier=clarifier,
    judge=judge,
)
print(uq.score("What is the capital of France?"))
Fully local — no API calls
Same HuggingFace model used as target, clarifier, and judge. No OpenAI key needed.
from omniuq import SpectralUncertainty, load_llm_model

# One local model serves as target, clarifier, and judge.
tokenizer, model = load_llm_model(
    "meta-llama/Llama-3.1-8B-Instruct"
)

uq = SpectralUncertainty(
    tokenizer, model,
    clarifier=(tokenizer, model),
    judge=(tokenizer, model),
)
print(uq.score("What is the capital of France?"))
Note: smaller open models tend to produce noisier clarifications than GPT-4o, so AU scores will be less reliable in fully-local mode.
License
MIT