Gaussia
AI evaluation framework for measuring fairness, quality, and safety of AI models and assistants.
Installation
pip install gaussia
With specific metric dependencies:
pip install "gaussia[toxicity]"  # Toxicity analysis
pip install "gaussia[bias]"      # Bias detection
pip install "gaussia[metrics]"   # All metrics
pip install "gaussia[all]"       # Everything
Quick Start
from gaussia import Retriever, Dataset, Batch
from gaussia.metrics import Context

# 1. Define your data source
class MyRetriever(Retriever):
    def load_dataset(self) -> list[Dataset]:
        return [
            Dataset(
                session_id="session-1",
                assistant_id="assistant-1",
                language="en",
                context="France is a country in Western Europe.",
                conversation=[
                    Batch(
                        qa_id="q1",
                        query="Where is France?",
                        assistant="France is located in Western Europe.",
                        ground_truth_assistant="France is a country in Western Europe.",
                    )
                ],
            )
        ]

# 2. Run a metric
metrics = Context.run(retriever=MyRetriever())
Metrics
| Metric | Description | Install extra |
|---|---|---|
| Context | Evaluates response alignment with provided context | — |
| Conversational | Dialogue quality via Grice's maxims (quality, quantity, relation, manner), plus memory and language checks | — |
| BestOf | King-of-the-hill tournament comparison of multiple assistants | — |
| Agentic | Agent evaluation with pass@K and tool correctness | — |
| Toxicity | Cluster-based toxicity profiling with demographic and sentiment analysis | [toxicity] |
| Bias | Bias detection across protected attributes using guardians | [bias] |
| Humanity | Emotion, empathy, and human-like quality analysis | [humanity] |
| Regulatory | Compliance evaluation against regulatory documents | [regulatory] |
| VisionSimilarity | VLM description comparison via semantic similarity | [vision] |
| VisionHallucination | Hallucination detection in VLM outputs | [vision] |
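Two of the ideas named in the table, pass@K and a king-of-the-hill tournament, are easy to illustrate in isolation. The sketch below is not Gaussia's implementation: `pass_at_k`, `king_of_the_hill`, and the `judge` callable are hypothetical names, and the pass@K formula is the commonly used unbiased estimator `1 - C(n-c, k) / C(n, k)`, which Gaussia may or may not use.

```python
from math import comb
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def pass_at_k(n: int, c: int, k: int) -> float:
    """Chance that at least one of k attempts, drawn without replacement
    from n recorded attempts of which c succeeded, is a success."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k attempts failed, giving 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def king_of_the_hill(candidates: Sequence[T], judge: Callable[[T, T], T]) -> T:
    """Keep a running champion; each challenger faces the champion once."""
    if not candidates:
        raise ValueError("need at least one candidate")
    champion = candidates[0]
    for challenger in candidates[1:]:
        # judge is a hypothetical pairwise comparison returning the winner.
        champion = judge(champion, challenger)
    return champion


if __name__ == "__main__":
    print(pass_at_k(n=10, c=3, k=1))
    print(king_of_the_hill(["a", "abc", "ab"], judge=lambda x, y: max(x, y, key=len)))
```

A king-of-the-hill pass needs only one comparison per challenger, which is cheaper than a full round robin but makes the result depend on the order of the candidates.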
Features
Guardians
Pluggable bias detection backends:
from gaussia.guardians import IBMGraniteGuardian, LLamaGuardGuardian
from gaussia.metrics import Bias  # assumption: exported alongside Context

metrics = Bias.run(retriever=MyRetriever(), guardian=IBMGraniteGuardian())
Statistical Modes
Choose between frequentist and Bayesian aggregation:
from gaussia import FrequentistMode, BayesianMode
metrics = Context.run(retriever=MyRetriever(), statistical_mode=FrequentistMode())
metrics = Context.run(retriever=MyRetriever(), statistical_mode=BayesianMode())
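To make the difference between the two modes concrete, here is a minimal standalone sketch for a simple pass/fail rate. It does not show how `FrequentistMode` or `BayesianMode` work internally: the function names, the Wald interval, and the uniform Beta(1, 1) prior are all assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def frequentist_summary(successes: int, trials: int, confidence: float = 0.95):
    """Point estimate with a normal-approximation (Wald) confidence interval."""
    p = successes / trials
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sqrt(p * (1 - p) / trials)
    # Clip to [0, 1] because the approximation can overshoot near the edges.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def bayesian_summary(successes: int, trials: int,
                     prior_a: float = 1.0, prior_b: float = 1.0):
    """Posterior mean and variance of a rate under a Beta(prior_a, prior_b) prior."""
    a = prior_a + successes
    b = prior_b + (trials - successes)
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


if __name__ == "__main__":
    print(frequentist_summary(8, 10))
    print(bayesian_summary(8, 10))
```

With few trials the two summaries differ noticeably, because the prior pulls the Bayesian mean toward 0.5; with many trials they converge.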
Synthetic Data Generation
Generate evaluation datasets from documents:
from gaussia.generators import BaseGenerator, create_markdown_loader
loader = create_markdown_loader(path="./docs")
generator = BaseGenerator(context_loader=loader)
datasets = generator.generate()
Explainability
Token-level attribution analysis:
from gaussia.explainability import AttributionExplainer
explainer = AttributionExplainer(method="lime")
attributions = explainer.explain(text="Your input text")
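For readers new to token-level attribution, the sketch below shows the general idea with leave-one-out occlusion, a simpler technique than LIME: drop each token in turn and record how much a score changes. It is independent of `AttributionExplainer`; `occlusion_attributions` and `toy_score` are hypothetical names, and the scorer is a stand-in for a real model.

```python
from typing import Callable


def occlusion_attributions(
    text: str, score: Callable[[str], float]
) -> list[tuple[str, float]]:
    """Attribute score to each whitespace token by removing it and re-scoring."""
    tokens = text.split()
    baseline = score(text)
    attributions = []
    for i, token in enumerate(tokens):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        # Positive value: the token pushed the score up.
        attributions.append((token, baseline - score(reduced)))
    return attributions


def toy_score(text: str) -> float:
    """Stand-in model: counts words from a tiny negative-word list."""
    negative = {"terrible", "awful"}
    return float(sum(word in negative for word in text.lower().split()))


if __name__ == "__main__":
    for token, weight in occlusion_attributions("the food was terrible", toy_score):
        print(f"{token:>10}  {weight:+.1f}")
```

Occlusion needs one extra model call per token, so it scales linearly with input length; LIME instead fits a local surrogate model over many random perturbations.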
Prompt Optimization
Optimize prompts using evolutionary and multi-objective strategies:
from gaussia.prompt_optimizer import GEPAOptimizer, MIPROv2Optimizer
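One common building block of multi-objective optimization is keeping only the candidates that no other candidate beats on every objective (the Pareto front). The sketch below illustrates that selection step for prompt candidates scored on two objectives. It is not taken from `GEPAOptimizer` or `MIPROv2Optimizer`; `pareto_front` and the example scores are hypothetical.

```python
def dominates(a: tuple[float, ...], b: tuple[float, ...]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: dict[str, tuple[float, ...]]) -> list[str]:
    """Return the names of candidates not dominated by any other (higher is better)."""
    return [
        name
        for name, scores in candidates.items()
        if not any(
            dominates(other_scores, scores)
            for other_name, other_scores in candidates.items()
            if other_name != name
        )
    ]


if __name__ == "__main__":
    # Hypothetical (accuracy, brevity) scores for four prompt variants.
    prompts = {
        "A": (0.9, 0.2),
        "B": (0.5, 0.5),
        "C": (0.4, 0.4),
        "D": (0.2, 0.9),
    }
    print(pareto_front(prompts))
```

An evolutionary loop would typically mutate or recombine the surviving candidates and repeat the selection over several generations.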
Documentation
Full documentation is available at docs.gaussia.ai.
Requirements
- Python >= 3.11
License
MIT