Gaussia - AI evaluation framework for measuring fairness, quality, and safety of AI models and assistants

Gaussia

AI evaluation framework for measuring fairness, quality, and safety of AI models and assistants.

Installation

pip install gaussia

With specific metric dependencies:

# Quote the requirement so shells such as zsh do not treat the brackets as a glob.
pip install "gaussia[toxicity]"    # Toxicity analysis
pip install "gaussia[bias]"        # Bias detection
pip install "gaussia[evalhub]"     # EvalHub provider adapter
pip install "gaussia[metrics]"     # All metrics
pip install "gaussia[all]"         # Everything

Quick Start

from gaussia import Retriever, Dataset, Batch
from gaussia.metrics import Context

# 1. Define your data source
class MyRetriever(Retriever):
    def load_dataset(self) -> list[Dataset]:
        return [
            Dataset(
                session_id="session-1",
                assistant_id="assistant-1",
                language="en",
                context="France is a country in Western Europe.",
                conversation=[
                    Batch(
                        qa_id="q1",
                        query="Where is France?",
                        assistant="France is located in Western Europe.",
                        ground_truth_assistant="France is a country in Western Europe.",
                    )
                ],
            )
        ]

# 2. Run a metric
metrics = Context.run(retriever=MyRetriever())
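
In practice the data returned by `load_dataset` often comes from flat logs rather than hand-written objects. The sketch below groups flat question/answer rows into one nested dict per session, with keys that mirror the keyword arguments used in the Quick Start. The flat row layout and the helper name `group_rows_by_session` are assumptions made for illustration; the sketch deliberately stops at plain dicts and does not import Gaussia.

```python
from typing import Any


def group_rows_by_session(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Group flat Q&A rows into one nested dict per session, preserving row order."""
    sessions: dict[str, dict[str, Any]] = {}
    for row in rows:
        # Create the session entry the first time its session_id is seen.
        session = sessions.setdefault(
            row["session_id"],
            {
                "session_id": row["session_id"],
                "assistant_id": row["assistant_id"],
                "language": row["language"],
                "context": row["context"],
                "conversation": [],
            },
        )
        # Each row contributes one turn to the session's conversation.
        session["conversation"].append(
            {
                "qa_id": row["qa_id"],
                "query": row["query"],
                "assistant": row["assistant"],
                "ground_truth_assistant": row["ground_truth_assistant"],
            }
        )
    return list(sessions.values())


# Hypothetical flat rows, reusing the Quick Start example values.
rows = [
    {
        "session_id": "session-1",
        "assistant_id": "assistant-1",
        "language": "en",
        "context": "France is a country in Western Europe.",
        "qa_id": "q1",
        "query": "Where is France?",
        "assistant": "France is located in Western Europe.",
        "ground_truth_assistant": "France is a country in Western Europe.",
    },
]
sessions = group_rows_by_session(rows)
```

Inside `load_dataset`, each nested dict could then be turned into a `Dataset` holding a list of `Batch` objects, as in the Quick Start.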

Metrics

| Metric | Description | Install extra |
|---|---|---|
| Context | Evaluates response alignment with provided context | - |
| Conversational | Dialogue quality via Grice's maxims (memory, language, quality, quantity, relation, manner) | - |
| BestOf | King-of-the-hill tournament comparison of multiple assistants | - |
| Agentic | Agent evaluation with pass@K and tool correctness | - |
| Toxicity | Cluster-based toxicity profiling with demographic and sentiment analysis | [toxicity] |
| Bias | Bias detection across protected attributes using guardians | [bias] |
| Humanity | Emotion, empathy, and human-like quality analysis | [humanity] |
| Regulatory | Compliance evaluation against regulatory documents | [regulatory] |
| VisionSimilarity | VLM description comparison via semantic similarity | [vision] |
| VisionHallucination | Hallucination detection in VLM outputs | [vision] |
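
The Agentic metric mentions pass@K. For readers unfamiliar with it, pass@K is the probability that at least one of K sampled attempts succeeds. The sketch below shows the widely used unbiased estimator, 1 - C(n - c, k) / C(n, k), for n attempts of which c passed. Whether Gaussia computes pass@K exactly this way is an assumption, and `pass_at_k` is an illustrative name rather than a Gaussia function.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled attempts of which c passed."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) counts the k-subsets made only of failing attempts.
    # math.comb returns 0 when k > n - c, so the estimate is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # 10 attempts, 3 of them passed.
    print(pass_at_k(n=10, c=3, k=1))
    print(pass_at_k(n=10, c=3, k=5))
```

With k = 1 the estimate reduces to the plain pass rate c / n; larger k rewards agents that succeed at least occasionally.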

Features

Guardians

Pluggable bias detection backends:

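from gaussia.guardians import IBMGraniteGuardian, LLamaGuardGuardian
# Assumes Bias is exported from gaussia.metrics in the same way as Context.
from gaussia.metrics import Bias

# MyRetriever is the Retriever subclass defined in the Quick Start.
metrics = Bias.run(retriever=MyRetriever(), guardian=IBMGraniteGuardian())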

Statistical Modes

Choose between frequentist and Bayesian aggregation:

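from gaussia import FrequentistMode, BayesianMode
from gaussia.metrics import Context

# Same metric and retriever (MyRetriever from the Quick Start), two aggregation modes.
frequentist_metrics = Context.run(retriever=MyRetriever(), statistical_mode=FrequentistMode())
bayesian_metrics = Context.run(retriever=MyRetriever(), statistical_mode=BayesianMode())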
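
To give a feel for how the two styles of aggregation differ, here is a small standard-library sketch. It contrasts a frequentist summary (sample mean and standard error) with a Bayesian posterior mean for a pass rate under a Beta prior. This is a generic illustration and not Gaussia's implementation; the function names and the uniform Beta(1, 1) default prior are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def frequentist_summary(scores: list[float]) -> tuple[float, float]:
    """Return the sample mean and its standard error (needs at least 2 scores)."""
    return mean(scores), stdev(scores) / sqrt(len(scores))


def bayesian_pass_rate(
    passes: int, total: int, prior_a: float = 1.0, prior_b: float = 1.0
) -> float:
    """Posterior mean of a pass rate under a Beta(prior_a, prior_b) prior."""
    return (prior_a + passes) / (prior_a + prior_b + total)


# Four hypothetical pass/fail outcomes: three passes and one failure.
scores = [1.0, 1.0, 1.0, 0.0]
freq_mean, freq_se = frequentist_summary(scores)
bayes_mean = bayesian_pass_rate(passes=3, total=4)
```

With only four observations the Bayesian estimate is pulled toward the prior mean of 0.5, which is the usual reason to prefer it on small evaluation sets.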

Synthetic Data Generation

Generate evaluation datasets from documents:

from gaussia.generators import BaseGenerator, create_markdown_loader

loader = create_markdown_loader(path="./docs")
generator = BaseGenerator(context_loader=loader)
datasets = generator.generate()

Explainability

Token-level attribution analysis:

from gaussia.explainability import AttributionExplainer

explainer = AttributionExplainer(method="lime")
attributions = explainer.explain(text="Your input text")
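
The call above selects LIME, which fits a local surrogate model over many perturbed inputs. To show what a token-level attribution is in the simplest terms, the sketch below uses leave-one-out occlusion instead: remove each token in turn and record how much a score drops. It is a different and cruder technique than LIME, it does not reflect how `AttributionExplainer` works internally, and `toy_score` is a hypothetical stand-in for a real model.

```python
from collections.abc import Callable


def occlusion_attributions(
    tokens: list[str], score: Callable[[list[str]], float]
) -> list[tuple[str, float]]:
    """Attribute a score to each token by removing it and measuring the drop."""
    baseline = score(tokens)
    results = []
    for i, token in enumerate(tokens):
        without = tokens[:i] + tokens[i + 1:]
        results.append((token, baseline - score(without)))
    return results


def toy_score(tokens: list[str]) -> float:
    """Hypothetical stand-in for a model: counts occurrences of 'terrible'."""
    return float(sum(t == "terrible" for t in tokens))


attributions = occlusion_attributions(["the", "food", "was", "terrible"], toy_score)
```

Tokens whose removal changes the score the most receive the largest attribution; here only "terrible" affects the toy score.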

Prompt Optimization

Optimize prompts using evolutionary and multi-objective strategies:

from gaussia.prompt_optimizer import GEPAOptimizer, MIPROv2Optimizer
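
Multi-objective optimization generally keeps the candidates that no other candidate beats on every objective at once (the Pareto front). The sketch below filters hypothetical prompt candidates scored on two objectives where higher is better. It illustrates the general idea only; it is not the `GEPAOptimizer` or `MIPROv2Optimizer` API, and the candidate names and scores are made up.

```python
def dominates(a: tuple[float, ...], b: tuple[float, ...]) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: dict[str, tuple[float, ...]]) -> list[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, scores in candidates.items()
        if not any(dominates(other, scores) for other in candidates.values())
    ]


# Hypothetical (accuracy, brevity) scores for four prompt variants.
candidates = {
    "prompt_a": (0.9, 0.2),
    "prompt_b": (0.5, 0.5),
    "prompt_c": (0.4, 0.4),
    "prompt_d": (0.2, 0.9),
}
front = pareto_front(candidates)
```

Here `prompt_c` is dropped because `prompt_b` is better on both objectives, while the other three each represent a different trade-off.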

EvalHub Provider

Run Gaussia as an EvalHub BYOF provider:

python -m gaussia.integrations.evalhub.adapter

Documentation

Full documentation is available at docs.gaussia.ai.

Requirements

  • Python >= 3.11

License

MIT

