A Python package for evaluating LLM application outputs.

GroundedAI

Overview

The grounded-ai package, developed by GroundedAI, evaluates the performance of large language models (LLMs) and the applications built on them. It uses small language models and adapters to compute metrics that indicate the quality and reliability of LLM outputs.

Features

  • Metric Evaluation: Compute a wide range of metrics to assess the performance of LLM outputs, including:

    • Factual accuracy
    • Relevance to the given context
    • Potential biases or toxicity
    • Hallucination
  • Small Language Model Integration: Utilize state-of-the-art small language models, optimized for efficient evaluation tasks, to analyze LLM outputs accurately and quickly.

  • Adapter Support: Leverage GroundedAI's proprietary adapters, such as the phi3-toxicity-judge adapter, to fine-tune the small language models for specific domains, tasks, or evaluation criteria, ensuring tailored and precise assessments.

  • Flexible Input/Output Handling: Accept LLM outputs in various formats (text, JSON, etc.) and provide evaluation results in a structured and easily consumable manner.

  • Customizable Evaluation Pipelines: Define and configure evaluation pipelines to combine multiple metrics, weights, and thresholds based on your specific requirements.

  • Reporting and Visualization: Generate comprehensive reports and visualizations to communicate evaluation results effectively, facilitating decision-making and model improvement processes.
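As a rough illustration of what combining multiple metrics, weights, and thresholds can look like, here is a minimal sketch in plain Python. It does not use the grounded-ai API: the MetricResult class, the combine_metrics function, and the example scores are all hypothetical names and values chosen for illustration, and a weighted mean is only one possible way to aggregate metrics.

```python
from dataclasses import dataclass


@dataclass
class MetricResult:
    """One metric's score in [0, 1]. Hypothetical container, not a grounded-ai type."""
    name: str
    score: float
    weight: float = 1.0


def combine_metrics(results, threshold=0.5):
    """Return the weighted mean score and whether it meets the threshold."""
    total_weight = sum(r.weight for r in results)
    if total_weight <= 0:
        raise ValueError("at least one metric must have a positive weight")
    weighted_mean = sum(r.score * r.weight for r in results) / total_weight
    return {"score": weighted_mean, "passed": weighted_mean >= threshold}


# Illustrative scores only; real scores would come from the evaluators.
results = [
    MetricResult("factual_accuracy", 0.9, weight=2.0),
    MetricResult("relevance", 0.6, weight=1.0),
]
summary = combine_metrics(results, threshold=0.75)
print(summary)
```

Keeping the aggregation separate from the individual evaluators makes it easy to change weights or thresholds without re-running the models.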

Getting Started

Install the grounded-ai package:

pip install grounded-ai
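To confirm that the installation succeeded, one option is to query the installed distribution's version with the standard library. This is a small sketch; the helper name installed_version is our own and is not part of grounded-ai.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


# Prints a version string if grounded-ai is installed, otherwise None.
print(installed_version("grounded-ai"))
```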

Example Usage: Toxicity Evaluation

The ToxicityEvaluator class is used to evaluate the toxicity of a given text. Here's an example of how to use it:

from grounded_ai.evaluators.toxicity_evaluator import ToxicityEvaluator

base_model_id = "microsoft/Phi-3-mini-4k-instruct"
groundedai_eval_id = "grounded-ai/phi3-toxicity-judge"

# Initialize the ToxicityEvaluator
evaluator = ToxicityEvaluator(base_model_id, groundedai_eval_id)

# Load the base model and the GroundedAI adapter
evaluator.load_model(base_model_id, groundedai_eval_id)

# Define the data to be evaluated
data = [
    "That guy is so stupid and ugly",
    "Bunnies are so fluffy and cute"
]

# Evaluate the data for toxicity
response = evaluator.evaluate(data)

# Print the result
print(response)
# Output:
# {'toxic': 1, 'non-toxic': 1, 'percentage_toxic': 50.0}

In this example, we initialize the ToxicityEvaluator with the base model ID (microsoft/Phi-3-mini-4k-instruct) and the GroundedAI adapter ID (grounded-ai/phi3-toxicity-judge). The evaluator also has a quantization parameter; setting it to True enables quantization for faster inference with less memory. The snippet above does not pass it explicitly.

We then load the base model and the GroundedAI adapter using the load_model method.

Next, we define a list of texts (data) that we want to evaluate for toxicity.

Finally, we call the evaluate method with the data list, and it returns a dictionary containing the number of toxic and non-toxic texts, as well as the percentage of toxic texts.

In the output, we can see that out of the two texts, one is classified as toxic, and the other as non-toxic, resulting in a 50% toxicity percentage.
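The relationship between the counts and the percentage in that dictionary can be sketched in a few lines of plain Python. This is not the library's implementation: summarize_toxicity is a hypothetical helper, and it assumes each text has already been assigned exactly one label, either "toxic" or "non-toxic".

```python
def summarize_toxicity(labels):
    """Count 'toxic' / 'non-toxic' labels and compute the toxic percentage."""
    toxic = sum(1 for label in labels if label == "toxic")
    non_toxic = sum(1 for label in labels if label == "non-toxic")
    total = toxic + non_toxic
    # Guard against an empty input to avoid dividing by zero.
    percentage = (toxic / total) * 100 if total else 0.0
    return {"toxic": toxic, "non-toxic": non_toxic, "percentage_toxic": percentage}


summary = summarize_toxicity(["toxic", "non-toxic"])
print(summary)  # {'toxic': 1, 'non-toxic': 1, 'percentage_toxic': 50.0}
```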

Documentation

Detailed documentation, including API references, examples, and guides, is coming soon at https://groundedai.tech/api.

Contributing

We welcome contributions from the community! If you encounter any issues or have suggestions for improvements, please open an issue or submit a pull request on the GroundedAI grounded-eval GitHub repository.

License

The grounded-ai package is released under the MIT License.
