ragas

<h1 align="center">
  <img style="vertical-align:middle" height="200"
  src="./docs/assets/logo.png">
</h1>
<p align="center">
  <i>Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines</i>
</p>

<p align="center">
    <a href="https://github.com/explodinggradients/ragas/releases">
        <img alt="GitHub release" src="https://img.shields.io/github/release/explodinggradients/ragas.svg">
    </a>
    <a href="https://www.python.org/">
            <img alt="Build" src="https://img.shields.io/badge/Made%20with-Python-1f425f.svg?color=purple">
    </a>
    <a href="https://github.com/explodinggradients/ragas/blob/master/LICENSE">
        <img alt="License" src="https://img.shields.io/github/license/explodinggradients/ragas.svg?color=green">
    </a>
    <a href="https://colab.research.google.com/github/explodinggradients/ragas/blob/main/docs/quickstart.ipynb">
        <img alt="Open In Colab" src="https://colab.research.google.com/assets/colab-badge.svg">
    </a>
    <a href="https://discord.gg/5djav8GGNZ">
        <img alt="discord-invite" src="https://dcbadge.vercel.app/api/server/5djav8GGNZ?style=flat">
    </a>
    <a href="https://github.com/explodinggradients/ragas/">
        <img alt="Downloads" src="https://badges.frapsoft.com/os/v1/open-source.svg?v=103">
    </a>
</p>

<h4 align="center">
    <p>
        <a href="#shield-installation">Installation</a> |
        <a href="#fire-quickstart">Quickstart</a> |
        <a href="#luggage-metrics">Metrics</a> |
        <a href="#-community">Community</a> |
        <a href="#-open-analytics">Open Analytics</a> |
        <a href="#raising_hand_man-faq">FAQ</a> |
        <a href="https://huggingface.co/explodinggradients">Hugging Face</a>
    </p>
</h4>

> 🚀 Dedicated solutions and support to improve the reliability of RAG systems in production including custom models for production quality monitoring. Contact founders to learn more. [Talk to founders](https://calendly.com/shahules/30min)

ragas is a framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. RAG denotes a class of LLM applications that use external data to augment the LLM’s context. There are existing tools and frameworks that help you build these pipelines, but evaluating them and quantifying their performance can be hard. This is where ragas (RAG Assessment) comes in.

ragas provides tools, based on the latest research on evaluating LLM-generated text, that give you insights into your RAG pipeline. ragas can be integrated with your CI/CD to provide continuous checks that ensure performance.

## :shield: Installation

```bash
pip install ragas
```

If you want to install from source:

```bash
git clone https://github.com/explodinggradients/ragas && cd ragas
pip install -e .
```

## :fire: Quickstart

This is a small example program you can run to see ragas in action!

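```python
from ragas import evaluate
from datasets import Dataset
import os

# the default metrics call the OpenAI API, so a key is required
os.environ["OPENAI_API_KEY"] = "your-openai-key"

# prepare your Hugging Face dataset in the format
# Dataset({
#     features: ['question', 'contexts', 'answer'],
#     num_rows: 25
# })
# the single row below is placeholder data; replace it with your pipeline's outputs
dataset = Dataset.from_dict({
    "question": ["When was the first Super Bowl played?"],
    "contexts": [["The first Super Bowl was played on January 15, 1967."]],
    "answer": ["The first Super Bowl was played on January 15, 1967."],
})

results = evaluate(dataset)
# example output for a 25-row dataset:
# {'ragas_score': 0.860, 'context_precision': 0.817,
# 'faithfulness': 0.892, 'answer_relevancy': 0.874}
```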

If you want a more in-depth explanation of the core components, check out our [quick-start notebook](./docs/quickstart.ipynb).

## :luggage: Metrics

Ragas measures your pipeline's performance along several dimensions:

![image](https://github.com/emilesilvis/ragas/assets/557338/b6c0db98-a0a9-4414-9ad3-372d8ceab4c7)

1. **Faithfulness**: measures the information consistency of the generated answer against the given context. Any claim in the answer that cannot be deduced from the context is penalized. It is calculated from `answer` and `retrieved context`.

2. **Context Precision**: measures how relevant retrieved contexts are to the question. Ideally, the context should only contain information necessary to answer the question. The presence of redundant information in the context is penalized. It is calculated from `question` and `retrieved context`.

3. **Context Recall**: measures the recall of the retrieved context, using the annotated answer as ground truth. The annotated answer is taken as a proxy for the ground truth context. It is calculated from `ground truth` and `retrieved context`.

4. **Answer Relevancy**: refers to the degree to which a response directly addresses and is appropriate for a given question or context. This does not take the factuality of the answer into consideration but rather penalizes the presence of redundant information or incomplete answers given a question. It is calculated from `question` and `answer`.

5. **Aspect Critiques**: designed to judge the submission against defined aspects like harmlessness, correctness, etc. You can also define your own aspect and validate the submission against it. The output of aspect critiques is always binary. It is calculated from `answer`.

The final `ragas_score` is the harmonic mean of individual metric scores.

To read more about our metrics, check out the [docs](/docs/metrics.md).
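As a rough illustration of how the inputs listed above map onto a dataset, the sketch below checks which metrics could be computed from a given set of columns. It does not call ragas. The `contexts` column name comes from the Quickstart, while `ground_truths`, the `context_recall` and `aspect_critique` keys, and the `computable_metrics` helper are assumptions made for this example.

```python
# Inputs each metric is calculated from, as listed in the Metrics section.
# "ground_truths" is an assumed column name for the annotated answers.
REQUIRED_COLUMNS = {
    "faithfulness": {"answer", "contexts"},
    "context_precision": {"question", "contexts"},
    "context_recall": {"ground_truths", "contexts"},
    "answer_relevancy": {"question", "answer"},
    "aspect_critique": {"answer"},
}


def computable_metrics(columns):
    """Return the metrics whose required inputs are all present in `columns`."""
    available = set(columns)
    return sorted(
        name for name, needed in REQUIRED_COLUMNS.items() if needed <= available
    )


# The Quickstart dataset has no ground truth column, so context recall is left out.
print(computable_metrics(["question", "contexts", "answer"]))
# ['answer_relevancy', 'aspect_critique', 'context_precision', 'faithfulness']
```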

## 🫂 Community

If you want to get more involved with Ragas, check out our [discord server](https://discord.gg/5djav8GGNZ). It's a fun community where we geek out about LLM, Retrieval, Production issues, and more.

## 🔍 Open Analytics

We track very basic usage metrics to help us figure out what our users want, what is working, and what's not. As a young startup, we have to be brutally honest about this, which is why we are tracking these metrics. But as an Open Startup, we open-source all the data we collect. You can read more about this [here](https://github.com/explodinggradients/ragas/issues/49). **Ragas does not track any information that can be used to identify you or your company**. You can take a look at exactly what we track in the [code](./src/ragas/_analytics.py).

To disable usage tracking, set the `RAGAS_DO_NOT_TRACK` flag to true.
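A minimal sketch of setting the flag from Python, assuming it is read as an environment variable with the string value `true` (the text above only calls it a flag; check `_analytics.py` for the exact handling):

```python
import os

# Assumption: ragas reads this as an environment variable, so it is set
# before ragas is imported or any evaluation is run.
os.environ["RAGAS_DO_NOT_TRACK"] = "true"
```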

## :raising_hand_man: FAQ

1. Why harmonic mean?

The harmonic mean penalizes extreme values. For example, if your generated answer is fully factually consistent with the context (faithfulness = 1) but is not relevant to the question (relevancy = 0), a simple average would give you a score of 0.5, but the harmonic mean will give you 0.0.
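The arithmetic can be checked with a few lines of plain Python. This is an illustrative sketch, not the ragas implementation; the two helper functions are written for this example, and the harmonic mean is treated as 0.0 whenever any score is 0.

```python
def arithmetic_mean(scores):
    """Simple average of the scores."""
    return sum(scores) / len(scores)


def harmonic_mean(scores):
    """Harmonic mean of the scores, treated as 0.0 if any score is 0."""
    if any(s == 0 for s in scores):
        return 0.0
    return len(scores) / sum(1.0 / s for s in scores)


# faithfulness = 1, relevancy = 0
scores = [1.0, 0.0]
print(arithmetic_mean(scores))  # 0.5
print(harmonic_mean(scores))    # 0.0

# The three metric scores from the Quickstart output
quickstart = [0.817, 0.892, 0.874]
print(round(harmonic_mean(quickstart), 3))  # 0.86
```

The last line matches the `ragas_score` of 0.860 shown in the Quickstart output.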

2. How to use Ragas to improve your pipeline?

_"Measurement is the first step that leads to control and eventually to improvement" - James Harrington_

Here we assume that you already have your RAG pipeline ready. RAG pipelines have two main parts: the retriever and the generator. A change in either of these should also affect your pipeline's quality.

1. First, decide on one parameter that you're interested in adjusting, for example the number of retrieved documents, K.
2. Collect a set of sample prompts (min 20) to form your test set.
3. Run your pipeline on the test set before and after the change. Each time, record the prompts with the context and the generated output.
4. Run ragas evaluation for each of them to generate evaluation scores.
5. Compare the scores and you will know how much the change has affected your pipeline's performance.
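The comparison in the last step can be sketched as below. The scores are hypothetical placeholders standing in for two evaluation runs, and `compare_runs` is a helper written for this example rather than part of ragas.

```python
def compare_runs(before, after):
    """Return the per-metric change (after - before) for metrics present in both runs."""
    return {m: round(after[m] - before[m], 3) for m in before if m in after}


# Hypothetical scores, e.g. before and after changing the number of retrieved documents K
before = {"faithfulness": 0.80, "answer_relevancy": 0.85, "context_precision": 0.70}
after = {"faithfulness": 0.84, "answer_relevancy": 0.83, "context_precision": 0.78}

deltas = compare_runs(before, after)
for metric, delta in deltas.items():
    print(f"{metric}: {delta:+.3f}")
```

Keeping the test set fixed between the two runs is what makes the per-metric differences attributable to the parameter you changed.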

Project details

Release history

0.4.3

Jan 13, 2026

0.4.2

Dec 23, 2025

0.4.1

Dec 10, 2025

0.4.0

Dec 3, 2025

0.3.9

Nov 11, 2025

0.3.8

Oct 28, 2025

0.3.7

Oct 14, 2025

0.3.6

Oct 3, 2025

0.3.5

Sep 17, 2025

0.3.5rc2 pre-release

Sep 17, 2025

0.3.5rc1 pre-release

Sep 17, 2025

0.3.4

Sep 10, 2025

0.3.3

Sep 4, 2025

0.3.3rc1 pre-release

Sep 4, 2025

0.3.2

Aug 19, 2025

0.3.2rc3 pre-release

Aug 19, 2025

0.3.2rc2 pre-release

Aug 19, 2025

0.3.2rc1 pre-release

Aug 19, 2025

0.3.1

Aug 11, 2025

0.3.0

Jul 17, 2025

0.3.0rc2 pre-release

Jul 17, 2025

0.2.15

Apr 24, 2025

0.2.14

Mar 4, 2025

0.2.13

Feb 4, 2025

0.2.12

Jan 21, 2025

0.2.11

Jan 14, 2025

0.2.10

Jan 8, 2025

0.2.9

Dec 24, 2024

0.2.8

Dec 10, 2024

0.2.7

Dec 6, 2024

0.2.6

Nov 19, 2024

0.2.5

Nov 12, 2024

0.2.4

Nov 7, 2024

0.2.3

Oct 29, 2024

0.2.2

Oct 22, 2024

0.2.1

Oct 16, 2024

0.2.0

Oct 14, 2024

0.2.0b0 pre-release

Oct 3, 2024

0.1.22

Oct 19, 2024

0.1.21

Oct 3, 2024

0.1.20

Sep 18, 2024

0.1.19

Sep 18, 2024

0.1.18

Sep 11, 2024

0.1.17

Sep 10, 2024

0.1.16

Sep 3, 2024

0.1.15

Aug 27, 2024

0.1.14

Aug 14, 2024

0.1.13

Aug 5, 2024

0.1.12

Jul 30, 2024

0.1.11

Jul 22, 2024

0.1.10

Jul 3, 2024

0.1.9

May 30, 2024

0.1.8

May 21, 2024

0.1.7

Apr 8, 2024

0.1.6

Apr 2, 2024

0.1.5

Mar 20, 2024

0.1.4

Mar 13, 2024

0.1.3

Feb 28, 2024

0.1.2

Feb 23, 2024

0.1.1

Feb 15, 2024

0.1.0

Feb 7, 2024

0.1.0rc1 pre-release

Jan 25, 2024

0.0.22

Dec 13, 2023

0.0.21

Nov 21, 2023

0.0.20

Nov 15, 2023

0.0.19

Oct 31, 2023

0.0.18

Oct 24, 2023

0.0.17

Oct 16, 2023

This version

0.0.16

Sep 28, 2023

0.0.15

Sep 25, 2023

0.0.14

Sep 15, 2023

0.0.13

Sep 15, 2023

0.0.12

Sep 6, 2023

0.0.11

Aug 24, 2023

0.0.10

Aug 2, 2023

0.0.9

Jul 27, 2023

0.0.8 yanked

Jul 27, 2023

Reason this release was yanked:

bug in the code

0.0.7

Jul 20, 2023

0.0.6

Jul 15, 2023

0.0.5

Jul 10, 2023

0.0.4

Jul 10, 2023

0.0.3

Jun 9, 2023

0.0.3rc1 pre-release

Jun 9, 2023

0.0.2

May 23, 2023

0.0.1

May 14, 2023

0.0.1a7 pre-release

May 14, 2023

0.0.1a6 pre-release

May 14, 2023

0.0.1a5 pre-release

May 14, 2023

Download files

Source Distribution

ragas-0.0.16.tar.gz (1.4 MB)

Uploaded Sep 28, 2023 Source

Built Distribution

ragas-0.0.16-py3-none-any.whl (38.5 kB)

Uploaded Sep 28, 2023 Python 3

File details

Details for the file ragas-0.0.16.tar.gz.

File metadata

Download URL: ragas-0.0.16.tar.gz
Upload date: Sep 28, 2023
Size: 1.4 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for ragas-0.0.16.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `b44f3f025e8e76e7628012f2132274b8765b0014e8af25672731c66e8b95dbba` |
| MD5 | `ad7af4cf4e76c51e295a83420d3c11d4` |
| BLAKE2b-256 | `44a88f071d148bc3f35fff8611378d6008e2c58e696d6d73ed89881027d4a772` |

File details

Details for the file ragas-0.0.16-py3-none-any.whl.

File metadata

Download URL: ragas-0.0.16-py3-none-any.whl
Upload date: Sep 28, 2023
Size: 38.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for ragas-0.0.16-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `ece80ac8f9b87842b84e272f13942715d86c35417625b20a45e7c1acfb99844a` |
| MD5 | `75696fa9377ac6280e8822f1f95bced6` |
| BLAKE2b-256 | `24993d07f2377b9776363b7a414665427061ec6db87449743c213fe1d32253ca` |
