Python package for evaluation of retrieval-augmented generation (RAG) models

Indomee

Indomee is a Python package designed to simplify the evaluation of retrieval-augmented generation (RAG) models and other retrieval-based systems. With indomee, you can compute common evaluation metrics like recall and mean reciprocal rank (MRR) at various levels of k, all through a straightforward API.

Indomee also supports simple bootstrapping and t-tests for comparing results across methods.

Installation

pip install indomee

You can get started with indomee using the examples below.

1. Metrics

Indomee provides functions to calculate metrics such as mean reciprocal rank (MRR) and recall.

Example Usage

from indomee import calculate_mrr, calculate_recall, calculate_metrics_at_k

# Calculate MRR
mrr = calculate_mrr([1, 2, 3], [2, 3, 4])
print("MRR:", mrr)
# > MRR: 0.5

# Calculate Recall
recall = calculate_recall([1, 2, 3], [2])
print("Recall:", recall)
# > Recall: 1

# Calculate metrics at specific k values
metrics = calculate_metrics_at_k(
    metrics=["recall"], preds=[1, 2, 3], labels=[2], k=[1, 2, 3]
)
print("Metrics at k:", metrics)
# > {'recall@1': 0.0, 'recall@2': 1.0, 'recall@3': 1.0}
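
To make the numbers above concrete, here is a minimal sketch of how MRR and recall at k are commonly defined. It is not indomee's implementation: the function names `mrr_sketch` and `recall_at_k_sketch` are hypothetical, and the library may handle edge cases (empty labels, duplicates) differently.

```python
def mrr_sketch(preds, labels):
    """Reciprocal rank of the first prediction found in labels (0.0 if none)."""
    relevant = set(labels)
    for rank, pred in enumerate(preds, start=1):
        if pred in relevant:
            return 1.0 / rank
    return 0.0


def recall_at_k_sketch(preds, labels, k):
    """Fraction of the labels that appear in the top-k predictions."""
    relevant = set(labels)
    hits = relevant.intersection(preds[:k])
    return len(hits) / len(relevant)


# The first relevant item (2) sits at rank 2, so the reciprocal rank is 1/2.
print(mrr_sketch([1, 2, 3], [2, 3, 4]))
# > 0.5

# The single label (2) is missing from the top 1 but present from k=2 onward.
print({f"recall@{k}": recall_at_k_sketch([1, 2, 3], [2], k) for k in (1, 2, 3)})
# > {'recall@1': 0.0, 'recall@2': 1.0, 'recall@3': 1.0}
```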

2. Bootstrapping

Indomee also supports bootstrapping for more robust metric evaluation.

Example Usage

from indomee import bootstrap_sample, bootstrap

# Bootstrapping a sample
result = bootstrap_sample(
    preds=[["a", "b"], ["c", "d"], ["e", "f"]],
    labels=[["a", "b"], ["c", "d"], ["e", "f"]],
    n_samples=10,
    metrics=["recall"],
    k=[1, 2, 3],
)
print("Bootstrap Sample Metrics:", result.sample_metrics)

# Bootstrapping multiple samples
result = bootstrap(
    preds=[["a", "b"], ["c", "d"], ["e", "f"]],
    labels=[["a", "b"], ["c", "d"], ["e", "f"]],
    n_samples=10,
    n_iterations=10,
    metrics=["recall"],
    k=[1, 2, 3],
)
print("Bootstrap Metrics:", result.sample_metrics)
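
For readers unfamiliar with the technique, the sketch below shows the general idea behind bootstrapping a retrieval metric: draw queries with replacement, compute the metric on each resampled set, and look at the spread of the results. It uses only the standard library. `bootstrap_mean_recall` and its `seed` parameter are hypothetical and are not part of indomee's API, which may resample or aggregate differently.

```python
import random
from statistics import mean


def recall_at_k(preds, labels, k):
    """Fraction of the labels that appear in the top-k predictions."""
    top_k = set(preds[:k])
    return sum(1 for label in labels if label in top_k) / len(labels)


def bootstrap_mean_recall(preds, labels, k, n_samples, n_iterations, seed=0):
    """Hypothetical helper: mean recall@k of each resampled set of queries."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    indices = range(len(preds))
    means = []
    for _ in range(n_iterations):
        # Draw n_samples query indices with replacement.
        sample = rng.choices(indices, k=n_samples)
        means.append(mean(recall_at_k(preds[i], labels[i], k) for i in sample))
    return means


preds = [["a", "b"], ["c", "d"], ["e", "f"]]
labels = [["a", "b"], ["c", "d"], ["e", "f"]]

# Every query retrieves both of its labels within the top 2,
# so each resampled mean is 1.0 regardless of which queries are drawn.
print(bootstrap_mean_recall(preds, labels, k=2, n_samples=10, n_iterations=5))
# > [1.0, 1.0, 1.0, 1.0, 1.0]
```

On real data the resampled means vary, and their spread gives a rough sense of how stable the metric is.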

3. T-Testing

Finally, here is how to run t-tests comparing the results obtained from different methods.

from indomee import perform_t_tests
import pandas as pd

df = pd.read_csv("./data.csv")

# Extract the scores for each method as lists
method_1 = df["method_1"].tolist()
method_2 = df["method_2"].tolist()
baseline = df["baseline"].tolist()

results = perform_t_tests(
    baseline, method_1, method_2,
    names=["Baseline", "Method 1", "Method 2"],
    paired=True,
)
results
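
As background for the `paired=True` option, a paired t-test works on the per-query differences between two methods rather than on the two score lists independently. The sketch below computes only the t statistic, using the standard library. `paired_t_statistic` is a hypothetical helper and the scores are made-up illustration values; this is not how `perform_t_tests` is implemented.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """t statistic for paired samples: mean difference over its standard error."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 items")
    diffs = [y - x for x, y in zip(a, b)]
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error


# Made-up per-query scores, purely for illustration.
baseline = [0.5, 0.6, 0.55, 0.7]
method_1 = [0.6, 0.65, 0.6, 0.8]

t = paired_t_statistic(baseline, method_1)
print(round(t, 3))
# > 5.196
```

The statistic has `len(a) - 1` degrees of freedom. Turning it into a p-value needs the t distribution, for example via `scipy.stats.ttest_rel`.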
