A library for generating datasets and evaluating them on RAG-based solutions
RAGFORMance is a library for generating benchmarks for Retrieval-Augmented Generation (RAG) systems.
RAGFORMance wraps multiple question/answer dataset generators, such as RAGAS, DeepEval, or YourBench, and also provides generator types relevant for testing industrial use cases. Some generators use LLMs, while others rely on custom logic.
RAGFORMance also wraps connectors to well-known RAG systems under test, such as OpenWebUI, Haystack, RAGFlow, or custom developments built on LangChain or LlamaIndex.
Finally, RAGFORMance offers different metrics by wrapping state-of-the-art libraries such as trec_eval and the LLM-based metrics from RAGAS or DeepEval, and provides custom metrics and visualizations relevant for different types of RAG systems.
Installation
Install the library using pip: pip install ragformance, or pip install ragformance[all] to install all the generators, RAG wrappers, and metrics (including the RAGAS and DeepEval wrappers).
Usage
Usage as a library
The library contains 4 types of functions:
- Generators that take documents as input and produce different types of evaluation datasets
- Data loaders that convert well-known dataset formats to and from the RAGFORMance format
- RAG wrappers that automatically run the evaluations on a given RAG chain
- Metrics that evaluate both the retrieval capabilities and the final answer
Complete examples can be found in the documentation; here is a code snippet that should run after installation.
from ragformance.dataloaders import load_beir_dataset
from ragformance.rag.naive_rag import NaiveRag
from ragformance.rag.config import NaiveRagConfig

# Load the BEIR corpus and queries
corpus, queries = load_beir_dataset(filter_corpus=True)

# Configure and run a simple RAG pipeline
config = NaiveRagConfig(EMBEDDING_MODEL="all-MiniLM-L6-v2")
naive_rag = NaiveRag(config)
doc_uploaded_num = naive_rag.upload_corpus(corpus)
answers = naive_rag.ask_queries(queries)

from ragformance.eval import trec_eval_metrics
from ragformance.eval import visualize_semantic_F1, display_semantic_quadrants

# Compute retrieval metrics and visualize the semantic F1 quadrants
metrics = trec_eval_metrics(answers)
quadrants = visualize_semantic_F1(corpus, answers, embedding_config={"model": "all-MiniLM-L6-v2"})
display_semantic_quadrants(quadrants)
Usage as a CLI or python pipeline
The second way to use RAGFORMance is as a standalone program, through the command-line interface (CLI) with a configuration file. After installation with pip, or using the pre-compiled binaries available on GitHub, you can run the following command:
ragformance --config your_config.json
This corresponds to the following python code :
from ragformance.cli.run import run_pipeline
corpus, queries, answers, metrics_data, display_widget = run_pipeline("config.json")
Configuring the pipeline
Data generation in particular is controlled via the generation section of your_config.json. Here is an example that reproduces the same execution as above: loading the BEIR dataset, testing the naive RAG, and generating metrics and visualizations.
Example config.json snippet for data generation:
{
"generation": {
"type": "alpha",
"source": {
"path": "path/to/your/input_data"
},
"output": {
"path": "path/to/your/output_folder"
},
"params": {}
},
"dataset": {
"source_type": "beir",
"path": "scifact",
"filter_corpus": true
},
"data_path": "data",
"rag": {
"rag_type": "naive",
"params": {
"EMBEDDING_MODEL": "all-MiniLM-L6-v2"
}
},
"steps": {
"generation": false,
"upload_hf": false,
"load_dataset": true,
"evaluation": true,
"metrics": true,
"visualization": true
}
}
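As a minimal sketch (standard library only, not part of the ragformance API), the steps section of such a config can be inspected before a run to see which pipeline stages will execute:

```python
import json

# Example config mirroring the snippet above (truncated to the relevant keys).
config_text = """
{
  "dataset": {"source_type": "beir", "path": "scifact", "filter_corpus": true},
  "rag": {"rag_type": "naive", "params": {"EMBEDDING_MODEL": "all-MiniLM-L6-v2"}},
  "steps": {"generation": false, "load_dataset": true, "evaluation": true,
            "metrics": true, "visualization": true}
}
"""

config = json.loads(config_text)
# Collect the pipeline steps that are switched on.
enabled_steps = [name for name, on in config["steps"].items() if on]
```

With this config, generation is skipped and the four remaining stages run.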
Generate Data
Each generator has specific configuration parameters. Generators usually take a folder as input and produce an output folder containing the jsonl dataset. Here is an example with a generator that does not require an LLM backend:
from forcolate import convert_URLS_to_markdown
query = "Download and convert https://fr.wikipedia.org/wiki/Grand_mod%C3%A8le_de_langage and https://fr.wikipedia.org/wiki/Ascenseur_spatial"
convert_URLS_to_markdown(query, "", "data/wikipedia")
from ragformance.generators.structural_generator import StructuralGenerator, StructuralGeneratorConfig
config = StructuralGeneratorConfig(
    data_path="data/wikipedia",
    output_path="data/wikipedia_questions")
corpus, queries = StructuralGenerator().run(config)
This can also be run as a full pipeline (CLI or library) with the following config file and python commands:
{
"generation": {
"type": "structural_generator",
"source": {
"path": "data/wikipedia"
},
"output": {
"path": "data/wikipedia_questions"
},
"params": {}
},
"data_path": "data",
"steps": {
"generation": true
}
}
from forcolate import convert_URLS_to_markdown
query = "Download and convert https://fr.wikipedia.org/wiki/Grand_mod%C3%A8le_de_langage and https://fr.wikipedia.org/wiki/Ascenseur_spatial"
convert_URLS_to_markdown(query, "", "data/wikipedia")
from ragformance.cli.run import run_pipeline
corpus, queries, answers, metrics_data, display_widget = run_pipeline("config.json")
For detailed information on available generators, their specific parameters, and advanced configuration, please refer to the Generators Documentation.
Dataset Structure
The dataset consists of two files:
corpus.jsonl: A jsonl file containing the corpus of documents. Each document is represented as a json object with the following fields:
- _id: The id of the document.
- title: The title of the document.
- text: The text of the document.
queries.jsonl: A jsonl file containing the queries. Each query is represented as a json object with the following fields:
- _id: The id of the query.
- query_text: The text of the query.
- relevant_document_ids: A list of references to documents in the corpus. Each reference is a json object with the following fields:
  - corpus_id: The id of the document.
  - score: The score of the reference.
- ref_answer: The reference answer for the query.
- metadata: A dictionary containing the metadata for the query.
This structure is inspired by the popular BEIR format, with the inclusion of the qrels file inside the queries: BEIR is optimized for Information Retrieval tasks, whereas this library also aims to evaluate other tasks (such as end-to-end generation).
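To make the schema concrete, here is a hypothetical pair of entries (ids, titles, and texts are invented for illustration) and how each would be serialized as one line of its jsonl file:

```python
import json

# A made-up corpus document following the corpus.jsonl schema.
corpus_entry = {
    "_id": "doc1",
    "title": "Space elevator",
    "text": "A space elevator is a proposed planet-to-space transportation system.",
}

# A made-up query following the queries.jsonl schema, pointing at doc1.
query_entry = {
    "_id": "q1",
    "query_text": "What is a space elevator?",
    "relevant_document_ids": [{"corpus_id": "doc1", "score": 1}],
    "ref_answer": "A proposed planet-to-space transportation system.",
    "metadata": {},
}

# Each jsonl file stores one such object per line.
corpus_line = json.dumps(corpus_entry)
query_line = json.dumps(query_entry)
```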
Answer Output Format
The answers generated by the system are structured as a json lines, with each line corresponding to a processed question. Each entry contains:
query: A dictionary describing the original question, with:_id: Unique identifier for the question.query_text: The question text.relevant_document_ids: A list of corpus documents considered as references for this question, each reference containing:corpus_id: The document identifier.score: The importance or relevance score.
ref_answer: The reference (gold standard) answer for the question.
model_answer: The generated answerrelevant_documents_ids: A list of corpus document IDs used as context for generating the answer.retrieved_documents_distances: A list of relevancy scores for the retrieved documents.
It is based on the following pydantic model:
from typing import Dict, List, Optional
from pydantic import BaseModel, Field
class RelevantDocumentModel(BaseModel):
corpus_id: str
score: int
class AnnotatedQueryModel(BaseModel):
id: str = Field(alias="_id")
query_text: str
relevant_document_ids: List[RelevantDocumentModel]
ref_answer: str
metadata: Optional[Dict] = None
class AnswerModel(BaseModel):
id: str = Field(alias="_id")
query: AnnotatedQueryModel
# model output
model_answer: str
retrieved_documents_ids: List[str]
retrieved_documents_distances: Optional[List[float]] = None
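As an illustration of this format, a single answers.jsonl line can be decoded with the standard library alone (all ids, texts, and distances below are invented):

```python
import json

# One hypothetical line of an answers.jsonl file, matching the model above.
answer_line = json.dumps({
    "_id": "a1",
    "query": {
        "_id": "q1",
        "query_text": "What is a space elevator?",
        "relevant_document_ids": [{"corpus_id": "doc1", "score": 1}],
        "ref_answer": "A proposed planet-to-space transportation system.",
    },
    "model_answer": "It is a proposed structure reaching from the ground into space.",
    "retrieved_documents_ids": ["doc1", "doc7"],
    "retrieved_documents_distances": [0.12, 0.57],
})

answer = json.loads(answer_line)
# Retrieved ids and distances are aligned pairwise.
pairs = list(zip(answer["retrieved_documents_ids"],
                 answer["retrieved_documents_distances"]))
```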
Loading a dataset from jsonl
import json
from typing import List
from pydantic import TypeAdapter
from ragformance.models.corpus import DocModel
from ragformance.rag.naive_rag import NaiveRag

ta = TypeAdapter(List[DocModel])

# load the corpus from the jsonl file
with open("output/corpus.jsonl", "r") as f:
    corpus = ta.validate_python([json.loads(line) for line in f])

naive_rag = NaiveRag()
naive_rag.upload_corpus(corpus=corpus)
Additional features
To keep the core library lightweight while still allowing multiple integrations, all the features below are packaged as optional extras. You can install each one with its specific command, or install everything with:
pip install ragformance[all]
Loading a dataset from Hugging face
You can directly use datasets hosted on Hugging Face that follow the correct format. First install the optional dependencies with:
pip install ragformance[huggingface]
from typing import List
from ragformance.models.corpus import DocModel
from ragformance.models.answer import AnnotatedQueryModel
from ragformance.rag.naive_rag import NaiveRag
from pydantic import TypeAdapter
from datasets import load_dataset
ta = TypeAdapter(List[DocModel])
taq = TypeAdapter(List[AnnotatedQueryModel])
corpus = ta.validate_python(load_dataset("FOR-sight-ai/ragformance_toloxa", "corpus", split="train"))
queries = taq.validate_python(load_dataset("FOR-sight-ai/ragformance_toloxa", "queries", split="train"))
naive_rag = NaiveRag()
doc_uploaded_num = naive_rag.upload_corpus(corpus=corpus)
answers = naive_rag.ask_queries(queries)
Pushing dataset to Hugging Face Hub
This function pushes the two jsonl files to a Hugging Face Hub dataset repository; you must set the HF_TOKEN environment variable, either in the system environment or in config.json.
from ragformance.dataloaders import push_to_hub
HFpath = "YOUR_NAME/YOUR_PATH"
push_to_hub(HFpath, "output")
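One way to provide the token, sketched here with a placeholder value (not a real token), is to set it in the process environment before calling push_to_hub:

```python
import os

# Set HF_TOKEN only if it is not already defined in the environment.
os.environ.setdefault("HF_TOKEN", "hf_placeholder_token")
token_is_set = "HF_TOKEN" in os.environ
```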
Trec-Eval Metrics and visualization
This library wraps the trec_eval tools for Information Retrieval metrics. Make sure to install the optional dependency with:
pip install ragformance[trec-eval]
It also provides a set of metric visualizations to help assess whether the test dataset is well balanced and whether a solution under test has the expected performance.
from ragformance.eval import trec_eval_metrics
from ragformance.eval import visualize_semantic_F1, display_semantic_quadrants
metrics = trec_eval_metrics(answers)
quadrants = visualize_semantic_F1(corpus, answers)
display_semantic_quadrants(quadrants)
Using DeepEval metrics
You can also use DeepEval metrics for LLM-based evaluation. Make sure to install the optional dependency with:
pip install ragformance[deepeval]
Example usage:
from ragformance.eval import compute_deepeval_metrics
additional_metrics = {
"FaithfulnessMetric": True
}
metric = compute_deepeval_metrics(
corpus,
answers,
llm_api_key="your API key",
additional_metrics=additional_metrics
)
print(metric)
Tracing
To enable tracing with Arize Phoenix, you need to install the optional Phoenix dependencies:
pip install ragformance[phoenix]
Then add the following section to your config.json file (all parameters are optional except enable; defaults are shown):
{
"phoenix": {
"enable": true,
"endpoint": "http://localhost:6006", // Phoenix server address (optional)
"project_name": "ragformance" // Project name for tracing (optional)
},
...
}
When enabled, RAGformance will automatically instrument the generation pipelines and send traces to the Phoenix server specified in endpoint (default: http://localhost:6006). You can start the Phoenix UI with:
phoenix serve
Then open http://localhost:6006/ (or your custom endpoint) in your browser to view traces and metrics.
For more details, see the Phoenix documentation.
Acknowledgement
This project received funding from the French "IA Cluster" program within the Artificial and Natural Intelligence Toulouse Institute (ANITI) and from the "France 2030" program within IRT Saint Exupery. The authors gratefully acknowledge the support of the FOR projects.
License
This project is licensed under the MIT License. See the LICENSE file for details.