

Project description

TruthScore

`truthscore` is a fast, modular reimplementation of RAGAS's FactualCorrectness metric, supporting both open-weight and hosted LLMs. It evaluates factual consistency between a user response and a reference passage by breaking down answers into claims and verifying them using Natural Language Inference (NLI).

It is a component of the trutheval framework and is intended for scalable, cost-efficient factuality evaluation.


🔍 What it does

  1. Claim Decomposition: The LLM-generated response is split into atomic factual claims using a lightweight LLM.
  2. Entailment Scoring: Each claim is passed to an NLI model with the reference passage as context.
  3. Final Score: The score reflects how many claims are entailed by the context, in the range [0.0, 1.0].

For more details, see FactualCorrectness.
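The three steps above reduce to a simple aggregation once each claim has an entailment verdict. A minimal sketch of that final-score step (the function name and signature are illustrative, not the truthscore API):

```python
# Illustrative sketch, not the truthscore API: the final score is the
# fraction of atomic claims that the NLI model judges entailed by the
# reference passage.
def factual_correctness(verdicts: list[bool]) -> float:
    """Fraction of claims entailed by the reference context, in [0.0, 1.0]."""
    if not verdicts:
        return 0.0
    return sum(verdicts) / len(verdicts)

# Three claims, two entailed -> 2/3
print(round(factual_correctness([True, True, False]), 2))  # 0.67
```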


✨ Key Features

  • 🔁 RAGAS-compatible: Faithfully reimplements the FactualCorrectness metric logic from RAGAS
  • 🔓 Open-weight LLM support: Works with open-weight models (e.g., Gemma, LLaMA, Mistral via Ollama)
  • 🧠 Plug-and-play: Swap in custom NLI models
  • ⚙️ GPU-accelerated: Recommended for claim decomposition + NLI
  • 🧪 Evaluated: Competitive benchmark results (see trutheval)

📦 Installation

For full open-weight support (an LLM served locally with Ollama + a CrossEncoder NLI model):

pip install truthscore[open]

Otherwise, install the lightweight base package and add the optional dependencies that best fit your setup:

pip install truthscore

For Ollama installation instructions, see Ollama.

🚀 Quick Start

💡 Open-weight (fully local)

from langchain_ollama import OllamaLLM
from ragas import SingleTurnSample
from ragas.llms import LangchainLLMWrapper

from truthscore import OpenFactualCorrectness

test_data = {
    "user_input": "What happened in Q3 2024?",
    "reference": "The company saw an 8% rise in Q3 2024, driven by strong marketing and product efforts.",
    "response": "The company experienced an 8% increase in Q3 2024 due to effective marketing strategies and product efforts."
}
sample = SingleTurnSample(**test_data)

evaluator_llm = LangchainLLMWrapper(OllamaLLM(model="gemma3:27b", base_url="http://localhost:11434"))
metric = OpenFactualCorrectness(llm=evaluator_llm)
score = metric.single_turn_score(sample)

print(score)  # e.g. 1.0

☁️ Hosted LLM (e.g., OpenAI)

from langchain_openai import ChatOpenAI
from ragas import SingleTurnSample
from ragas.llms import LangchainLLMWrapper

from truthscore import OpenFactualCorrectness

evaluator_llm = LangchainLLMWrapper(ChatOpenAI())
metric = OpenFactualCorrectness(llm=evaluator_llm)

# test_data same as above
score = metric.single_turn_score(SingleTurnSample(**test_data))

⚙️ Custom NLI Models

import torch
from langchain_ollama import OllamaLLM
from ragas import SingleTurnSample
from ragas.llms import LangchainLLMWrapper
from sentence_transformers import CrossEncoder

from truthscore import OpenFactualCorrectness

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
nli_model = CrossEncoder("cross-encoder/nli-deberta-v3-large")
nli_model.model.to(device)

evaluator_llm = LangchainLLMWrapper(OllamaLLM(model="gemma3:27b", base_url="http://localhost:11434"))
metric = OpenFactualCorrectness(llm=evaluator_llm, nli_model=nli_model)

# test_data same as above
score = metric.single_turn_score(SingleTurnSample(**test_data))
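Under the hood, an NLI cross-encoder scores each (reference, claim) pair with logits over three labels, and a claim counts as entailed when "entailment" wins. A minimal sketch of that decision step with made-up logits (the label order shown is the one commonly used by the `nli-deberta-v3` family; verify it against your model's config before relying on it):

```python
# Assumed label order for the nli-deberta-v3 family of CrossEncoders;
# check your model's id2label config before reusing this.
LABELS = ("contradiction", "entailment", "neutral")

def is_entailed(logits: list[float]) -> bool:
    """A claim is entailed when 'entailment' has the highest logit."""
    return LABELS[max(range(len(logits)), key=logits.__getitem__)] == "entailment"

# Made-up logits for two claims against the same reference passage
print(is_entailed([-2.1, 4.3, 0.2]))   # True  (entailment wins)
print(is_entailed([3.0, -1.5, 0.4]))   # False (contradiction wins)
```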

📊 Background

This metric was evaluated on a 500-example benchmark built with truthbench, applying perturbation levels A0–A4 on top of the Google Natural Questions dataset.

See full results in the trutheval project.
