
TruthTorchLM is an open-source library designed to detect and mitigate hallucinations in text generation models. The library integrates state-of-the-art methods, offers comprehensive benchmarking tools across various tasks, and enables seamless integration with popular frameworks like Huggingface and LiteLLM.


TruthTorchLM: A Comprehensive Library for Assessing Truthfulness in LLM Outputs


Features

  • State-of-the-Art Methods: Offers 23 truth methods designed to assess the truthfulness of LLM generations, ranging from Google Search verification to uncertainty estimation and multi-LLM collaboration techniques.
  • Integration: Fully compatible with Huggingface and LiteLLM, enabling users to incorporate truth evaluation into their workflows with minimal code changes.
  • Evaluation Tools: Benchmark truth methods using various metrics including AUROC, AUPRC, PRR, and Accuracy.
  • Calibration: Normalize and calibrate truth methods for interpretable and comparable outputs.
  • Long-Form Generation: Adapts truth methods to assess truthfulness in long-form text generations effectively.
  • Extendability: Provides an intuitive interface for implementing new truth methods.

Installation

Create a new environment with Python >= 3.10:

conda create --name truthtorchlm python=3.10
conda activate truthtorchlm

Then, install TruthTorchLM using pip:

pip install TruthTorchLM

Quick Start

Setting Up a Model

You can define your model and tokenizer using Huggingface or specify an API-based model:

from transformers import AutoModelForCausalLM, AutoTokenizer
import TruthTorchLM as ttlm
import torch

# Huggingface model
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", 
    torch_dtype=torch.bfloat16
).to('cuda:0')
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", use_fast=False)

# API model
api_model = "gpt-4o"

Generating Text with Truth Values

TruthTorchLM pairs each generated message with a truth value that indicates whether the output is likely truthful. Various methods (called truth methods) can be used for this purpose; each has its own algorithm and output range, but higher truth values generally suggest truthful outputs. This functionality is most useful for short-form QA:

# Define truth methods
lars = ttlm.truth_methods.LARS()
confidence = ttlm.truth_methods.Confidence()
self_detection = ttlm.truth_methods.SelfDetection(number_of_questions=5)
truth_methods = [lars, confidence, self_detection]
# Define a chat history
chat = [{"role": "system", "content": "You are a helpful assistant. Give short and precise answers."},
        {"role": "user", "content": "What is the capital city of France?"}]
# Generate text with truth values (Huggingface model)
output_hf_model = ttlm.generate_with_truth_value(
    model=model,
    tokenizer=tokenizer,
    messages=chat,
    truth_methods=truth_methods,
    max_new_tokens=100,
    temperature=0.7
)
# Generate text with truth values (API model)
output_api_model = ttlm.generate_with_truth_value(
    model=api_model,
    messages=chat,
    truth_methods=truth_methods
)
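Once calibrated to a common scale, scores from several truth methods can be combined in simple ways downstream. The toy sketch below thresholds hypothetical scores and takes a majority vote; the numbers and the 0.5 threshold are made up for illustration and are not part of the library's API:

```python
# Toy sketch: combine several truth-method scores by thresholded majority vote.
# The scores and threshold below are invented; real truth methods produce
# scores on method-specific scales until calibrated.

def majority_vote(scores, threshold=0.5):
    """Flag an output as truthful if most methods score above the threshold."""
    votes = [s >= threshold for s in scores]
    return sum(votes) > len(votes) / 2

# Hypothetical calibrated scores from three truth methods for one generation
scores = [0.82, 0.67, 0.41]
print(majority_vote(scores))  # two of three methods exceed 0.5 -> True
```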

Calibrating Truth Methods

Truth values from different methods may not be directly comparable. Use the calibrate_truth_method function to normalize truth values to a common range for better interpretability:

model_judge = ttlm.evaluators.ModelJudge('gpt-4o-mini')
calibration_results = ttlm.calibrate_truth_method(
    dataset='trivia_qa',
    model=model,
    truth_methods=truth_methods,
    tokenizer=tokenizer,
    correctness_evaluator=model_judge,
    size_of_data=1000,
    max_new_tokens=64
)
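Conceptually, calibration fits a mapping from each method's raw scores to a shared scale. The sketch below uses simple min-max normalization fitted on hypothetical calibration scores; it is a conceptual stand-in to illustrate the idea, not the algorithm calibrate_truth_method actually uses:

```python
# Minimal calibration sketch: fit min-max normalization on calibration-set
# scores, then map new raw scores into [0, 1]. Scores here are invented.

def fit_minmax(calibration_scores):
    lo, hi = min(calibration_scores), max(calibration_scores)
    span = hi - lo or 1.0  # guard against constant scores
    return lambda s: min(1.0, max(0.0, (s - lo) / span))

calibrate = fit_minmax([-3.2, -1.0, 0.4, 2.7, 5.1])
print(round(calibrate(0.4), 3))  # a raw score of 0.4 maps to roughly 0.434
```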

Evaluating Truth Methods

We can evaluate truth methods with the evaluate_truth_method function, using various evaluation metrics including AUROC, AUPRC, AUARC, Accuracy, F1, Precision, Recall, and PRR:

results = ttlm.evaluate_truth_method(
    dataset='trivia_qa',
    model=model,
    truth_methods=truth_methods,
    eval_metrics=['auroc', 'prr'],
    tokenizer=tokenizer,
    size_of_data=1000,
    correctness_evaluator=model_judge,
    max_new_tokens=64
)
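As a self-contained illustration of one metric above: AUROC equals the probability that a randomly chosen correct generation receives a higher truth value than a randomly chosen incorrect one (ties counting half). A toy computation on made-up scores and labels:

```python
# Toy AUROC: fraction of (correct, incorrect) pairs where the correct
# generation got the higher truth value, counting ties as half a win.

def auroc(truth_values, labels):
    pos = [t for t, y in zip(truth_values, labels) if y == 1]
    neg = [t for t, y in zip(truth_values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.3, 0.2, 0.6]   # hypothetical truth values
labels = [1,   1,   0,   0,   1]     # 1 = generation judged correct
print(auroc(scores, labels))  # -> 1.0 (perfect separation)
```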

Truthfulness in Long-Form Generation

Assigning a single truth value to a long text is neither practical nor useful. TruthTorchLM therefore first decomposes the generated text into short, single-sentence statements and assigns truth values to these statements using statement check methods. The long_form_generation_with_truth_value function returns the generated text, the decomposed statements, and their truth values.

import TruthTorchLM.long_form_generation as LFG
from transformers import DebertaForSequenceClassification, DebertaTokenizer

# define a decomposition method that breaks the long text into statements
decomposition_method = LFG.decomposition_methods.StructuredDecompositionAPI(model="gpt-4o-mini", decomposition_depth=1, instruction=ttlm.LFG_DECOMPOSITION_PROMPT)  # use an API model to decompose text
# decomposition_method = LFG.decomposition_methods.StructuredDecompositionLocal(model, tokenizer, decomposition_depth=1, chat_template=DECOMPOSITION_PROMPT)  # use a HF model to decompose text

# an entailment model is used by some truth methods and statement check methods
model_for_entailment = DebertaForSequenceClassification.from_pretrained('microsoft/deberta-large-mnli').to('cuda:0')
tokenizer_for_entailment = DebertaTokenizer.from_pretrained('microsoft/deberta-large-mnli')
# define truth methods
confidence = ttlm.truth_methods.Confidence()
lars = ttlm.truth_methods.LARS()

# define the statement check methods that apply truth methods
qa_generation = LFG.statement_check_methods.QuestionAnswerGeneration(model="gpt-4o-mini", tokenizer=None, num_questions=2, max_answer_trials=2,
                                                                     truth_methods=[confidence, lars], seed=0,
                                                                     instruction=ttlm.LFG_QUESTION_GENERATION_PROMPT,
                                                                     first_statement_instruction=ttlm.LFG_QUESTION_GENERATION_PROMPT,
                                                                     entailment_model=model_for_entailment, entailment_tokenizer=tokenizer_for_entailment)  # an HF model and tokenizer can also be used; the LM generates the questions
# some statement check methods are designed directly for this purpose and do not use truth methods
as_entailment = LFG.statement_check_methods.AnswerStatementEntailment(model="gpt-4o-mini", tokenizer=None,
                                                                      num_questions=3, num_answers_per_question=2,
                                                                      instruction=ttlm.LFG_QUESTION_GENERATION_PROMPT,
                                                                      first_statement_instruction=ttlm.LFG_QUESTION_GENERATION_PROMPT,
                                                                      entailment_model=model_for_entailment, entailment_tokenizer=tokenizer_for_entailment)  # an HF model and tokenizer can also be used; the LM generates the questions
# define a chat history
chat = [{"role": "system", "content": 'You are a helpful assistant. Give brief and precise answers.'},
        {"role": "user", "content": 'Who is Ryan Reynolds?'}]

# generate a message with truth values; a wrapper function for model.generate in Huggingface
output_hf_model = LFG.long_form_generation_with_truth_value(model=model, tokenizer=tokenizer, messages=chat, fact_decomp_method=decomposition_method, 
                                          stmt_check_methods=[qa_generation, as_entailment], generation_seed=0)

# generate a message with truth values; a wrapper function for litellm.completion in litellm
output_api_model = LFG.long_form_generation_with_truth_value(model="gpt-4o-mini", messages=chat, fact_decomp_method=decomposition_method, 
                                          stmt_check_methods=[qa_generation, as_entailment], generation_seed=0, seed=0)
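Once statements and their truth values are available, a natural downstream step is to surface the least-supported statements. The toy sketch below works over invented decomposition output; the library's actual return structure may differ:

```python
# Toy sketch: rank decomposed statements by truth value and flag the ones
# below a chosen threshold. Statements and scores here are invented.

def flag_statements(statements, truth_values, threshold=0.5):
    ranked = sorted(zip(statements, truth_values), key=lambda p: p[1])
    return [s for s, t in ranked if t < threshold]

statements = [
    "Ryan Reynolds is a Canadian actor.",
    "He was born in Vancouver in 1976.",
    "He directed the film Titanic.",   # an unsupported claim
]
truth_values = [0.93, 0.81, 0.12]
print(flag_statements(statements, truth_values))  # only the Titanic claim is flagged
```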

Evaluation of Truth Methods in Long-Form Generation

We can evaluate truth methods on long-form generation using the evaluate_truth_method_long_form function. To obtain the correctness of the statements we follow the SAFE paper: SAFE performs a Google search for each statement and labels it as supported, unsupported, or irrelevant. Available evaluation metrics include AUROC, AUPRC, AUARC, Accuracy, F1, Precision, Recall, and PRR.

Note: Calibrating truth methods before running evaluation is recommended.

# create a SAFE object that assigns labels to the statements
safe = LFG.ClaimEvaluator(rater='gpt-4o-mini', tokenizer=None, max_steps=2, max_retries=2, num_searches=2)

# define metrics
sample_level_eval_metrics = ['f1']  # compute the metric over the statements of each question, then average across questions
dataset_level_eval_metrics = ['auroc', 'prr']  # compute the metric across all statements
results = LFG.evaluate_truth_method_long_form(dataset='longfact_objects', model='gpt-4o-mini', tokenizer=None,
                                sample_level_eval_metrics=sample_level_eval_metrics, dataset_level_eval_metrics=dataset_level_eval_metrics,
                                fact_decomp_method=decomposition_method, stmt_check_methods=[qa_generation],
                                claim_evaluator=safe, size_of_data=3, previous_context=[{'role': 'system', 'content': 'You are a helpful assistant. Give precise answers.'}],
                                user_prompt="Question: {question_context}", seed=41, return_method_details=False, return_calim_eval_details=False, wandb_run=None,
                                add_generation_prompt=True, continue_final_message=False)
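The distinction between the two metric levels can be made concrete: sample-level metrics are computed per question and then averaged, while dataset-level metrics pool all statements. The toy example below uses accuracy rather than F1 or AUROC for brevity, but the aggregation pattern is the same:

```python
# Toy contrast between sample-level and dataset-level aggregation, using
# accuracy for simplicity. Each inner list holds per-statement correctness
# (1 = verdict matched the SAFE-style label) for one question.

per_question = [[1, 1, 1, 1], [0, 1]]  # question 1: 4 statements; question 2: 2

def sample_level(groups):
    """Average the per-question metric across questions."""
    return sum(sum(g) / len(g) for g in groups) / len(groups)

def dataset_level(groups):
    """Compute one metric over all statements pooled together."""
    flat = [x for g in groups for x in g]
    return sum(flat) / len(flat)

print(sample_level(per_question))   # (1.0 + 0.5) / 2 = 0.75
print(dataset_level(per_question))  # 5 / 6, roughly 0.833
```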

Available Truth Methods


Contributors


Citation

If you use TruthTorchLM in your research, please cite:

@misc{truthtorchlm2025,
  title={TruthTorchLM: A Comprehensive Library for Assessing Truthfulness in LLM Outputs},
  author={Yavuz Faruk Bakman and Duygu Nur Yaldiz and Sungmin Kang and Hayrettin Eren Yildiz and Alperen Ozis and Salman Avestimehr},
  year={2025},
  howpublished={GitHub},
  url={https://github.com/Ybakman/TruthTorchLM}
}

License

TruthTorchLM is released under the MIT License.

For inquiries or support, feel free to contact the maintainers.
