
Project description

TruthTorchLM: A Comprehensive Library for Hallucination Detection in LLMs

TruthTorchLM is an open-source library designed to detect and mitigate hallucinations in text generation models. The library integrates state-of-the-art methods, offers comprehensive benchmarking tools across various tasks, and enables seamless integration with popular frameworks like Huggingface and LiteLLM.


Features

  • State-of-the-Art Methods: Implementations of advanced hallucination detection techniques.
  • Evaluation Tools: Benchmark hallucination detection methods using various metrics like AUROC, PRR, and Accuracy.
  • Calibration: Normalize and calibrate truth values for interpretable and comparable hallucination scores.
  • Integration: Seamlessly works with Huggingface and LiteLLM.
  • Long-Form Generation: Adapts detection methods to handle long-form text generations effectively.
  • Extendability: Provides an intuitive interface for implementing new hallucination detection methods.
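To illustrate the extendability idea, here is a minimal sketch of what a custom truth method might look like. The class and method names below are hypothetical and chosen for illustration only; they are not TruthTorchLM's actual base classes or API.

```python
# Hypothetical sketch of a truth-method interface -- names are illustrative,
# not TruthTorchLM's actual API.
from abc import ABC, abstractmethod


class TruthMethod(ABC):
    """A truth method maps a model output to a scalar truth value."""

    @abstractmethod
    def score(self, question: str, answer: str) -> float:
        """Return a truth value; lower values suggest hallucination."""


class AnswerLengthHeuristic(TruthMethod):
    """Toy example: treat very short answers as less trustworthy."""

    def __init__(self, target_length: int = 20):
        self.target_length = target_length

    def score(self, question: str, answer: str) -> float:
        # Ratio of answer length to a target length, capped at 1.0.
        return min(len(answer) / self.target_length, 1.0)


method = AnswerLengthHeuristic()
print(method.score("What is the capital of France?", "Paris"))  # -> 0.25
```

A real method would of course inspect model internals (logits, sampled generations, self-consistency) rather than surface features, but the shape of the interface, a single scoring hook per generation, is the part a new method has to implement.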

Installation

Install TruthTorchLM using pip:

pip install TruthTorchLM

Quick Start

Setting Up a Model

You can define your model and tokenizer using Huggingface or specify an API-based model:

from transformers import AutoModelForCausalLM, AutoTokenizer
import TruthTorchLM as ttlm
import torch

# Huggingface model
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", 
    torch_dtype=torch.bfloat16
).to('cuda:0')
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", use_fast=False)

# API model
api_model = "gpt-4o"

Generating Text with Truth Values

TruthTorchLM generates responses accompanied by truth values that indicate whether the model output is likely a hallucination. Various methods (called truth methods) can be used to detect hallucinations; each has its own algorithm and output range, but lower truth values generally suggest hallucination. This functionality is most useful for short-form QA:

# Define truth methods
lars = ttlm.truth_methods.LARS()
confidence = ttlm.truth_methods.Confidence()
self_detection = ttlm.truth_methods.SelfDetection(number_of_questions=5)
truth_methods = [lars, confidence, self_detection]
# Define a chat history
chat = [{"role": "system", "content": "You are a helpful assistant. Give short and precise answers."},
        {"role": "user", "content": "What is the capital city of France?"}]
# Generate text with truth values (Huggingface model)
output_hf_model = ttlm.generate_with_truth_value(
    model=model,
    tokenizer=tokenizer,
    messages=chat,
    truth_methods=truth_methods,
    max_new_tokens=100,
    temperature=0.7
)
# Generate text with truth values (API model)
output_api_model = ttlm.generate_with_truth_value(
    model=api_model,
    messages=chat,
    truth_methods=truth_methods
)
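Once truth values are in hand, a simple way to act on them is to threshold each method's score and take a majority vote. The dictionary layout and threshold below are assumptions for demonstration; they are not TruthTorchLM's output format.

```python
# Illustrative post-processing of truth values. The structure and the 0.5
# threshold are assumptions for demonstration, not TruthTorchLM's format.
truth_values = {"LARS": 0.91, "Confidence": 0.87, "SelfDetection": 0.80}

# Once scores are comparable (e.g. after calibration), flag a likely
# hallucination when a majority of methods fall below the threshold.
THRESHOLD = 0.5
flags = [v < THRESHOLD for v in truth_values.values()]
likely_hallucination = sum(flags) > len(flags) / 2
print(likely_hallucination)  # -> False
```

Note that thresholding raw scores from different methods only makes sense after calibration (next section), since each method has its own output range.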

Calibrating Truth Methods

Truth values produced by different methods are not directly comparable. Use the calibrate_truth_method function to map them to a common range for better interpretability:

model_judge = ttlm.evaluators.ModelJudge('gpt-4o-mini')
calibration_results = ttlm.calibrate_truth_method(
    dataset='trivia_qa',
    model=model,
    truth_methods=truth_methods,
    tokenizer=tokenizer,
    correctness_evaluator=model_judge,
    size_of_data=1000,
    max_new_tokens=64
)
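To see why calibration matters, consider two methods whose raw scores live on very different scales. A generic way to make them comparable is to fit a simple min-max scaler on held-out scores; this is only a sketch of the idea, not the calibration algorithm TruthTorchLM actually uses.

```python
# Generic min-max calibration sketch -- not TruthTorchLM's algorithm.
def fit_min_max(held_out_scores):
    """Fit a scaler on held-out scores, mapping them to [0, 1]."""
    lo, hi = min(held_out_scores), max(held_out_scores)

    def calibrate(score):
        return (score - lo) / (hi - lo)

    return calibrate

# Method A scores in roughly [-10, 0]; method B in roughly [0, 1].
calib_a = fit_min_max([-9.5, -7.0, -4.2, -1.1])
calib_b = fit_min_max([0.05, 0.40, 0.75, 0.95])

# After calibration, both values live on the same [0, 1] scale.
print(round(calib_a(-4.2), 3))  # -> 0.631
print(round(calib_b(0.75), 3))  # -> 0.778
```

The library's calibration additionally uses a correctness evaluator (a model judge in the example above) on a labeled dataset, so the calibrated scores reflect actual correctness rates rather than just the spread of raw scores.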

Evaluating Truth Methods

Evaluate truth methods with the evaluate_truth_method function, which supports metrics including AUROC, AUPRC, AUARC, accuracy, F1, precision, recall, and PRR:

results = ttlm.evaluate_truth_method(
    dataset='trivia_qa',
    model=model,
    truth_methods=truth_methods,
    eval_metrics=['auroc', 'prr'],
    tokenizer=tokenizer,
    size_of_data=1000,
    correctness_evaluator=model_judge,
    max_new_tokens=64
)
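As a reference point for what the 'auroc' metric measures: AUROC is the probability that a correct generation receives a higher truth value than an incorrect one. The small implementation below computes it from first principles and is independent of TruthTorchLM.

```python
# AUROC from first principles: the probability that a correct generation
# scores higher than an incorrect one (ties count as half).
def auroc(truth_values, labels):
    """labels: 1 for correct generations, 0 for incorrect ones."""
    pos = [v for v, y in zip(truth_values, labels) if y == 1]
    neg = [v for v, y in zip(truth_values, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Three correct answers, one incorrect; the incorrect one outscores
# one correct answer, so 2 of 3 pairs are ordered correctly.
scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 1, 0, 1]
print(auroc(scores, labels))  # -> 0.666...
```

An AUROC of 0.5 means the truth method is no better than chance at separating correct from incorrect generations; 1.0 means perfect separation.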

Available Hallucination Detection Methods


Contributors


Citation

If you use TruthTorchLM in your research, please cite:

@misc{truthtorchlm2025,
  title={TruthTorchLM: A Comprehensive Library for Hallucination Detection in Large Language Models},
  author={Yavuz Faruk Bakman and Duygu Nur Yaldiz and Sungmin Kang and Hayrettin Eren Yildiz and Alperen Ozis},
  year={2025},
  howpublished={GitHub},
  url={https://github.com/Ybakman/TruthTorchLM}
}

License

TruthTorchLM is released under the MIT License.

For inquiries or support, feel free to contact the maintainers.
