
Project description

TruthTorchLM: A Comprehensive Library for Hallucination Detection in LLMs

TruthTorchLM is an open-source library designed to detect and mitigate hallucinations in text generation models. The library integrates state-of-the-art methods, offers comprehensive benchmarking tools across various tasks, and enables seamless integration with popular frameworks like Huggingface and LiteLLM.


Features

  • State-of-the-Art Methods: Implementations of advanced hallucination detection techniques.
  • Evaluation Tools: Benchmark hallucination detection methods using various metrics like AUROC, PRR, and Accuracy.
  • Calibration: Normalize and calibrate truth values for interpretable and comparable hallucination scores.
  • Integration: Seamlessly works with Huggingface and LiteLLM.
  • Long-Form Generation: Adapts detection methods to handle long-form text generations effectively.
  • Extendability: Provides an intuitive interface for implementing new hallucination detection methods.
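To illustrate the extendability point: conceptually, a truth method just maps a (question, answer) pair to a score. The base class and method name below are hypothetical placeholders, not TruthTorchLM's actual interface (which is not shown on this page); a real implementation would subclass the library's own base class and typically reuse model signals such as token probabilities.

```python
class TruthMethod:
    """Hypothetical base interface, for illustration only.
    TruthTorchLM's real base class is not documented on this page."""

    def compute_truth_value(self, question: str, answer: str) -> float:
        raise NotImplementedError


class AnswerLengthHeuristic(TruthMethod):
    """Toy method: score shrinks as the answer gets longer.
    Not a real detection technique, just an interface demo."""

    def compute_truth_value(self, question: str, answer: str) -> float:
        # 1 / (1 + word count): a single-word answer scores 0.5
        return 1.0 / (1.0 + len(answer.split()))
```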

Installation

Install TruthTorchLM using pip:

pip install TruthTorchLM

Quick Start

Setting Up a Model

You can define your model and tokenizer using Huggingface or specify an API-based model:

from transformers import AutoModelForCausalLM, AutoTokenizer
import TruthTorchLM as ttlm
import torch

# Huggingface model
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", 
    torch_dtype=torch.bfloat16
).to('cuda:0')
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", use_fast=False)

# API model
api_model = "gpt-4o"

Generating Text with Truth Values

TruthTorchLM generates text together with a truth value that indicates whether the model output is likely hallucinated. Various methods (called truth methods) can be used to detect hallucinations; each has its own algorithm and output range. Lower truth values generally suggest hallucination. This functionality is most useful for short-form QA:

# Define truth methods
lars = ttlm.truth_methods.LARS()
confidence = ttlm.truth_methods.Confidence()
self_detection = ttlm.truth_methods.SelfDetection(number_of_questions=5)
truth_methods = [lars, confidence, self_detection]
# Define a chat history
chat = [{"role": "system", "content": "You are a helpful assistant. Give short and precise answers."},
        {"role": "user", "content": "What is the capital city of France?"}]
# Generate text with truth values (Huggingface model)
output_hf_model = ttlm.generate_with_truth_value(
    model=model,
    tokenizer=tokenizer,
    messages=chat,
    truth_methods=truth_methods,
    max_new_tokens=100,
    temperature=0.7
)
# Generate text with truth values (API model)
output_api_model = ttlm.generate_with_truth_value(
    model=api_model,
    messages=chat,
    truth_methods=truth_methods
)
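Because lower truth values generally suggest hallucination, downstream code can threshold them to flag suspect generations. The snippet below is a generic sketch, independent of TruthTorchLM's actual output format; the scores and threshold are hypothetical:

```python
def flag_hallucinations(truth_values, threshold=0.5):
    """Flag outputs whose truth value falls below a threshold.
    Lower truth values generally suggest hallucination."""
    return [v < threshold for v in truth_values]


# Hypothetical scores from three truth methods for one generation
scores = [0.91, 0.34, 0.78]
flags = flag_hallucinations(scores, threshold=0.5)
# Only the second method's score falls below the threshold here
```

In practice the threshold would be chosen per method (or after calibration, as described below), since raw output ranges differ between methods.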

Calibrating Truth Methods

Truth values produced by different methods are not directly comparable. Use the calibrate_truth_method function to normalize them to a common range for better interpretability:

model_judge = ttlm.evaluators.ModelJudge('gpt-4o-mini')
calibration_results = ttlm.calibrate_truth_method(
    dataset='trivia_qa',
    model=model,
    truth_methods=truth_methods,
    tokenizer=tokenizer,
    correctness_evaluator=model_judge,
    size_of_data=1000,
    max_new_tokens=64
)
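calibrate_truth_method fits its calibration on a labeled dataset; the exact algorithm is not documented on this page. As a minimal sketch of the underlying idea only, a simple min-max rescaling maps raw scores from any method onto a common [0, 1] range (an illustration, not the library's procedure):

```python
def min_max_calibrate(raw_scores):
    """Rescale raw truth values onto [0, 1] so that scores from
    different methods become comparable. Illustrative only."""
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:
        # Degenerate case: all scores identical, no ordering information
        return [0.5 for _ in raw_scores]
    return [(s - lo) / (hi - lo) for s in raw_scores]


# Hypothetical raw scores from a method with an unbounded range
calibrated = min_max_calibrate([-3.0, -1.0, 1.0])
# Monotonic mapping: relative ordering of scores is preserved
```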

Evaluating Truth Methods

We can evaluate the truth methods with the evaluate_truth_method function. Supported evaluation metrics include AUROC, AUPRC, AUARC, accuracy, F1, precision, recall, and PRR:

results = ttlm.evaluate_truth_method(
    dataset='trivia_qa',
    model=model,
    truth_methods=truth_methods,
    eval_metrics=['auroc', 'prr'],
    tokenizer=tokenizer,
    size_of_data=1000,
    correctness_evaluator=model_judge,
    max_new_tokens=64
)
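To make the AUROC metric concrete: it is the probability that a correct (non-hallucinated) answer receives a higher truth value than an incorrect one, with ties counted as half. A minimal pure-Python version, independent of the library's implementation:

```python
def auroc(truth_values, labels):
    """AUROC over (truth value, correctness label) pairs.
    labels: 1 = answer judged correct, 0 = incorrect.
    Returns the probability that a random correct answer
    outscores a random incorrect one (ties count 0.5)."""
    pos = [v for v, y in zip(truth_values, labels) if y == 1]
    neg = [v for v, y in zip(truth_values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A perfect truth method scores 1.0 (every correct answer outranks every incorrect one); a method no better than chance scores around 0.5.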

Available Hallucination Detection Methods


Contributors


Citation

If you use TruthTorchLM in your research, please cite:

@misc{truthtorchlm2025,
  title={TruthTorchLM: A Comprehensive Library for Hallucination Detection in Large Language Models},
  author={Yavuz Faruk Bakman and Duygu Nur Yaldiz and Sungmin Kang and Hayrettin Eren Yildiz and Alperen Ozis},
  year={2025},
  howpublished={GitHub},
  url={https://github.com/Ybakman/TruthTorchLM}
}

License

TruthTorchLM is released under the MIT License.

For inquiries or support, feel free to contact the maintainers.

