TruthTorchLM: A Comprehensive Library for Hallucination Detection in LLMs

TruthTorchLM is an open-source library designed to detect and mitigate hallucinations in text generation models. The library integrates state-of-the-art methods, offers comprehensive benchmarking tools across various tasks, and enables seamless integration with popular frameworks like Huggingface and LiteLLM.

Features

State-of-the-Art Methods: Implementations of advanced hallucination detection techniques.
Evaluation Tools: Benchmark hallucination detection methods using various metrics like AUROC, PRR, and Accuracy.
Calibration: Normalize and calibrate truth values for interpretable and comparable hallucination scores.
Integration: Seamlessly works with Huggingface and LiteLLM.
Long-Form Generation: Adapts detection methods to handle long-form text generations effectively.
Extendability: Provides an intuitive interface for implementing new hallucination detection methods.

Installation

Install TruthTorchLM using pip:

pip install TruthTorchLM

Quick Start

Setting Up a Model

You can define your model and tokenizer using Huggingface or specify an API-based model:

from transformers import AutoModelForCausalLM, AutoTokenizer
import TruthTorchLM as ttlm
import torch

# Huggingface model
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    torch_dtype=torch.bfloat16
).to('cuda:0')
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf", use_fast=False)

# API model
api_model = "gpt-4o"

Generating Text with Truth Values

TruthTorchLM returns each generated message together with truth values that indicate how likely the output is to be hallucinated. The methods that compute these values are called truth methods. Each truth method uses its own algorithm and output range, so raw values from different methods are not directly comparable. Lower truth values generally suggest hallucination. This functionality is most useful for short-form QA:

# Define truth methods
lars = ttlm.truth_methods.LARS()
confidence = ttlm.truth_methods.Confidence()
self_detection = ttlm.truth_methods.SelfDetection(number_of_questions=5)
truth_methods = [lars, confidence, self_detection]

# Define a chat history
chat = [{"role": "system", "content": "You are a helpful assistant. Give short and precise answers."},
        {"role": "user", "content": "What is the capital city of France?"}]

# Generate text with truth values (Huggingface model)
output_hf_model = ttlm.generate_with_truth_value(
    model=model,
    tokenizer=tokenizer,
    messages=chat,
    truth_methods=truth_methods,
    max_new_tokens=100,
    temperature=0.7
)
# Generate text with truth values (API model)
output_api_model = ttlm.generate_with_truth_value(
    model=api_model,
    messages=chat,
    truth_methods=truth_methods
)
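
To make the idea of a truth value concrete, here is a minimal sketch of one simple confidence-style score: the mean log-probability of the generated tokens. It is illustrative only and does not call TruthTorchLM. The function name sequence_confidence and the example probabilities are assumptions, and the library's own Confidence method may differ in detail.

```python
import math
from typing import Sequence


def sequence_confidence(token_probs: Sequence[float]) -> float:
    """Return the mean log-probability of the generated tokens.

    Hypothetical helper, not part of TruthTorchLM. Values closer to 0 mean
    the model assigned high probability to its own output; more negative
    values mean lower confidence.
    """
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("each probability must be in (0, 1]")
    return sum(math.log(p) for p in token_probs) / len(token_probs)


# Made-up per-token probabilities for two answers (illustrative only).
confident_answer = [0.95, 0.90, 0.98]
uncertain_answer = [0.40, 0.25, 0.60]

score_confident = sequence_confidence(confident_answer)
score_uncertain = sequence_confidence(uncertain_answer)

# The answer generated with higher token probabilities gets the higher score.
print(score_confident > score_uncertain)
```

Averaging over tokens rather than summing keeps long answers from being penalized simply for their length, which is one reason length normalization is a common choice for this kind of score.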

Calibrating Truth Methods

Truth values from different methods may not be directly comparable. Use the calibrate_truth_method function to normalize truth values to a common range for better interpretability:

model_judge = ttlm.evaluators.ModelJudge('gpt-4o-mini')
calibration_results = ttlm.calibrate_truth_method(
    dataset='trivia_qa',
    model=model,
    truth_methods=truth_methods,
    tokenizer=tokenizer,
    correctness_evaluator=model_judge,
    size_of_data=1000,
    max_new_tokens=64
)
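
The idea of mapping raw scores onto a common range can be sketched without the library. The example below is an illustration under stated assumptions, not TruthTorchLM's calibration code: it centers raw scores on a reference point and passes them through a sigmoid so that every method lands in (0, 1). The helper names fit_sigmoid_normalizer and normalize, and the choice of mean and standard deviation as center and scale, are hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def fit_sigmoid_normalizer(raw_scores: Sequence[float]) -> Tuple[float, float]:
    """Estimate a center and scale from calibration scores (hypothetical helper)."""
    if len(raw_scores) < 2:
        raise ValueError("need at least two calibration scores")
    center = statistics.fmean(raw_scores)
    scale = statistics.pstdev(raw_scores)
    # Guard against a zero spread so the division in normalize() is defined.
    if scale == 0:
        scale = 1.0
    return center, scale


def normalize(raw_score: float, center: float, scale: float) -> float:
    """Map a raw truth value into (0, 1); the center maps to 0.5."""
    z = (raw_score - center) / scale
    # Numerically stable sigmoid: never exponentiates a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Made-up raw scores from two methods with very different ranges.
log_prob_scores = [-2.5, -1.0, -0.2, -0.1]
count_scores = [1.0, 3.0, 8.0, 10.0]

for name, scores in [("log_prob", log_prob_scores), ("count", count_scores)]:
    center, scale = fit_sigmoid_normalizer(scores)
    normalized = [normalize(s, center, scale) for s in scores]
    print(name, [round(v, 3) for v in normalized])
```

After normalization both score lists fall in the same interval and preserve their original ordering, which is what makes values from different methods easier to compare.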

Evaluating Truth Methods

The evaluate_truth_method function evaluates truth methods on a dataset. Supported evaluation metrics include AUROC, AUPRC, AUARC, Accuracy, F1, Precision, Recall, and PRR:

results = ttlm.evaluate_truth_method(
    dataset='trivia_qa',
    model=model,
    truth_methods=truth_methods,
    eval_metrics=['auroc', 'prr'],
    tokenizer=tokenizer,
    size_of_data=1000,
    correctness_evaluator=model_judge,
    max_new_tokens=64
)
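
As a rough illustration of what the AUROC metric measures here, the sketch below computes it directly from its pairwise definition: the probability that a correct answer receives a higher truth value than an incorrect one. It is a standalone example, not the library's implementation; the function name auroc and the sample data are assumptions.

```python
from typing import Sequence


def auroc(truth_values: Sequence[float], is_correct: Sequence[bool]) -> float:
    """Pairwise AUROC: fraction of (correct, incorrect) pairs in which the
    correct answer has the higher truth value, counting ties as one half."""
    if len(truth_values) != len(is_correct):
        raise ValueError("truth_values and is_correct must have equal length")
    positives = [v for v, c in zip(truth_values, is_correct) if c]
    negatives = [v for v, c in zip(truth_values, is_correct) if not c]
    if not positives or not negatives:
        raise ValueError("need at least one correct and one incorrect answer")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up truth values and correctness labels (illustrative only).
truth_values = [0.9, 0.4, 0.6, 0.2]
is_correct = [True, False, True, True]

print(round(auroc(truth_values, is_correct), 3))
```

This quadratic-time version is meant for clarity. For real evaluations a rank-based implementation such as sklearn.metrics.roc_auc_score is the more practical choice.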

Available Hallucination Detection Methods

LARS: Do Not Design, Learn: A Trainable Scoring Function for Uncertainty Estimation in Generative LLMs.
Confidence: Uncertainty Estimation in Autoregressive Structured Prediction.
Entropy: Uncertainty Estimation in Autoregressive Structured Prediction.
SelfDetection: Knowing What LLMs DO NOT Know: A Simple Yet Effective Self-Detection Method.
AttentionScore: LLM-Check: Investigating Detection of Hallucinations in Large Language Models.
CrossExamination: LM vs LM: Detecting Factual Errors via Cross Examination.
EccentricityConfidence: Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models.
EccentricityUncertainty: Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models.
GoogleSearchCheck: FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios.
Inside: INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection.
KernelLanguageEntropy: Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities.
MARS: MARS: Meaning-Aware Response Scoring for Uncertainty Estimation in Generative LLMs.
MatrixDegreeConfidence: Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models.
MatrixDegreeUncertainty: Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models.
MultiLLMCollab: Don’t Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration.
NumSemanticSetUncertainty: Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation.
PTrue: Language Models (Mostly) Know What They Know.
Saplma: The Internal State of an LLM Knows When It’s Lying.
SemanticEntropy: Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation.
sentSAR: Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models.
SumEigenUncertainty: Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models.
tokenSAR: Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models.
VerbalizedConfidence: Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback.

Contributors

Yavuz Faruk Bakman (ybakman@usc.edu)
Duygu Nur Yaldiz (yaldiz@usc.edu)
Sungmin Kang (kangsung@usc.edu)
Hayrettin Eren Yildiz (hayereyil@gmail.com)
Alperen Ozis (alperenozis@gmail.com)

Citation

If you use TruthTorchLM in your research, please cite:

@misc{truthtorchlm2025,
  title={TruthTorchLM: A Comprehensive Library for Hallucination Detection in Large Language Models},
  author={Yavuz Faruk Bakman and Duygu Nur Yaldiz and Sungmin Kang and Hayrettin Eren Yildiz and Alperen Ozis},
  year={2025},
  howpublished={GitHub},
  url={https://github.com/Ybakman/TruthTorchLM}
}

License

TruthTorchLM is released under the MIT License.

For inquiries or support, feel free to contact the maintainers.
