A utility library for dataset generation and clustering

Project description

HEDGE: Hallucination Estimation via Dense Geometric Entropy

HEDGE provides the code and Python package that accompany the paper "HEDGE: Hallucination Estimation via Dense Geometric Entropy for Medical VQA with Vision-Language Models." The library offers utilities for sampling answers from multimodal models, clustering them through logical and embedding-based strategies, and computing hallucination detection metrics across benchmarks such as VQA-RAD and KvasirVQA.

Installation

The utilities published in this repository are available on PyPI as hedge-bench.

pip install hedge-bench

You can also install the package directly from the GitHub source:

pip install git+https://github.com/SushantGautam/HEDGE.git
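To confirm that the installation succeeded, one option is to query the installed distribution metadata from Python. The sketch below uses only the standard library; the helper name installed_version is hypothetical and is not part of hedge-bench.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "hedge-bench") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found:
        print(f"hedge-bench version: {found}")
    else:
        print("hedge-bench is not installed in this environment")
```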

Quickstart

The snippet below shows a minimal end-to-end example adapted from tmp_test.py, demonstrating how to generate answer samples, apply both embedding- and NLI-based clustering, and evaluate hallucination detection metrics.

from datasets import load_dataset
from transformers import pipeline

from hedge_bench.utils import (
    PROMPT_VARIANTS,
    add_hallucination_labels_vllm,
    apply_nli_clustering,
    compute_roc_aucs,
    generate_and_cache_dataset,
    generate_answers,
    optimize_and_apply_embed_clustering,
)

# 1) Prepare a small VQA-RAD subset (first 10 test examples).
#    Slicing in the split string avoids decoding the whole test set.
n_samples = 3
vqa_dict = [
    {"idx": i, "image": sample["image"], "question": sample["question"], "answer": sample["answer"]}
    for i, sample in enumerate(load_dataset("flaviagiammarino/vqa-rad", split="test[:10]"))
]

generated = generate_and_cache_dataset(
    dataset_id="vqa_rad_test",
    num_samples=n_samples,
    vqa_dict=vqa_dict,
    force_regenerate=False,
    n_jobs=40,
)

# 2) Sample answers from a vision-language model
answers = generate_answers(
    generated,
    n_answers_high=n_samples,
    min_temp=0.1,
    max_temp=1.0,
    prompt_variants=PROMPT_VARIANTS,
    model="Qwen/Qwen2.5-VL-7B-Instruct",
)

# 3) Label hallucinations using a VLM judge and cluster by embeddings
answers = add_hallucination_labels_vllm(answers)
answers_embed, threshold, _ = optimize_and_apply_embed_clustering(answers)

# 4) Optionally, also try clustering with an NLI model and compute ROC AUCs
nli = pipeline("text-classification", model="microsoft/deberta-large-mnli", top_k=None, truncation=True)
answers_clustered = apply_nli_clustering(answers_embed, nli, batch_size=768)

aucs = compute_roc_aucs(answers_clustered)
print(f"Embedding clustering optimal threshold = {threshold:.3f}")
print(aucs)
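The final step summarizes how well an uncertainty score separates hallucinated answers from correct ones. As a rough illustration of what a ROC AUC measures, the sketch below computes it in pure Python as the fraction of (hallucinated, non-hallucinated) pairs in which the hallucinated answer receives the higher score. The labels and scores are made-up toy values, and this is not a description of how compute_roc_aucs is implemented.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) ROC AUC; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed): 1 = hallucinated, 0 = not hallucinated.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.5, 0.1]
print(f"ROC AUC = {roc_auc(labels, scores):.3f}")
```

A value near 1.0 means the score ranks hallucinated answers above correct ones almost always; 0.5 corresponds to chance.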

Project layout

hedge_bench/algorithms.py – reference implementations of uncertainty estimators, clustering strategies, and scoring utilities.
hedge_bench/utils.py – high-level helper functions for dataset caching, answer generation, labeling, and evaluation (as used in the quickstart example).
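For readers unfamiliar with threshold-based embedding clustering, the following is a generic sketch of the idea: answers whose embeddings exceed a cosine-similarity threshold are grouped, and the entropy of the resulting cluster sizes serves as an uncertainty signal. It is an assumption-laden illustration, not the code in hedge_bench/algorithms.py; the function names and the toy 2-D embeddings are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def threshold_cluster(embeddings: Sequence[Sequence[float]], threshold: float) -> List[int]:
    """Greedy clustering: join the first cluster whose representative is similar enough."""
    reps: List[Sequence[float]] = []
    assignment: List[int] = []
    for emb in embeddings:
        for cid, rep in enumerate(reps):
            if cosine(emb, rep) >= threshold:
                assignment.append(cid)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            reps.append(emb)
            assignment.append(len(reps) - 1)
    return assignment


def cluster_entropy(assignment: Sequence[int]) -> float:
    """Shannon entropy (in nats) of the empirical cluster-size distribution."""
    total = len(assignment)
    counts: Dict[int, int] = {}
    for cid in assignment:
        counts[cid] = counts.get(cid, 0) + 1
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


# Toy embeddings (assumed): two near-identical answers and one outlier.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
clusters = threshold_cluster(embeddings, threshold=0.8)
print(clusters, round(cluster_entropy(clusters), 4))
```

Higher entropy means the sampled answers disagree more, which is the kind of signal a hallucination detector can threshold on.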

Citation

If you use HEDGE in your work, please cite the associated paper.

HEDGE: Hallucination Estimation via Dense Geometric Entropy for Medical VQA with Vision-Language Models

License

This project is released under the MIT License. See LICENSE for details.

Project details

Maintainers

sushantgautam

Release history

0.1.2 (this version), Oct 28, 2025

Download files

Source Distribution

hedge_bench-0.1.2.tar.gz (14.9 kB)

Uploaded Oct 28, 2025 Source

Built Distribution

hedge_bench-0.1.2-py3-none-any.whl (14.2 kB)

Uploaded Oct 28, 2025 Python 3

File details

Details for the file hedge_bench-0.1.2.tar.gz.

File metadata

Download URL: hedge_bench-0.1.2.tar.gz
Upload date: Oct 28, 2025
Size: 14.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for hedge_bench-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`3ba73614b474bd58a4ebcf4bb47c51222f36fa5a0b6ac10f790aa86ed3f95bb6`
MD5	`126e81bf30d136359c7e1ebaf402d2f6`
BLAKE2b-256	`3649b80a16a803e996a3d417488ec01c86838198417c352c027fc137583071aa`
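A downloaded file can be checked against the SHA256 digest listed above before installing it. The sketch below uses only the Python standard library; the helper names and the assumption that the file sits in the current working directory are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the published hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local download location; adjust to wherever the file was saved.
    sdist = Path("hedge_bench-0.1.2.tar.gz")
    expected = "3ba73614b474bd58a4ebcf4bb47c51222f36fa5a0b6ac10f790aa86ed3f95bb6"
    if sdist.exists():
        print("SHA256 OK" if matches_sha256(sdist, expected) else "SHA256 MISMATCH")
    else:
        print(f"{sdist} not found; download it first")
```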

Provenance

The following attestation bundles were made for hedge_bench-0.1.2.tar.gz:

Publisher: publish.yml on SushantGautam/HEDGE

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: hedge_bench-0.1.2.tar.gz
- Subject digest: 3ba73614b474bd58a4ebcf4bb47c51222f36fa5a0b6ac10f790aa86ed3f95bb6
- Sigstore transparency entry: 648461461
- Sigstore integration time: Oct 28, 2025
Source repository:
- Permalink: SushantGautam/HEDGE@e964f1082ca105b83682c538b3e32126f3b57c6c
- Branch / Tag: refs/tags/0.1.2.0
- Owner: https://github.com/SushantGautam
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e964f1082ca105b83682c538b3e32126f3b57c6c
- Trigger Event: release

File details

Details for the file hedge_bench-0.1.2-py3-none-any.whl.

File metadata

Download URL: hedge_bench-0.1.2-py3-none-any.whl
Upload date: Oct 28, 2025
Size: 14.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for hedge_bench-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5d745ccf7f605093c4ce32546779c01df50b281e3899f37b92635dab8c6e4304`
MD5	`1884c3d8062f8c86a7038fca139d2361`
BLAKE2b-256	`6fca818bd4eab500a13501b3bc376736e86e3b817463672cc7347c1216526021`

Provenance

The following attestation bundles were made for hedge_bench-0.1.2-py3-none-any.whl:

Publisher: publish.yml on SushantGautam/HEDGE

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: hedge_bench-0.1.2-py3-none-any.whl
- Subject digest: 5d745ccf7f605093c4ce32546779c01df50b281e3899f37b92635dab8c6e4304
- Sigstore transparency entry: 648461478
- Sigstore integration time: Oct 28, 2025
Source repository:
- Permalink: SushantGautam/HEDGE@e964f1082ca105b83682c538b3e32126f3b57c6c
- Branch / Tag: refs/tags/0.1.2.0
- Owner: https://github.com/SushantGautam
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e964f1082ca105b83682c538b3e32126f3b57c6c
- Trigger Event: release
