LHAB - Library Hallucinations Adversarial Benchmark

Evaluate LLM code generation for hallucinated (non-existent) libraries.

Part of the research paper "Library Hallucinations in LLMs: Risk Analysis Grounded in Developer Queries".

The full dataset and leaderboard are available on HuggingFace; the source code is on GitHub.

install

pip install lhab

usage

The package exposes three functions:

  • lhab.load_dataset(): loads the bundled benchmark dataset and returns a dictionary of splits (control, describe, specify), each containing a list of task records.

  • lhab.evaluate_responses(responses_file): evaluates LLM responses against the benchmark and detects hallucinated libraries. It saves the results to a JSON file and returns a dictionary with statistics per split and type, plus all hallucinated library names.

  • lhab.download_pypi_data(): downloads the latest PyPI package list, which serves as the ground truth for validation. It is called automatically on the first evaluation if the data is not already present.
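To give a feel for what a PyPI package list can look like as ground truth, here is a minimal sketch that parses a response in the JSON shape defined by PEP 691 (the PyPI Simple API) and normalizes names as described in PEP 503. This is an illustration only: the helper names `normalize` and `parse_simple_index` are hypothetical, the sample payload is hand-written, and nothing here is taken from how lhab stores or downloads its data.

```python
import json
import re


def normalize(name: str) -> str:
    """Normalize a project name following PEP 503 (runs of -, _, . become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_simple_index(payload: str) -> set[str]:
    """Extract normalized project names from a PEP 691 JSON index response."""
    data = json.loads(payload)
    return {normalize(project["name"]) for project in data["projects"]}


# Hand-written sample in the PEP 691 shape; not real downloaded data.
sample = json.dumps({
    "meta": {"api-version": "1.0"},
    "projects": [
        {"name": "NumPy"},
        {"name": "scikit_learn"},
        {"name": "requests"},
    ],
})

known_packages = parse_simple_index(sample)
```

Normalizing names before comparison matters because PyPI treats `scikit_learn`, `Scikit-Learn`, and `scikit.learn` as the same project.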

import lhab

dataset = lhab.load_dataset()
# {"control": [...], "describe": [...], "specify": [...]}

results = lhab.evaluate_responses("your_responses.jsonl")
# {"control": {...}, "describe": {...}, "specify": {...}, "hallucinations": {...}}
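The general idea of flagging a hallucinated library can be sketched with the standard library alone: collect the top-level modules a generated snippet imports and report those missing from a set of known names. This is a simplified illustration under stated assumptions, not lhab's actual detection logic; the function names, the `known` set, and the made-up module `fastjsonx` are all hypothetical.

```python
import ast


def imported_top_level_modules(source: str) -> set[str]:
    """Return the top-level module names imported by a Python snippet."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0); they refer to local code.
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


def unknown_modules(source: str, known: set[str]) -> set[str]:
    """Return imported modules that are absent from the known-names set."""
    return imported_top_level_modules(source) - known


# Hypothetical generated code; "fastjsonx" is an invented name for illustration.
generated = (
    "import json\n"
    "import numpy as np\n"
    "from fastjsonx.core import loads\n"
)
known = {"json", "numpy"}  # assumed ground truth for this example only

suspects = unknown_modules(generated, known)
```

A real check needs more care than this: import names do not always match distribution names (for example, `sklearn` is installed as `scikit-learn`), and standard-library modules must be excluded before comparing against a package index.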

A CLI command is also available:

lhab-eval your_responses.jsonl
