LM Pub Quiz
Evaluate language models using multiple choice items
This library implements a knowledge probing approach that uses a language model's inherent ability to estimate the log-likelihood of any given textual statement. For more information, visit the LM Pub Quiz website.
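To make the idea concrete, here is a minimal, self-contained sketch of ranking answer options by statement log-likelihood. It is not the library's implementation: a toy trigram model with add-one smoothing stands in for a real language model, and the corpus, the `{answer}` template placeholder, and all function names are assumptions made up for illustration.

```python
import math
from collections import Counter

# Toy "training data" standing in for a real language model (illustrative only).
CORPUS = (
    "the capital of france is paris . "
    "the capital of germany is berlin . "
    "the capital of italy is rome ."
).split()


def train_trigram(tokens):
    """Return a function giving log P(word | two previous words), add-one smoothed."""
    contexts = Counter(zip(tokens, tokens[1:]))
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    vocab_size = len(set(tokens))

    def log_prob(u, v, w):
        return math.log((trigrams[(u, v, w)] + 1) / (contexts[(u, v)] + vocab_size))

    return log_prob


def statement_log_likelihood(log_prob, statement):
    """Sum the trigram log-probabilities over a whitespace-tokenized statement."""
    tokens = statement.lower().split()
    return sum(log_prob(u, v, w) for u, v, w in zip(tokens, tokens[1:], tokens[2:]))


def answer_item(log_prob, template, options):
    """Fill each option into the template and pick the most likely statement."""
    scores = {
        option: statement_log_likelihood(log_prob, template.format(answer=option))
        for option in options
    }
    return max(scores, key=scores.get), scores


log_prob = train_trigram(CORPUS)
prediction, scores = answer_item(
    log_prob,
    "the capital of france is {answer} .",
    ["Paris", "Berlin", "Rome"],
)
print(prediction)
```

With a real model, the scoring function would be replaced by the model's own log-likelihood estimate; the ranking step stays the same.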
Getting started
This short guide should get you started. For more detailed information visit the documentation.
Installing the Package
You can install the package via pip:
pip install lm-pub-quiz
For alternative methods of installing the package, visit the documentation.
Example Usage
from lm_pub_quiz import Dataset, Evaluator

# Load the dataset
dataset = Dataset.from_name("BEAR")

# Load the model
evaluator = Evaluator.from_model(
    "gpt2",
    model_type="CLM",
)

# Run the evaluation and save the results
results = evaluator.evaluate_dataset(
    dataset,
    save_path="gpt2_results",
    batch_size=32,
)

# If the results are analyzed in a different session, they can be loaded from the file system
# results = DatasetResults.from_path("gpt2_results")

print("=== Overall score ===")
print(results.get_metrics("accuracy"))
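As a rough illustration of what an accuracy score over multiple choice items means, the sketch below counts an item as correct when the highest-scoring option is the gold answer. The `accuracy` helper and the log-likelihood values are hypothetical and made up for this example; the library may aggregate its metrics differently (for instance per relation), so treat this as an assumption rather than a description of `get_metrics`.

```python
def accuracy(item_scores, correct_indices):
    """Fraction of items whose highest-scoring option is the correct one."""
    if len(item_scores) != len(correct_indices):
        raise ValueError("need exactly one correct index per item")
    hits = 0
    for scores, gold in zip(item_scores, correct_indices):
        # Index of the option with the largest log-likelihood
        predicted = max(range(len(scores)), key=scores.__getitem__)
        hits += predicted == gold
    return hits / len(item_scores)


# Made-up log-likelihoods: one list per item, one value per answer option.
item_scores = [
    [-4.2, -7.9, -6.5],
    [-9.1, -3.3, -8.0],
    [-5.0, -4.8, -6.1],
    [-2.2, -2.9, -3.5],
]
correct_indices = [0, 1, 2, 0]

score = accuracy(item_scores, correct_indices)
print(score)  # 0.75
```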
Contributing
We welcome any questions, comments, or even PRs to this project to improve the package.
We use hatch to manage this project. For the most comfortable development experience, please first install hatch using pip or pipx.
Then, to propose a change to the library,
- test your code locally using hatch run all:test,
- format the code according to our formatting guidelines using hatch run lint:fmt,
- check type- and style-consistency using hatch run lint:all, and
- finally create a pull request describing the changes you propose.
For work on the documentation, use hatch run serve-docs to run a local documentation server.
File details
Details for the file lm_pub_quiz-0.3.2.tar.gz.
File metadata
- Download URL: lm_pub_quiz-0.3.2.tar.gz
- Upload date:
- Size: 1.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 902eae73f17f4362767a577f813d953635f7dcc56e9d614cfffcbd9f45b65909 |
| MD5 | 2d62099334f6074688aabdf77c422e3d |
| BLAKE2b-256 | a4d8ef0873c945a34ac88df61751b26ebdf2d2776fbe26fc735bc4cf5dff6290 |
Provenance
The following attestation bundles were made for lm_pub_quiz-0.3.2.tar.gz:
Publisher: publish.yml on lm-pub-quiz/lm-pub-quiz
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: lm_pub_quiz-0.3.2.tar.gz
  - Subject digest: 902eae73f17f4362767a577f813d953635f7dcc56e9d614cfffcbd9f45b65909
- Sigstore transparency entry: 197962136
- Sigstore integration time:
- Permalink: lm-pub-quiz/lm-pub-quiz@16332d5d145b5ab62699d4fb44e9d8a0539b6102
- Branch / Tag: refs/tags/v0.3.2
- Owner: https://github.com/lm-pub-quiz
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@16332d5d145b5ab62699d4fb44e9d8a0539b6102
- Trigger Event: release
File details
Details for the file lm_pub_quiz-0.3.2-py3-none-any.whl.
File metadata
- Download URL: lm_pub_quiz-0.3.2-py3-none-any.whl
- Upload date:
- Size: 40.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9912e55dad594aec7035bbcf4e56fbf1ef37466d0b949ac3ce1c5c9c1470df4d |
| MD5 | 25a733ed6e1c76f8cb2fc78e1ac26626 |
| BLAKE2b-256 | 66fcc55f49ba8448b5de5313a39afd6260094cdd828fcc4fbd36277fcb631e08 |
Provenance
The following attestation bundles were made for lm_pub_quiz-0.3.2-py3-none-any.whl:
Publisher: publish.yml on lm-pub-quiz/lm-pub-quiz
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: lm_pub_quiz-0.3.2-py3-none-any.whl
  - Subject digest: 9912e55dad594aec7035bbcf4e56fbf1ef37466d0b949ac3ce1c5c9c1470df4d
- Sigstore transparency entry: 197962141
- Sigstore integration time:
- Permalink: lm-pub-quiz/lm-pub-quiz@16332d5d145b5ab62699d4fb44e9d8a0539b6102
- Branch / Tag: refs/tags/v0.3.2
- Owner: https://github.com/lm-pub-quiz
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@16332d5d145b5ab62699d4fb44e9d8a0539b6102
- Trigger Event: release