Standard datasets for dotevals LLM evaluations
dotevals-datasets
Installation
pip install dotevals-datasets
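To confirm that the install succeeded, the standard library can report the installed version of a distribution. This is a minimal sketch: `installed_version` is a hypothetical helper name, not part of dotevals, and it relies only on `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("dotevals-datasets"))
```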
Usage
Once installed, the datasets are automatically available in dotevals through the `foreach` decorator:
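from dotevals import foreach
from PIL.Image import Image  # assumption: SROIE images are PIL images


# BFCL: function-calling questions with a schema and the expected calls
@foreach.bfcl("simple")
def eval_bfcl(question: str, schema: list, answer: list):
    # Your evaluation logic here
    pass


# GSM8K: grade school math problems, "test" split
@foreach.gsm8k("test")
def eval_gsm8k(question: str, reasoning: str, answer: str):
    # Your evaluation logic here
    pass


# HumanEval: code generation problems
@foreach.humaneval()
def eval_humaneval(prompt: str, canonical_solution: str, test: str, entry_point: str):
    # Your evaluation logic here
    pass


# MMLU, all subjects: each row also carries its subject
@foreach.mmlu("test")
def eval_mmlu_all(question: str, subject: str, choices: list, answer: int):
    # Your evaluation logic here
    pass


# MMLU, single subject: no subject column
@foreach.mmlu["college_mathematics"]("test")
def eval_mmlu_math(question: str, choices: list, answer: int):
    # Your evaluation logic here
    pass


# SROIE: scanned receipt images with their extracted entities
@foreach.sroie("test")
def eval_sroie(image: Image, entities: dict):
    # Your evaluation logic here
    pass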
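The `# Your evaluation logic here` placeholders above are left to the user. As one possible illustration for GSM8K, the sketch below compares the last number in a model's output with the reference answer. It assumes the `answer` column holds the final numeric result as a string; `last_number` and `gsm8k_match` are hypothetical helpers and not part of dotevals.

```python
import re
from decimal import Decimal
from typing import Optional

# A number with optional sign, thousands separators and decimal part.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def last_number(text: str) -> Optional[Decimal]:
    """Return the last number found in text, or None if there is none."""
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    # Drop thousands separators before parsing.
    return Decimal(matches[-1].replace(",", ""))


def gsm8k_match(model_output: str, answer: str) -> bool:
    """True when the last number in the output equals the reference answer.

    Assumption: `answer` contains the final numeric result as text.
    """
    predicted = last_number(model_output)
    expected = last_number(answer)
    return predicted is not None and predicted == expected
```

Comparing `Decimal` values rather than strings means `1,234.00` and `1234` count as the same answer.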
Available Datasets
- BFCL (Berkeley Function Calling Leaderboard): Tests function calling capabilities
  - Variants: `simple`, `multiple`, `parallel`
  - Columns: `question`, `schema`, `answer`
- GSM8K: Grade school math word problems
  - Splits: `train`, `test`
  - Columns: `question`, `reasoning`, `answer`
- HumanEval: Hand-written programming problems for code generation evaluation
  - Columns: `prompt`, `canonical_solution`, `test`, `entry_point`
- MMLU: Massive Multitask Language Understanding across 57 academic subjects
  - All subjects: `mmlu("test")`
    - Columns: `question`, `subject`, `choices`, `answer`
  - Specific subject: `mmlu["college_mathematics"]("test")`
    - Columns: `question`, `choices`, `answer`
  - Splits: `test`, `validation`, `dev`
- SROIE: Scanned receipts OCR and information extraction
  - Splits: `train`, `test`
  - Columns: `image`, `address`, `company`, `date`
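As an illustration of how the MMLU columns might be consumed, the sketch below builds a lettered multiple-choice prompt from `question` and `choices` and maps a one-letter reply back to an index that can be compared with `answer`. It assumes `answer` is a zero-based index into `choices`, which the column list does not state; `format_mmlu_prompt` and `letter_to_index` are hypothetical helpers, not part of dotevals.

```python
import string
from typing import Optional, Sequence


def format_mmlu_prompt(question: str, choices: Sequence[str]) -> str:
    """Render a question and its choices as a lettered multiple-choice prompt."""
    lines = [question]
    for letter, choice in zip(string.ascii_uppercase, choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)


def letter_to_index(reply: str, num_choices: int) -> Optional[int]:
    """Map a reply that starts with a choice letter to a zero-based index.

    Returns None when the reply is empty or the letter is out of range.
    Assumption: the dataset's `answer` column uses the same zero-based index.
    """
    reply = reply.strip().upper()
    if not reply:
        return None
    index = ord(reply[0]) - ord("A")
    return index if 0 <= index < num_choices else None


prompt = format_mmlu_prompt("What is 2 + 2?", ["3", "4", "5", "6"])
print(prompt)
```

Inside an evaluation function, a reply would then count as correct when `letter_to_index(reply, len(choices)) == answer`.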
File details
Details for the file dotevals_datasets-0.8.0.tar.gz.
File metadata
- Download URL: dotevals_datasets-0.8.0.tar.gz
- Upload date:
- Size: 15.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 74b419258b82de9e1bf48aad2f7e2b8ac98e1fd753a1ee620d522d73cfe724f0 |
| MD5 | 173f8728f89c59c9330190956c020552 |
| BLAKE2b-256 | 83ff7a891484a0a95f18ce53278c554cc52eab4c85c6b103f7a8dcc6666f6c86 |
File details
Details for the file dotevals_datasets-0.8.0-py3-none-any.whl.
File metadata
- Download URL: dotevals_datasets-0.8.0-py3-none-any.whl
- Upload date:
- Size: 9.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6e075a76d419a76d786f5b3d5b59a6d7e23a29a92e6ae01d6877dbeb8f712c33 |
| MD5 | ef2347379a62ec495b3f516df829b438 |
| BLAKE2b-256 | 5c0bf0ebac6632e00223a1229a0c28894f6cc9b47acc2aeae4823871d3c82b5d |
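A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a minimal sketch: `sha256_of` and `matches_sha256` are hypothetical helper names, and the path passed in is assumed to point at a file you downloaded yourself.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of(path: Union[str, Path], chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: Union[str, Path], expected_hex: str) -> bool:
    """True when the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example call, assuming the sdist sits in the current directory:
# matches_sha256(
#     "dotevals_datasets-0.8.0.tar.gz",
#     "74b419258b82de9e1bf48aad2f7e2b8ac98e1fd753a1ee620d522d73cfe724f0",
# )
```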