CLI for the GELATO Dataset for Legislative NER
The GELATO Dataset for Legislative NER
This repository contains the code, data, and scores for The GELATO Dataset for Legislative NER (LREC 2026).
Original Paper
The preprint of the original paper is available on arXiv:
The GELATO Dataset for Legislative NER
CLI
The core of the project is a CLI to make it easy to run experiments on the GELATO dataset.
Installation
This project uses uv to manage the environment and internal dependencies.
With uv installed, run uv sync in the project root to create a .venv managed
by uv. Then, run:
uv run gelato --help
to see the available commands.
Optionally, install the CLI as a tool on your $PATH via:
uv tool install .
and simply run
gelato --help
from anywhere to access the CLI.
Commands
The CLI has a variety of commands to facilitate working with gelato.
For help, run
uv run gelato --help
Usage: gelato [OPTIONS] COMMAND [ARGS]...
Options:
--install-completion Install completion for the current shell.
--show-completion Show completion for the current shell, to copy it or
customize the installation.
--help Show this message and exit.
Commands:
prompt-optimize Use DSPy to optimize level two type prompts for a level one type
predict Load a DSPy-optimized program to predict level two labels
from CoNLL-formatted level one predictions
fine-tune Fine-tune a HuggingFace Transformer using `wandb`
train-model Train the desired model with the provided parameters
score Score a model on the datset at the provided path
align Align predictions with tokens if the tokenizer aggregation
pipeline fails. Applies first label wins strategy for
aggregation of text and labels. Useful as non-word-based
tokenizers sometimes struggle to rebuild and aggregate certain
words.
confusion Generate confusion matrices from CoNLL-formatted predictions
and their reference counterpart
prompt-optimize
The prompt-optimize command simplifies using DSPy to optimize level two type
prompts for each level one type prediction.
uv run gelato prompt-optimize --help
Usage: gelato prompt-optimize [OPTIONS] TRAIN_PATH DEV_PATH MODEL
Use DSPy to optimize level two type prompts for a level one type
Arguments:
TRAIN_PATH Path to CoNLL-formatted train dataset [required]
DEV_PATH Path to CoNLL-formatted test dataset [required]
MODEL LLM to prompt as a HuggingFace ID e.g. 'Qwen/Qwen3-32B'
[required]
Options:
--level-one-type [Abstraction|Act|Class|Document|Organization|Person]
Level one type to fine-tune a prompt for its
level two types [required]
--module [ChainOfThought|Predict]
What dspy.Module to use [required]
--optimizer [BetterTogether|BootstrapFewShot|BootstrapFewShotWithRandomSearch|
BootstrapFinetune|BootstrapRS|COPRO|Ensemble|InferRules|
KNNFewShot|LabeledFewShot|MIPROv2|SIMBA]
What dspy.Optimizer [required]
--window INTEGER The left-right context window to provide the
LLM for each mention [default: 50]
--base-url TEXT URL endpoint for an OpenAI-compatible LLM
chat server e.g. 'http://localhost:8000/v1'
[default: http://localhost:8000/v1]
--api-key TEXT API key for OpenAI LLM endpoint. Defaults to
'LOCAL' for self-hosted models that do not
require authentication. [default: LOCAL]
--k INTEGER 'k' to use when generating kNN if
'KNNFewShot' is the Optimizer [default: 10]
--help Show this message and exit.
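The `--window` option sets how much left and right context the LLM receives for each mention. The sketch below illustrates that idea only. It assumes the window is counted in tokens (the help text does not state the unit), and `mention_context` is a hypothetical helper, not a function from this package.

```python
def mention_context(tokens, start, end, window=50):
    """Return (left, mention, right) token lists around tokens[start:end].

    Assumption: the window is measured in tokens on each side of the mention.
    """
    left = tokens[max(0, start - window):start]   # clipped at the sentence start
    mention = tokens[start:end]
    right = tokens[end:end + window]              # slicing clips at the end
    return left, mention, right


tokens = "The Secretary of State shall publish the report annually".split()
left, mention, right = mention_context(tokens, start=1, end=4, window=2)
print(left, mention, right)
```

A smaller window keeps prompts short, while a larger one gives the model more surrounding text to disambiguate the level two type.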
predict
Load a DSPy-optimized program to predict level two labels from CoNLL-formatted level one predictions.
uv run gelato predict --help
Usage: gelato predict [OPTIONS] TEST_PATH MODEL
Load a DSPy-optimized program to predict level two labels from CoNLL-
formatted level one predictions
Arguments:
TEST_PATH Path to CoNLL-formatted test dataset [required]
MODEL LLM to prompt as a HuggingFace ID
e.g. 'Qwen/Qwen3-32B' [required]
Options:
--abstraction-path PATH Path to optimized Abstraction program [required]
--act-path PATH Path to optimized Act program [required]
--class-path PATH Path to optimized Class program [required]
--document-path PATH Path to optimized Document program [required]
--organization-path PATH Path to optimized Organization program
[required]
--person-path PATH Path to optimized Person program [required]
--output-path PATH Output path for serialized predictions
[required]
--window INTEGER The left-right context window to provide the LLM
for each mention [default: 50]
--base-url TEXT URL endpoint for an OpenAI-compatible LLM chat
server e.g. 'http://localhost:8000/v1'
[default: http://localhost:8000/v1]
--api-key TEXT API key for OpenAI LLM endpoint. Defaults to
'LOCAL' for self-hosted models that do not require
authentication. [default: LOCAL]
--help Show this message and exit.
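Because `predict` takes one optimized program per level one type, each mention presumably has to be routed to the program that matches its level one label. A minimal sketch of that routing follows. The BIO-style label prefix, the `program_for` helper, and the file names under `optimizers/` are all assumptions made for illustration.

```python
LEVEL_ONE_TYPES = (
    "Abstraction", "Act", "Class", "Document", "Organization", "Person",
)


def program_for(label, program_paths):
    """Pick the optimized program path for a level one label.

    Assumption: labels look like 'B-Person' or 'I-Person'; the prefix is dropped.
    """
    level_one = label.split("-", 1)[-1]
    if level_one not in program_paths:
        raise KeyError(f"no optimized program for level one type {level_one!r}")
    return program_paths[level_one]


# Hypothetical file names; the real ones are whatever you pass to the CLI.
paths = {name: f"optimizers/{name.lower()}.json" for name in LEVEL_ONE_TYPES}
print(program_for("B-Person", paths))
```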
fine-tune
The fine-tune command simplifies fine-tuning a HuggingFace Transformer
using wandb.
uv run gelato fine-tune --help
Usage: gelato fine-tune [OPTIONS] TRAIN_PATH TEST_PATH MODEL
Fine-tune a HuggingFace Transformer using `wandb`
Arguments:
TRAIN_PATH Path to CoNLL-formatted train dataset [required]
TEST_PATH Path to CoNLL-formatted test dataset [required]
MODEL Model to fine-tune as a HuggingFace ID e.g. 'FacebookAI/xlm-
roberta-base'. Assumes model is compatible with HuggingFace
transformers. [required]
Options:
--output-dir PATH output directory for wandb logs [required]
--wandb-project TEXT Name of wandb project to track sweeps e.g. 'gelato'
[default: gelato]
--sweeps INTEGER RANGE Number of wandb sweeps to perform
[default: 1; 1<=x<=64]
--help Show this message and exit.
train-model
Train the desired HuggingFace-compatible transformer model with the provided parameters.
uv run gelato train-model --help
Usage: gelato train-model [OPTIONS] MODEL_ID
Train the desired model with the provided parameters.
Arguments:
MODEL_ID The HuggingFace model id of the model to train
e.g.'google-bert/bert-base-cased' [required]
Options:
--train-path TEXT The path to the training dataset e.g.
'data/train.conll' [required]
--dev-path TEXT The path to the dev dataset e.g. 'data/dev.conll'
[required]
--learning-rate FLOAT Learning rate of the model e.g. '0.003' [required]
--batch-size INTEGER Learning and eval batch size e.g. '16' [required]
--epochs INTEGER Number of training epochs e.g. '42' [required]
--weight-decay FLOAT Training weight decay e.g. '0.3' [required]
--warmup-ratio FLOAT Training warmup ratio e.g. '0.1' [required]
--output-dir TEXT output directory for wandb logs [required]
--help Show this message and exit.
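Since every `train-model` option is required, it can help to assemble the invocation programmatically when scripting several runs. The sketch below only builds the argument list from the flags documented above. The example values are the ones quoted in the help text, `runs/bert-base-cased` is a hypothetical output directory, and `train_model_argv` is a hypothetical helper.

```python
import shlex


def train_model_argv(model_id, *, train_path, dev_path, learning_rate,
                     batch_size, epochs, weight_decay, warmup_ratio, output_dir):
    """Build the argv for `gelato train-model` from the documented options."""
    return [
        "uv", "run", "gelato", "train-model", model_id,
        "--train-path", train_path,
        "--dev-path", dev_path,
        "--learning-rate", str(learning_rate),
        "--batch-size", str(batch_size),
        "--epochs", str(epochs),
        "--weight-decay", str(weight_decay),
        "--warmup-ratio", str(warmup_ratio),
        "--output-dir", output_dir,
    ]


argv = train_model_argv(
    "google-bert/bert-base-cased",
    train_path="data/train.conll",
    dev_path="data/dev.conll",
    learning_rate=0.003,
    batch_size=16,
    epochs=42,
    weight_decay=0.3,
    warmup_ratio=0.1,
    output_dir="runs/bert-base-cased",  # hypothetical path
)
print(shlex.join(argv))  # pass argv to subprocess.run(argv, check=True) to execute
```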
score
Score a model on the dataset at the provided path.
uv run gelato score --help
Usage: gelato score [OPTIONS] DATASET_PATH MODEL
Score a model on the datset at the provided path
Arguments:
DATASET_PATH Path to CoNLL-formatted dataset to evaluate [required]
MODEL Model to test as a HuggingFace ID e.g.
'Wollaston/gelato-roberta-large' [required]
Options:
--help Show this message and exit.
align
Align predictions with tokens if the tokenizer aggregation pipeline fails. The command applies a first-label-wins strategy when aggregating text and labels, which is useful because non-word-based tokenizers sometimes struggle to rebuild and aggregate certain words.
uv run gelato align --help
Usage: gelato align [OPTIONS] PREDICTIONS_PATH REFERENCE_PATH
Align predictions with tokens if the tokenizer aggregation pipeline fails.
Applies first label wins strategy for aggregation of text and labels. Useful
as non-word-based tokenizers sometimes struggle to rebuild and aggregate
certain words.
Arguments:
PREDICTIONS_PATH Path to CoNLL-formatted predictions to align [required]
REFERENCE_PATH Path to CoNLL-formatted reference data to align tokens to
[required]
Options:
--help Show this message and exit.
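To make the first-label-wins strategy concrete, here is a minimal sketch of merging subword pieces back into words while keeping the label of each word's first piece. It is not the implementation behind `align`, which works on prediction and reference files. It assumes WordPiece-style `##` continuation markers and a `word_ids` list like the one HuggingFace fast tokenizers provide; `first_label_wins` is a hypothetical name.

```python
def first_label_wins(pieces, labels, word_ids):
    """Merge subword pieces into words, keeping each word's first label.

    Assumptions: `word_ids` maps each piece to a word index (None for special
    tokens) and continuation pieces carry a WordPiece-style '##' prefix.
    """
    words, word_labels = [], []
    previous = None
    for piece, label, word_id in zip(pieces, labels, word_ids):
        if word_id is None:          # skip special tokens such as [CLS] / [SEP]
            continue
        if word_id != previous:      # first piece of a new word: its label wins
            words.append(piece)
            word_labels.append(label)
        else:                        # continuation: extend text, ignore label
            words[-1] += piece.removeprefix("##")
        previous = word_id
    return words, word_labels


pieces = ["[CLS]", "Sen", "##ator", "Smith", "[SEP]"]
labels = ["O", "B-Person", "I-Person", "I-Person", "O"]
word_ids = [None, 0, 0, 1, None]
words, word_labels = first_label_wins(pieces, labels, word_ids)
print(list(zip(words, word_labels)))
```

Taking the first piece's label avoids having to reconcile conflicting labels within one word, at the cost of discarding whatever the model predicted for later pieces.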
confusion
Generate confusion matrices from CoNLL-formatted predictions and their reference counterpart.
uv run gelato confusion --help
Usage: gelato confusion [OPTIONS] PREDICTIONS REFERENCES OUTPUT_PATH
Generate confusion matrices from CoNLL-formatted predictions and their
reference counterpart
Arguments:
PREDICTIONS Path to CoNLL-formatted predictions [required]
REFERENCES Path to CoNLL-formatted references [required]
OUTPUT_PATH Path to save generated confusion matrix [required]
Options:
--help Show this message and exit.
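As a rough illustration of what a confusion matrix over aligned label sequences contains, the sketch below counts (reference, prediction) pairs token by token. Whether the `confusion` command counts tokens or whole entities is not stated here, so the token-level counting, the BIO-style labels, and both function names are assumptions.

```python
from collections import Counter


def confusion_counts(references, predictions):
    """Count (reference, prediction) label pairs, one pair per token."""
    if len(references) != len(predictions):
        raise ValueError("references and predictions must be the same length")
    return Counter(zip(references, predictions))


def as_matrix(counts):
    """Return (labels, rows); rows[i][j] counts reference i predicted as j."""
    labels = sorted({label for pair in counts for label in pair})
    rows = [[counts[(ref, pred)] for pred in labels] for ref in labels]
    return labels, rows


references = ["B-Act", "I-Act", "O", "B-Person"]
predictions = ["B-Act", "O", "O", "B-Person"]
labels, rows = as_matrix(confusion_counts(references, predictions))
for label, row in zip(labels, rows):
    print(f"{label:10s}", row)
```

Off-diagonal cells show which labels are mistaken for which, which is why the predictions must first be aligned to the same tokens as the references.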
Checkpoints
We released our gelato checkpoints on HuggingFace:
Data
All gelato data, including level one and two splits, as well as original annotation data,
can be found in the data/ folder.
We have also uploaded our data to HuggingFace. The level one and level two datasets are organized as subsets on HuggingFace, and each subset has its train, dev, and test splits.
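For working with the CoNLL-formatted files outside the CLI, a small reader is often enough. The sketch below assumes the common layout of one token per line with the label in the last whitespace-separated column and a blank line between sentences. Check the files in `data/` before relying on it, since the exact column layout is not described here, and `read_conll` is a hypothetical helper.

```python
def read_conll(lines):
    """Parse CoNLL-style lines into sentences of (token, label) pairs.

    Assumptions: token in the first column, label in the last column,
    blank lines separate sentences.
    """
    sentences, current = [], []
    for line in lines:
        line = line.strip()
        if not line:                  # blank line closes the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        current.append((columns[0], columns[-1]))
    if current:                       # file may not end with a blank line
        sentences.append(current)
    return sentences


sample = "The O\nSecretary B-Person\n\nCongress B-Organization\n"
sentences = read_conll(sample.splitlines())
print(sentences)
```

To read a real file, pass an open file handle, for example `read_conll(open(path, encoding="utf-8"))`.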
Optimizers
The final DSPy optimizers can be found in the optimizers/ folder.
Scores
The CoNLL-formatted files for our reported scores can be found in the scores/ folder.
Citing GELATO
If you use our work in your research, please cite it:
@misc{flynn2026gelatodatasetlegislativener,
title={The GELATO Dataset for Legislative NER},
author={Matthew Flynn and Timothy Obiso and Sam Newman},
year={2026},
eprint={2603.14130},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2603.14130},
}