CLI suite for benchmarking topic models
Project description
topic-benchmark
Command Line Interface for benchmarking topic models.
The package contains catalogue registries for all models, datasets, and metrics used for model evaluation, along with scripts for producing the tables and figures in the $S^3$ paper.
Usage
Installation
You can install the package from PyPI.
pip install topic-benchmark
Commands
run
Run the benchmark. Defaults to running all models with the benchmark used in Kardos et al. (2024).
python3 -m topic_benchmark run
Argument | Short Flag | Description | Type | Default |
---|---|---|---|---|
--out_dir OUT_DIR | -o | Output directory for the results. | str | results/ |
--encoders ENCODERS | -e | Encoders to use for conducting runs. | str | None |
--models MODELS | -m | Subset of models to run the benchmark on. | Optional[list[str]] | None |
--datasets DATASETS | -d | Datasets to evaluate the models on. | Optional[list[str]] | None |
--metrics METRICS | -t | Metrics to evaluate the models with. | Optional[list[str]] | None |
--seeds SEEDS | -s | Random seeds to run the models with. | Optional[list[int]] | None |
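As an illustration only, a run restricted to a single encoder and seed could look like the following. The encoder name is a placeholder and must match an encoder registered in the package's catalogue; how multiple values are passed to the list-valued flags depends on the CLI's argument parser.
python3 -m topic_benchmark run -o results/ -e "all-MiniLM-L6-v2" -s 42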
push_to_hub
Push results to a HuggingFace repository.
python3 -m topic_benchmark push_to_hub "your_user/your_repo"
Argument | Description | Type | Default |
---|---|---|---|
hf_repo | HuggingFace repository to push results to. | str | N/A |
results_folder | Folder containing results for all embedding models. | str | results/ |
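For example, assuming results_folder is passed as the second positional argument (as the table above suggests), pushing results from a custom folder could look like this. The repository name and folder are placeholders:
python3 -m topic_benchmark push_to_hub "your_user/your_repo" my_results/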
Reproducing $S^3$ paper results
Result files for all runs in the $S^3$ publication can be found in the results/ folder of the repository.
To reproduce the results reported in our paper, first install this package:
pip install topic-benchmark
Then run the benchmark:
python3 -m topic_benchmark run -o results/
The results for each embedding model will be written to the results/ folder (unless a different value for --out_dir is explicitly passed).
To produce the figures and tables in the paper, you can use the scripts in the scripts/s3_paper/ folder.
Download files
File details
Details for the file topic_benchmark-0.6.0.tar.gz.
File metadata
- Download URL: topic_benchmark-0.6.0.tar.gz
- Upload date:
- Size: 16.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.11.5 Linux/5.15.0-124-generic
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 73e3b58f4b8925cfb279a92f1fd3b5cd7a3e150be558a434ec4c510f3de6adde |
MD5 | 64349294faac2c1d85e04746d78809c5 |
BLAKE2b-256 | 1aaac306464c4660319e0004e8679b11b53d4a0f62dbcbfda41f070cc954b55c |
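To check a downloaded archive against these digests, one option (a sketch for the source distribution, using sha256sum as available on most Linux systems; shasum -a 256 is the macOS equivalent) is:
pip download topic-benchmark==0.6.0 --no-deps --no-binary :all: -d .
sha256sum topic_benchmark-0.6.0.tar.gz
The printed digest should match the SHA256 value above.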
File details
Details for the file topic_benchmark-0.6.0-py3-none-any.whl.
File metadata
- Download URL: topic_benchmark-0.6.0-py3-none-any.whl
- Upload date:
- Size: 23.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.1 CPython/3.11.5 Linux/5.15.0-124-generic
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | dfc7884c3148b58c91e125ba021ca13abc83116bf07042c4dc22634fd1d99bc6 |
MD5 | c57917e97c00fcd32cc951ede7064930 |
BLAKE2b-256 | fdf0194187c269f17dc34552e76ca1c7621ded94bfa948c20c43ea9118f8a6f6 |