Benchmarking topic models for a paper
topic-benchmark
Just Benchmarking Topic Models :)
Todo:
- Run the benchmark with these models and upload the results:
  - all-MiniLM-L6-v2 ⌛
  - all-mpnet-base-v2 ⌛
  - sentence-transformers/average_word_embeddings_glove.6B.300d ⌛
  - intfloat/e5-large-v2 (or intfloat/multilingual-e5-large-instruct; to my knowledge they are the same size, but the latter performs way better on MTEB)
- Implement pretty printing and formatting of results as LaTeX and Markdown tables.
- (Maybe) implement speed tracking.
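The table-formatting item above could look something like this minimal sketch. The `to_markdown_table` / `to_latex_table` helpers and the `{model: {metric: score}}` results structure are hypothetical, not the package's actual API:

```python
def to_markdown_table(results: dict[str, dict[str, float]]) -> str:
    """Render {model: {metric: score}} as a Markdown table."""
    metrics = sorted({m for scores in results.values() for m in scores})
    header = "| Model | " + " | ".join(metrics) + " |"
    sep = "|" + "---|" * (len(metrics) + 1)
    rows = [
        "| " + model + " | "
        + " | ".join(f"{scores.get(m, float('nan')):.3f}" for m in metrics)
        + " |"
        for model, scores in results.items()
    ]
    return "\n".join([header, sep, *rows])


def to_latex_table(results: dict[str, dict[str, float]]) -> str:
    """Render the same structure as a LaTeX tabular."""
    metrics = sorted({m for scores in results.values() for m in scores})
    lines = [r"\begin{tabular}{l" + "r" * len(metrics) + "}",
             "Model & " + " & ".join(metrics) + r" \\"]
    for model, scores in results.items():
        cells = " & ".join(f"{scores.get(m, float('nan')):.3f}" for m in metrics)
        lines.append(f"{model} & {cells} " + r"\\")
    lines.append(r"\end{tabular}")
    return "\n".join(lines)
```

The same data drives both outputs, so the Markdown table for the README and the LaTeX table for the paper stay in sync.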
Usage:

```bash
pip install topic-benchmark
python3 -m topic_benchmark run -e "embedding_model_name"
```
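If you want to launch benchmark runs from a script (e.g. to loop over the models in the todo list), a hedged sketch that just wraps the documented CLI, using a hypothetical `build_benchmark_cmd` helper:

```python
import subprocess


def build_benchmark_cmd(embedding_model: str) -> list[str]:
    """Build the documented CLI invocation:
    python3 -m topic_benchmark run -e "embedding_model_name"
    """
    return ["python3", "-m", "topic_benchmark", "run", "-e", embedding_model]


cmd = build_benchmark_cmd("all-MiniLM-L6-v2")
# subprocess.run(cmd, check=True)  # uncomment once topic-benchmark is installed
```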
Download files
Source Distribution: topic_benchmark-0.2.0.tar.gz (9.1 kB)
Built Distribution: topic_benchmark-0.2.0-py3-none-any.whl
Hashes for topic_benchmark-0.2.0-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 7447673dbe73061b8190c61fa905bdbd2f598dbe124346d0702d012ee6b4290f |
| MD5 | c68c5849cad377076a4e979b0785a8ff |
| BLAKE2b-256 | 957367772e36c821e8da6ff79fdff9bc63972aabfdc2cd1873f610aeec4c54c2 |