Topic Modeling Evaluation

A toolkit to quickly evaluate topic model goodness over the number of topics

Metrics

The following coherence measures are supported:

  • 'u_mass' is the fastest method; 'c_uci' is also known as c_pmi.

  • For 'u_mass', a corpus should be provided; if texts are provided instead, they are converted to a corpus using the dictionary.

  • For 'c_v', 'c_uci', and 'c_npmi', texts should be provided (a corpus isn't needed).
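
For intuition, the 'c_npmi' measure averages normalized pointwise mutual information (NPMI) over pairs of a topic's top words. Below is a minimal, self-contained sketch of that idea only; it is not the toolkit's or gensim's actual implementation, and the document-co-occurrence estimate of the probabilities is an assumption:

```python
import math
from itertools import combinations

def npmi_coherence(top_words, documents):
    """Average NPMI over all pairs of a topic's top words,
    estimating probabilities from document co-occurrence counts."""
    n_docs = len(documents)
    doc_sets = [set(doc) for doc in documents]

    def p(*words):
        # fraction of documents containing all the given words
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = p(w1), p(w2), p(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # words never co-occur: lowest NPMI
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))  # normalize to [-1, 1]
    return sum(scores) / len(scores)

docs = [["fever", "cough"],
        ["fever", "cough", "fatigue"],
        ["fatigue", "headache"]]
# "fever" and "cough" always co-occur, so their NPMI is 1.0
print(npmi_coherence(["fever", "cough"], docs))
```

A coherence score near 1 means the topic's top words tend to appear in the same documents; scores near -1 mean they almost never do.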

Examples

Example 1: Estimate metrics for one topic model with a specific number of topics

from tm_eval import *
# load a pickled dictionary mapping each document key to its term list joined by ','
input_file = "datasets/covid19_symptoms.pickle"
output_folder = "outputs"
model_name = "symptom"
num_topics = 10
# run
results = evaluate_all_metrics_from_lda_model(input_file=input_file, 
                                              output_folder=output_folder,
                                              model_name=model_name, 
                                              num_topics=num_topics)
print(results)
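
The exact pickle schema is not documented here; based on the comment above, a plausible input file might be built as follows (the dict layout, keys, and file name are assumptions, not the toolkit's documented format):

```python
import pickle

# Hypothetical input: a dict mapping each document id to its
# terms joined by ',' (the exact schema is an assumption).
docs = {
    "patient_001": "fever,cough,fatigue",
    "patient_002": "headache,fever",
    "patient_003": "cough,sore throat,fatigue",
}

with open("covid19_symptoms.pickle", "wb") as f:
    pickle.dump(docs, f)

# Reading it back, each value splits into that document's term list.
with open("covid19_symptoms.pickle", "rb") as f:
    loaded = pickle.load(f)
print(loaded["patient_001"].split(","))  # -> ['fever', 'cough', 'fatigue']
```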

Example 2: Track how model goodness changes over the number of topics

from tm_eval import *
if __name__ == "__main__":
    # start configure
    # load a pickled dictionary mapping each document id to its term list joined by ','
    input_file = "datasets/covid19_symptoms.pickle"
    output_folder = "outputs"
    model_name = "symptom"
    start = 2
    end = 5
    # end configure
    # run and explore

    list_results = explore_topic_model_metrics(input_file=input_file, 
                                               output_folder=output_folder,
                                               model_name=model_name,
                                               start=start,
                                               end=end)
    # summarize results
    show_topic_model_metric_change(list_results, save=True,
                                   save_path=f"{output_folder}/metrics.csv")

    # plot metric changes
    plot_tm_metric_change(csv_path=f"{output_folder}/metrics.csv",
                          save=True, save_folder=output_folder)
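
Once metrics.csv exists, a simple way to choose the number of topics is to take the row with the highest c_v score. The sketch below uses only the standard library; the column names and values are assumptions for illustration, not the toolkit's documented output:

```python
import csv
import io

# Hypothetical contents of outputs/metrics.csv (column names assumed).
metrics_csv = """num_topics,c_v,u_mass,c_npmi,c_uci
2,0.41,-2.10,0.02,-1.50
3,0.48,-1.95,0.05,-1.20
4,0.52,-2.30,0.04,-1.35
5,0.47,-2.60,0.03,-1.60
"""

rows = list(csv.DictReader(io.StringIO(metrics_csv)))
# Higher c_v generally indicates more interpretable topics, so take its max.
best = max(rows, key=lambda r: float(r["c_v"]))
print(best["num_topics"])  # -> 4
```

In practice you would pass the real `{output_folder}/metrics.csv` path to `csv.DictReader` and inspect all four metrics, since they do not always agree on the best model.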

Output results

Plots of each coherence metric over the number of topics: c_v, u_mass, c_npmi, and c_uci.

License

The tm-eval toolkit is provided by Donghua Chen under the MIT License.
