CWordTM - Low-Code Topic Modeling Toolkit

A topic modeling toolkit for the Holy Scripture and other texts, usable from low-code to pro-code.

Installation

$ pip install cwordtm

Usage

cwordtm can be used for NLP pre-processing, text exploration (including Chinese text), text visualization (word clouds), and topic modeling (BERTopic, LDA and NMF). Import the submodules as follows:

from cwordtm import meta, util, ta, tm, viz, pivot, quot

version Submodule

Provides some version information.

import cwordtm
print(cwordtm.__version__)
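
If importing the whole package is not desirable (for example in an environment check), the installed version can usually also be read from the package metadata with the standard library. This is a minimal sketch; the helper name `installed_version` is made up for illustration and is not part of cwordtm.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version if cwordtm is installed, otherwise a short notice.
print(installed_version("cwordtm") or "cwordtm is not installed")
```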

meta Submodule

Provides functions to extract the source code of the cwordtm module and to add timing and code-showing features to all functions of the module.

print(meta.get_module_info())

print(meta.get_module_info(detailed=True))
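
This README does not describe how meta attaches the timing and code-showing features. One common way to get this kind of behaviour in Python is a decorator, sketched below. The names `with_timing_and_code` and `count_words` are hypothetical, and `code` is treated here as a simple on/off flag, which may differ from how cwordtm interprets values such as `code=1` or `code=2`.

```python
import functools
import inspect
import time


def with_timing_and_code(func):
    """Wrap func so callers may pass timing=... and code=... keyword flags."""

    @functools.wraps(func)
    def wrapper(*args, timing=False, code=False, **kwargs):
        if code:
            try:
                # Show the source of the wrapped function before running it.
                print(inspect.getsource(func))
            except (OSError, TypeError):
                print(f"# source of {func.__name__} is not available")
        start = time.perf_counter()
        result = func(*args, **kwargs)
        if timing:
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} finished in {elapsed:.4f} s")
        return result

    return wrapper


@with_timing_and_code
def count_words(text):
    """Toy function used only to demonstrate the decorator."""
    return len(text.split())


print(count_words("in the beginning", timing=True))
```

Because the wrapper consumes `timing` and `code` itself, the wrapped function keeps its original signature and return value.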

quot Submodule

Provides functions to extract the Old Testament (OT) source passages quoted in the prescribed New Testament (NT) Scripture.

cdf = util.load_word('cuv.csv')
crom8 = util.extract2(cdf, 'Rom 8')

quot.show_quot(crom8, lang='chi')

pivot Submodule

Provides a pivot table of the prescribed text.

cdf = util.load_word('cuv.csv')

pivot.stat(cdf, chi=True)
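
For readers unfamiliar with pivot tables, the idea can be sketched with the standard library alone: count rows grouped by two keys. The row layout below (`category`, `book`) and the helper `pivot_count` are assumptions made for illustration; the README does not state which columns `pivot.stat` uses or how it aggregates them.

```python
from collections import defaultdict


def pivot_count(rows, index, column):
    """Count rows for every (index value, column value) pair."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[index]][row[column]] += 1
    # Convert to plain dicts so the result prints and compares cleanly.
    return {key: dict(counts) for key, counts in table.items()}


# Hypothetical verse records; real data would come from a loaded text.
rows = [
    {"category": "nt", "book": "Rom"},
    {"category": "nt", "book": "Rom"},
    {"category": "nt", "book": "Gal"},
    {"category": "ot", "book": "Gen"},
]

print(pivot_count(rows, index="category", column="book"))
```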

ta Submodule

Provides text analytics functions, including summarization of the prescribed text.

cdf = util.load_word('cuv.csv')
crom8 = util.extract2(cdf, 'Rom 8')

ta.summary_chi(crom8)
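
The README does not say which summarization method `ta.summary_chi` uses. As a rough illustration of what extractive summarization means, the sketch below scores each sentence by the average frequency of its words and keeps the highest-scoring ones. The function `summarize` and the sample text are hypothetical, and the tokenizer only handles English; Chinese text would need a word segmenter first.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return [s for s in sentences if s in top]


sample = "The cat sat. The cat ran. A dog barked loudly."
print(summarize(sample, n_sentences=1))
```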

tm Submodule

Provides topic modeling functions, including LDA, NMF and BERTopic modeling.

lda = tm.lda_process("web.csv", eval=True, timing=True)

nmf = tm.nmf_process("web.csv", eval=True, code=1)

btm = tm.btm_process("cuv.csv", chi=True, cat='ot', eval=True)

btm = tm.btm_process("cuv.csv", chi=True, cat='nt', eval=True, code=2)
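
To give a feel for what NMF topic modeling does, here is a small pure-Python sketch of non-negative matrix factorization with the standard multiplicative update rules. It factors a document-term count matrix V into document-topic weights W and topic-term weights H. This is an illustration only, not how `tm.nmf_process` is implemented (the README does not describe its internals); the toy matrix, the term list and all function names are made up.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error between V and W x H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into W (docs x topics) and H (topics x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # Update H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h


# Hypothetical term counts for four short documents.
terms = ["grace", "faith", "law", "sin"]
v = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 3, 2], [0, 0, 2, 3]]

w, h = nmf(v, n_topics=2)
for k, row in enumerate(h):
    ranked = sorted(zip(terms, row), key=lambda pair: pair[1], reverse=True)
    print(f"topic {k}:", [term for term, _ in ranked[:2]])
```

In practice a library implementation would be used instead; the point of the sketch is only that each topic is a non-negative weighting over terms and each document a non-negative mix of topics.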

util Submodule

Provides functions for loading and preprocessing text.

df = util.load_word()
cdf = util.load_word('cuv.csv')

df.head()
cdf.head()

rom8 = util.extract2(df, 'Rom 8')
crom8 = util.extract2(cdf, 'Rom 8')
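
How `util.extract2` interprets a reference such as 'Rom 8' is not documented here. The sketch below shows one plausible way to split a "book chapter" reference and filter records with it. The helpers `parse_reference` and `extract_chapter`, and the `book`/`chapter`/`verse` keys, are assumptions for illustration and may not match the columns of the loaded DataFrame.

```python
import re


def parse_reference(ref):
    """Split a reference such as 'Rom 8' or '1 Cor 13' into (book, chapter)."""
    match = re.fullmatch(r"\s*([1-3]?\s?[A-Za-z]+)\s+(\d+)\s*", ref)
    if match is None:
        raise ValueError(f"unrecognised reference: {ref!r}")
    return match.group(1), int(match.group(2))


def extract_chapter(rows, ref):
    """Keep only the records that belong to the referenced book and chapter."""
    book, chapter = parse_reference(ref)
    return [r for r in rows if r["book"] == book and r["chapter"] == chapter]


# Hypothetical records with placeholder text.
rows = [
    {"book": "Rom", "chapter": 8, "verse": 1, "text": "(verse text)"},
    {"book": "Rom", "chapter": 8, "verse": 2, "text": "(verse text)"},
    {"book": "Rom", "chapter": 9, "verse": 1, "text": "(verse text)"},
]

print(len(extract_chapter(rows, "Rom 8")))
```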

viz Submodule

Provides word cloud plotting for the prescribed text.

cdf = util.load_word('cuv.csv')

viz.chi_wordcloud(cdf)

Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

cwordtm was created by Johnny Cheng. It is licensed under the terms of the MIT license.

Credits

cwordtm was created under the guidance of Jehovah, the Lord.
