A thesaurus library implemented with a self-organizing map (SOM)

Project description

Thesaurus Visualization

Currently supported languages:

  • English (eng)
  • Russian (rus)

How to run

Minimal usage

Install the library:

pip install thesaurus-lib

Create an object and specify the language:

obj = Thesaurus(lang='eng')

Show output:

obj.show_map()

Run with your own foregrounds:

After installing the library and creating the object, do the following:

  1. Pass your foregrounds to the library:
text1 = obj.read_pickle('2017')
text2 = obj.read_txt('shakespeare.txt')
text3 = obj.read_text('My foreground in string format')
  2. Preprocess your foreground:
texts = dict()
foreground_name = 'Physics articles 2017'
texts[foreground_name] = obj.custom_preprocessing_of_data(text1)
  3. Process foregrounds:
# foreground_names was left undefined in the original example;
# a list of the keys of texts is assumed here
foreground_names = list(texts.keys())
processed_foregrounds = obj.process_foreground(foreground_names, texts)
  4. Show output:
obj.show_map()
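
When there are several foregrounds, the preprocessing step can be wrapped in a small helper. The sketch below is not part of the library: build_foregrounds is a hypothetical name, and it takes any preprocessing callable so that it runs without the library installed. It assumes that process_foreground expects a list of names matching the keys of texts, which the example above suggests but does not confirm.

```python
def build_foregrounds(raw_by_name, preprocess):
    """Apply `preprocess` to every raw foreground and collect the results by name.

    raw_by_name: dict mapping a foreground name to its raw data.
    preprocess:  any callable; with the library this would presumably be
                 obj.custom_preprocessing_of_data (an assumption).
    """
    texts = {}
    for name, raw in raw_by_name.items():
        texts[name] = preprocess(raw)
    # Names are kept in insertion order so they line up with the keys of texts.
    foreground_names = list(texts)
    return foreground_names, texts


# Stand-in preprocessing (lowercasing) so the sketch runs on its own.
names, texts = build_foregrounds(
    {"Physics articles 2017": "Raw TEXT", "Shakespeare": "To BE"},
    str.lower,
)
```

With the library, names and texts would then be passed to obj.process_foreground(names, texts), under the same assumption about its arguments.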

Use your own configurations

After installing the library, create a file called 'config.cfg' in your working directory and fill in the values with the paths to your own files:

[paths]
som_path =
index_path =
back_tokens_path =
back_embeds_path =
stopwords_path =
foregrounds_path =

[lang]
som_url =
embeds_url =
som_file =
index_file =
back_tokens =
back_embeds =
embeddings_file =
STOPWORDS_FILE =
model =

Note: Do not leave any field empty in config.cfg. For example, if you are not providing a som_file, delete that line from config.cfg rather than leaving it like this:

# fill it or delete it
som_file =
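
A quick way to catch empty fields before running the library is to parse the file with Python's standard configparser module. This is a minimal sketch, not part of the library: find_empty_fields is a hypothetical helper, and the sample values are made up for illustration.

```python
import configparser


def find_empty_fields(config_text):
    """Return (section, key) pairs whose value is empty or missing."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.optionxform = str  # keep key case, e.g. STOPWORDS_FILE
    parser.read_string(config_text)
    empty = []
    for section in parser.sections():
        for key, value in parser.items(section):
            if value is None or not value.strip():
                empty.append((section, key))
    return empty


# Hypothetical config contents, for illustration only.
sample = """
[paths]
som_path = /data/som.pkl
index_path =

[lang]
som_file =
model = my-model
"""

empty_fields = find_empty_fields(sample)
```

To check a real file, read it first, for example with open('config.cfg').read(), and pass the text to find_empty_fields. Any pair it returns should be either filled in or deleted from the file.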

