Implemented thesaurus library using SOM
Project description
Thesaurus Visualization
Currently supported languages are:
- English: eng
- Russian: rus
How to run
Minimal example
Install the library:
pip install thesaurus-lib
Create an object and specify the language:
obj = Thesaurus(lang='eng')
Show output:
obj.show_map()
Run with your own foregrounds:
After installing the library and creating the object, do the following:
- Load your foregrounds into the library:
text1 = obj.read_pickle('2017')
text2 = obj.read_txt('shakespeare.txt')
text3 = obj.read_text('My foreground in string format')
- Preprocess your foreground:
texts = dict()
foreground_name = 'Physics articles 2017'
texts[foreground_name] = obj.custom_preprocessing_of_data(text1)
- Process foregrounds:
foreground_names = list(texts)  # assumed: the names (keys of texts) of the foregrounds to process
processed_foregrounds = obj.process_foreground(foreground_names, texts)
- Show output:
obj.show_map()
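When there are several foregrounds, the preprocessing step above can be wrapped in a small helper. The following is only a sketch: `prepare_foregrounds` is a hypothetical name that is not part of the library, and it assumes that `process_foreground` expects a list of names together with the dictionary of preprocessed texts.

```python
def prepare_foregrounds(obj, raw_texts):
    """Preprocess every raw foreground and return (names, texts).

    obj       -- an object exposing custom_preprocessing_of_data(), such as
                 the Thesaurus object created earlier
    raw_texts -- dict mapping a foreground name to its raw text
    """
    texts = {}
    for name, raw in raw_texts.items():
        texts[name] = obj.custom_preprocessing_of_data(raw)
    # The names are simply the dictionary keys, in insertion order.
    return list(texts), texts


# Possible usage (assumes text1 and text2 were loaded as shown above):
# names, texts = prepare_foregrounds(
#     obj, {'Physics articles 2017': text1, 'Shakespeare': text2})
# processed_foregrounds = obj.process_foreground(names, texts)
# obj.show_map()
```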
Use your own configurations
After installing the library, create a file called 'config.cfg' in your working directory and fill in the values with your own files:
[paths]
som_path =
index_path =
back_tokens_path =
back_embeds_path =
stopwords_path =
foregrounds_path =
[lang]
som_url =
embeds_url =
som_file =
index_file =
back_tokens =
back_embeds =
embeddings_file =
STOPWORDS_FILE =
model =
Note: Don't leave any field empty in config.cfg. For example, if you aren't providing a som_file, delete that line from config.cfg instead of leaving it like this:
# fill it or delete it
som_file =
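One way to catch empty fields before running the library is to scan config.cfg with Python's standard `configparser`. This is a minimal sketch, not something the library provides: `find_empty_fields` is a hypothetical helper, and it assumes config.cfg is a plain INI-style file like the one shown above.

```python
import configparser


def find_empty_fields(path="config.cfg"):
    """Return a list of (section, key) pairs whose value is empty."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case as written, e.g. STOPWORDS_FILE
    with open(path, encoding="utf-8") as fh:
        parser.read_file(fh)

    empty = []
    for section in parser.sections():
        # raw=True skips interpolation, so '%' characters in URLs are safe.
        for key, value in parser.items(section, raw=True):
            if not value.strip():
                empty.append((section, key))
    return empty
```

Any pair it returns points to a line that should be either filled in or deleted.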
Download files
File details
Details for the file thesaurus_lib-0.1.9.tar.gz.
File metadata
- Download URL: thesaurus_lib-0.1.9.tar.gz
- Upload date:
- Size: 2.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | deb3ac65200a29f8e582f6914010420c82baf314493ca6e2c2e3e8636923e0df |
| MD5 | f896c1dde8882108b97dc0d777be60bf |
| BLAKE2b-256 | 0d90a3d07fbacc98f50a10623f37f31d756d9059a3e4501e53ca224c667145e3 |
File details
Details for the file thesaurus_lib-0.1.9-py3-none-any.whl.
File metadata
- Download URL: thesaurus_lib-0.1.9-py3-none-any.whl
- Upload date:
- Size: 2.0 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 064f9ae3e9042ba0b211850821ab21fcfdb8c863a8b35b0fd41df5dd91486d67 |
| MD5 | 04b79cec441539cd2cd5cd65fef12bfe |
| BLAKE2b-256 | c87b92f58664517f82406992e2ff13dfa956a026831b5dc172de1d7fa8158555 |
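A downloaded file can be checked against the SHA256 digests listed above with Python's standard `hashlib`. This is a minimal sketch: the helper names `sha256_of` and `matches_published_hash` are illustrative, and the file path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()


# Possible usage (assumes the sdist was downloaded to the working directory):
# matches_published_hash(
#     "thesaurus_lib-0.1.9.tar.gz",
#     "deb3ac65200a29f8e582f6914010420c82baf314493ca6e2c2e3e8636923e0df")
```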