Generates word statistics
Monogatari
Installation
To install:
pip install monogatari
Quick start
General usage
DictCounter accepts any dictionary file that uses the following format:
%
Category_key_1 Category_value_1
Category_key_2 Category_value_2
Category_key_3 Category_value_3
Category_key_4 Category_value_4
%
Word_key_1 Category_key_1 Category_key_2
Word_key_2 Category_key_1
Word_key_3 Category_key_2
Word_key_4 Category_key_1 Category_key_3
Word_key_5 Category_key_1 Category_key_2 Category_key_4
Word_key_6 Category_key_4
Where:
- Each category key must be a number.
- Each word_key and each of its category keys must be separated by a tab.
- Matching is not case sensitive: uppercase and lowercase letters in word_keys are treated the same.
- Word_keys can contain whitespace (e.g. "kind of"), which is why the word_key and the category keys must be separated by a tab rather than a space.
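To make the format rules above concrete, here is a minimal sketch of a parser for such a file. It is not monogatari's own loader: the function name `parse_dictionary` and the sample data are hypothetical, and the sketch assumes the first `%` opens the category section and the second `%` opens the word section.

```python
from collections import defaultdict


def parse_dictionary(text):
    """Parse the '%'-delimited format into (categories, words).

    Hypothetical helper for illustration; not part of monogatari.
    """
    categories = {}            # category key -> category value
    words = defaultdict(list)  # lowercased word_key -> list of category keys
    percent_lines = 0          # number of '%' delimiter lines seen so far

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "%":
            percent_lines += 1
            continue
        # Fields are tab-separated, so a word_key may contain spaces.
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if percent_lines == 1 and len(fields) >= 2:
            # Category section: key, then value
            categories[fields[0]] = fields[1]
        elif percent_lines >= 2 and fields:
            # Word section: word_key, then one or more category keys.
            # Lowercasing mirrors the case-insensitive matching rule.
            words[fields[0].lower()].extend(fields[1:])
    return categories, dict(words)


# Invented sample data, with tabs written explicitly as \t
sample = "%\n1\taffect\n2\tsocial\n%\nkind of\t1\nFriend\t1\t2\n"
categories, words = parse_dictionary(sample)
print(categories)  # {'1': 'affect', '2': 'social'}
print(words)       # {'kind of': ['1'], 'friend': ['1', '2']}
```

Note how "kind of" survives as a single word_key only because the split is on tabs, not on whitespace in general.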
from monogatari import DictCounter
counter = DictCounter("dictionary_file.dic") # Load the dictionary
list_of_words = ["My", "list", "of", "words"]
counter.count(list_of_words) # This will count the words in each category
counter.top(100) # This will return a list of the top 100 categories and their count
# [("category_A", 12), ("category_B", 9), ...]
counter.top_normalized(100) # This will return a list of the top 100 categories and their normalized value
# [("category_A", 0.014), ("category_B", 0.008), ...]
counter.words_found() # This will return all categories and for each, a list of words belonging to that category
# {"category_A": ["I", "am", "here"], "category_B": ["something"], ...}
counter.reset() # This will reset the count, so that you can count again using another list of words
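As a rough illustration of what the counts and normalized values represent, the sketch below counts category hits for a list of words and divides each count by the total number of words. It is an assumption about the behaviour, not the library's implementation: the helper names and the `word_to_categories` mapping are hypothetical, and the normalization by total word count follows the description of `top_normalized` in this README.

```python
from collections import Counter


def count_categories(list_of_words, word_to_categories):
    """Count how many words fall into each category (case-insensitive)."""
    counts = Counter()
    for word in list_of_words:
        # Assumes dictionary keys are stored in lowercase.
        for category in word_to_categories.get(word.lower(), []):
            counts[category] += 1
    return counts


def top_normalized(counts, total_words, n):
    """Return the top n categories with count / total_words."""
    if total_words == 0:
        return []
    return [(cat, c / total_words) for cat, c in counts.most_common(n)]


# Invented data for illustration only
word_to_categories = {"friend": ["social"], "happy": ["affect"]}
tokens = ["My", "Friend", "and", "friend", "happy"]

counts = count_categories(tokens, word_to_categories)
ranked = top_normalized(counts, len(tokens), 2)
print(counts.most_common(2))  # [('social', 2), ('affect', 1)]
print(ranked)                 # [('social', 0.4), ('affect', 0.2)]
```

Dividing by the total number of words makes values comparable between texts of different lengths.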
Pre-loaded MFD dictionary
The following example uses a pre-loaded MFD dictionary (from https://www.moralfoundations.org/).
from monogatari import MFDCounter
counter = MFDCounter()
list_of_words = ['私', 'コーヒー', '好き', '友達'] # Japanese tokens meaning: I, coffee, like, friend
counter.count(list_of_words)
counter.top(100) # List top N categories, ordered by number of words
counter.top_normalized(100) # List top N categories, ordered by number of words normalized by the total number of words
Pre-loaded JMFD dictionary
The following example uses a pre-loaded JMFD dictionary (from https://github.com/soramame0518/j-mfd).
from monogatari import JMFDCounter
counter = JMFDCounter()
list_of_words = ['私', 'コーヒー', '好き', '友達'] # Japanese tokens meaning: I, coffee, like, friend
counter.count(list_of_words)
counter.top(100) # List top N categories, ordered by number of words
counter.top_normalized(100) # List top N categories, ordered by number of words normalized by the total number of words
LIWC example
The library is also compatible with the LIWC dictionary.
from monogatari import DictCounter
counter = DictCounter("liwc_dictionary_file.dic") # Load the dictionary
list_of_words = ["My", "list", "of", "words"]
counter.count(list_of_words) # This will count the words in each category
counter.top(100) # This will return a list of the top 100 categories and their count
# [("category_A", 12), ("category_B", 9), ...]
counter.top_normalized(100) # This will return a list of the top 100 categories and their normalized value
# [("category_A", 0.014), ("category_B", 0.008), ...]
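The examples in this README pass an already tokenized list of words to count(). A minimal sketch of building such a list from raw English text with only the standard library might look like the following. The `tokenize` helper is hypothetical and not part of monogatari, and this whitespace-and-punctuation approach is not suitable for Japanese text, which needs a morphological analyzer to split words.

```python
import re


def tokenize(text):
    """Split English text into word tokens, keeping simple contractions."""
    # \w+ matches a run of letters/digits; the optional group keeps "I'm", "don't".
    return re.findall(r"\w+(?:'\w+)?", text)


list_of_words = tokenize("I'm kind of happy, really!")
print(list_of_words)  # ["I'm", 'kind', 'of', 'happy', 'really']
```

The resulting list can then be passed to counter.count(list_of_words) as in the examples above.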