Generates word statistics
Monogatari
Installation
To install:
pip install monogatari
Quick start
General usage
DictCounter can load any dictionary file that follows this format:
%
Category_key_1 Category_value_1
Category_key_2 Category_value_2
Category_key_3 Category_value_3
Category_key_4 Category_value_4
%
Word_key_1 Category_key_1 Category_key_2
Word_key_2 Category_key_1
Word_key_3 Category_key_2
Word_key_4 Category_key_1 Category_key_3
Word_key_5 Category_key_1 Category_key_2 Category_key_4
Word_key_6 Category_key_4
Where:
- Each category key must be a number.
- Each word_key and its category keys must be separated by tabs.
- Matching is not case sensitive: uppercase and lowercase make no difference in word_keys.
- Word_keys can contain whitespace (e.g. "kind of"), which is why the word_key and its category_keys must be separated by a tab rather than a space.
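To make the format concrete, here is a minimal sketch of how such a file could be parsed. It is not monogatari's own loader: the function name `parse_dictionary` is hypothetical, and it assumes that category lines separate key and value with any whitespace while word lines use tabs.

```python
def parse_dictionary(text):
    """Parse the '%'-delimited format into (categories, words).

    Hypothetical helper for illustration only, not part of monogatari.
    """
    categories = {}  # category key -> category value (name)
    words = {}       # lowercased word key -> list of category keys
    section = 0      # number of '%' lines seen so far
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "%":
            section += 1
            continue
        if section == 1:
            # Assumption: key and value are separated by whitespace.
            key, value = line.split(None, 1)
            categories[key] = value.strip()
        elif section >= 2:
            # Word keys may contain spaces, so split on tabs only.
            word, *keys = line.split("\t")
            words[word.strip().lower()] = [k.strip() for k in keys if k.strip()]
    return categories, words


sample = "%\n1\taffect\n2\tsocial\n%\nhappy\t1\nkind of\t1\t2\nFriend\t2\n"
categories, words = parse_dictionary(sample)
print(categories)
print(words)
```

Splitting word lines on tabs only is what lets a multi-word key such as "kind of" survive intact, and lowercasing the key at load time is one simple way to get case-insensitive matching.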
from monogatari import DictCounter
counter = DictCounter("dictionary_file.dic") # Load the dictionary
list_of_words = ["My", "list", "of", "words"]
counter.count(list_of_words) # This will count the words in each category
counter.top(100) # This will return a list of the top 100 categories and their count
# [("category_A", 12), ("category_B", 9), ...]
counter.top_normalized(100) # This will return a list of the top 100 categories and their normalized value
# [("category_A", 0.014), ("category_B", 0.008), ...]
counter.words_found() # This will return all categories and for each, a list of words belonging to that category
# {"category_A": ["I", "am", "here"], "category_B": ["something"], ...}
counter.reset() # This will reset the count, so that you can count again using another list of words
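The counting step described above can be pictured with a small pure-Python stand-in. This is a sketch of the idea only, not the library's implementation: `WORD_CATEGORIES` and `count_categories` are hypothetical names, and the tiny in-memory dictionary replaces a real dictionary file.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory dictionary: lowercased word -> category names.
WORD_CATEGORIES = {
    "my": ["pronoun"],
    "list": ["language"],
    "words": ["language"],
}


def count_categories(list_of_words):
    """Count category hits and record which words matched each category."""
    counts = Counter()
    found = defaultdict(list)
    for word in list_of_words:
        # Lowercase before lookup so matching is not case sensitive.
        for category in WORD_CATEGORIES.get(word.lower(), []):
            counts[category] += 1
            found[category].append(word)
    return counts, dict(found)


counts, found = count_categories(["My", "list", "of", "words"])
top = counts.most_common(100)  # categories ordered by count, highest first
print(top)    # [('language', 2), ('pronoun', 1)]
print(found)  # {'pronoun': ['My'], 'language': ['list', 'words']}
```

Words that appear in no category, such as "of" here, simply contribute nothing to the counts.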
Pre-loaded MFD dictionary
The following example uses the pre-loaded Moral Foundations Dictionary (MFD), from https://www.moralfoundations.org/.
from monogatari import MFDCounter
counter = MFDCounter()
list_of_words = ['私', 'コーヒー', '好き', '友達']
counter.count(list_of_words)
counter.top(100) # List top N categories, ordered by number of words
counter.top_normalized(100) # List top N categories, ordered by number of words normalized by the total number of words
Pre-loaded JMFD dictionary
The following example uses the pre-loaded Japanese Moral Foundations Dictionary (JMFD), from https://github.com/soramame0518/j-mfd.
from monogatari import JMFDCounter
counter = JMFDCounter()
list_of_words = ['私', 'コーヒー', '好き', '友達']
counter.count(list_of_words)
counter.top(100) # List top N categories, ordered by number of words
counter.top_normalized(100) # List top N categories, ordered by number of words normalized by the total number of words
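The normalized ranking mentioned in the comments above divides each category count by the total number of words. The sketch below illustrates that arithmetic under one assumption: that "total number of words" means the length of the input list, including words that matched no category. The function `normalize` is hypothetical and is not part of monogatari.

```python
def normalize(counts, total_words, n=100):
    """Return the top n (category, count / total_words) pairs, highest first."""
    if total_words == 0:
        return []  # nothing was counted, so there is nothing to normalize
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(category, count / total_words) for category, count in ranked[:n]]


# Hypothetical counts for a 200-word text.
counts = {"fairness": 1, "care": 3}
normalized = normalize(counts, total_words=200)
print(normalized)  # [('care', 0.015), ('fairness', 0.005)]
```

Normalizing this way makes texts of different lengths comparable, since a longer text would otherwise score higher in every category.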
LIWC example
The library is also compatible with the LIWC dictionary.
from monogatari import DictCounter
counter = DictCounter("liwc_dictionary_file.dic") # Load the dictionary
list_of_words = ["My", "list", "of", "words"]
counter.count(list_of_words) # This will count the words in each category
counter.top(100) # This will return a list of the top 100 categories and their count
# [("category_A", 12), ("category_B", 9), ...]
counter.top_normalized(100) # This will return a list of the top 100 categories and their normalized value
# [("category_A", 0.014), ("category_B", 0.008), ...]