Generates word statistics
Monogatari
Installation
To install:
pip install monogatari
Quick start
General usage
DictCounter works with any dictionary file that follows this format:
%
Category_key_1 Category_value_1
Category_key_2 Category_value_2
Category_key_3 Category_value_3
Category_key_4 Category_value_4
%
Word_key_1 Category_key_1 Category_key_2
Word_key_2 Category_key_1
Word_key_3 Category_key_2
Word_key_4 Category_key_1 Category_key_3
Word_key_5 Category_key_1 Category_key_2 Category_key_4
Word_key_6 Category_key_4
Where:
- Each category key must be a number.
- A tab must separate each word_key from its category keys, and the category keys from one another.
- Matching is not case sensitive: uppercase and lowercase forms of a word_key are treated as the same.
- Word_keys can contain whitespace (e.g. "kind of"), which is why the word_key and the category keys must be separated by a tab rather than by spaces.
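The format rules above can be illustrated with a small stand-alone parser. This is only a sketch under stated assumptions, not monogatari's actual implementation: `parse_dictionary` is a hypothetical helper, and the category names and words in `sample` are made up for the example.

```python
from typing import Dict, List, Tuple


def parse_dictionary(text: str) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Parse dictionary text into (categories, words).

    Hypothetical helper for illustration only; not part of monogatari.
    categories maps a category key to its name; words maps a lowercased
    word key to the list of category names it belongs to.
    """
    categories: Dict[str, str] = {}
    words: Dict[str, List[str]] = {}
    section = 0  # 0 = before the first "%", 1 = category block, 2 = word block

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line == "%":
            section += 1
            continue
        if section == 1:
            # Category line: numeric key, whitespace, category name.
            key, name = line.split(None, 1)
            categories[key] = name.strip()
        elif section == 2:
            # Word line: split on tabs only, so a word key may contain spaces.
            word, *keys = line.split("\t")
            # Lowercase the word key so that matching is case-insensitive.
            words[word.strip().lower()] = [
                categories[k.strip()] for k in keys if k.strip()
            ]

    return categories, words


# Made-up sample data; note the explicit tab characters.
sample = (
    "%\n"
    "1\tpositive\n"
    "2\tsocial\n"
    "%\n"
    "Friend\t1\t2\n"
    "kind of\t1\n"
)

categories, words = parse_dictionary(sample)
print(categories)  # {'1': 'positive', '2': 'social'}
print(words)       # {'friend': ['positive', 'social'], 'kind of': ['positive']}
```

Splitting word lines on tabs only is what keeps a multi-word key such as "kind of" in one piece.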
from monogatari import DictCounter
counter = DictCounter("dictionary_file.dic") # Load the dictionary
list_of_words = ["My", "list", "of", "words"]
counter.count(list_of_words) # This will count the words in each category
counter.top(100) # Returns a list of the top 100 categories and their counts
# [("category_A", 12), ("category_B", 9), ...]
counter.top_normalized(100) # Returns a list of the top 100 categories and their normalized values
# [("category_A", 0.014), ("category_B", 0.008), ...]
counter.words_found() # This will return all categories and for each, a list of words belonging to that category
# {"category_A": ["I", "am", "here"], "category_B": ["something"], ...}
counter.reset() # This will reset the count, so that you can count again using another list of words
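To make the difference between raw and normalized counts concrete, here is a minimal sketch of how such values could be computed with the standard library. It is not the library's code: `count_categories` and `top_normalized` are hypothetical functions, the word-to-category mapping is made up, and it assumes that "normalized" means the category count divided by the total number of input words.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def count_categories(
    tokens: Iterable[str], words: Dict[str, List[str]]
) -> Tuple[Counter, int]:
    """Count category hits for a list of tokens (hypothetical helper)."""
    counts: Counter = Counter()
    total = 0
    for token in tokens:
        total += 1
        # Lowercase the token so that matching is case-insensitive.
        for category in words.get(token.lower(), []):
            counts[category] += 1
    return counts, total


def top_normalized(counts: Counter, total: int, n: int) -> List[Tuple[str, float]]:
    """Top n categories, each count divided by the total number of tokens.

    Assumption: normalization uses the length of the input list.
    """
    if total == 0:
        return []
    return [(category, count / total) for category, count in counts.most_common(n)]


# Made-up mapping from lowercased word keys to category names.
words = {"friend": ["positive", "social"], "like": ["positive"]}
tokens = ["My", "friend", "and", "I", "like", "coffee", "a", "lot"]

counts, total = count_categories(tokens, words)
print(counts.most_common(2))             # [('positive', 2), ('social', 1)]
print(top_normalized(counts, total, 2))  # [('positive', 0.25), ('social', 0.125)]
```

Normalized values make texts of different lengths comparable, since a longer word list will otherwise tend to produce larger raw counts.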
Pre-loaded MFD dictionary
The following example uses the pre-loaded MFD (Moral Foundations Dictionary, from https://www.moralfoundations.org/).
from monogatari import MFDCounter
counter = MFDCounter()
list_of_words = ['私', 'コーヒー', '好き', '友達']
counter.count(list_of_words)
counter.top(100) # List top N categories, ordered by number of words
counter.top_normalized(100) # List top N categories, ordered by number of words normalized by the total number of words
Pre-loaded JMFD dictionary
The following example uses the pre-loaded JMFD (the Japanese version of the Moral Foundations Dictionary, from https://github.com/soramame0518/j-mfd).
from monogatari import JMFDCounter
counter = JMFDCounter()
list_of_words = ['私', 'コーヒー', '好き', '友達']
counter.count(list_of_words)
counter.top(100) # List top N categories, ordered by number of words
counter.top_normalized(100) # List top N categories, ordered by number of words normalized by the total number of words
LIWC example
The library is also compatible with the LIWC dictionary.
from monogatari import DictCounter
counter = DictCounter("liwc_dictionary_file.dic") # Load the dictionary
list_of_words = ["My", "list", "of", "words"]
counter.count(list_of_words) # This will count the words in each category
counter.top(100) # Returns a list of the top 100 categories and their counts
# [("category_A", 12), ("category_B", 9), ...]
counter.top_normalized(100) # Returns a list of the top 100 categories and their normalized values
# [("category_A", 0.014), ("category_B", 0.008), ...]
File details
Details for the file monogatari-2.0.1.tar.gz.
File metadata
- Download URL: monogatari-2.0.1.tar.gz
- Upload date:
- Size: 11.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | b09f82f74497c4723a2a28199e9c4bc906cb5e74a277ed129507ca1f11afbac5
MD5 | 01cd04a5e9b497afddd966537cadf995
BLAKE2b-256 | 7318d917c2249017a1547e92ba071c5a00291c73ecdc38e57f01264f7379c031
File details
Details for the file monogatari-2.0.1-py3-none-any.whl.
File metadata
- Download URL: monogatari-2.0.1-py3-none-any.whl
- Upload date:
- Size: 12.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | f0c1d408de2ea75f23022d6e9e110141d2a26294f2f750c0a5381f06d10dd362
MD5 | 86b00572b02966bcc0aca9669736ad45
BLAKE2b-256 | 14874f05b62f689b2efc168711283632ef6c6f28b6bdff27c3781f83f343de9d