Generates word statistics

Monogatari

Installation

To install:

pip install monogatari

Quick start

General usage

DictCounter accepts any dictionary file that follows this format:

%
Category_key_1   Category_value_1
Category_key_2   Category_value_2
Category_key_3   Category_value_3
Category_key_4   Category_value_4
%

Word_key_1     Category_key_1    Category_key_2
Word_key_2     Category_key_1
Word_key_3     Category_key_2
Word_key_4     Category_key_1    Category_key_3
Word_key_5     Category_key_1    Category_key_2    Category_key_4
Word_key_6     Category_key_4

Where:

  • Each category key must be a number.
  • Each word_key and each of its category keys must be separated by a tab.
  • Matching is not case sensitive: uppercase and lowercase letters in word_keys are treated the same.
  • Word_keys can contain spaces (e.g. "kind of"), which is why the word_key and the category keys must be separated by tabs rather than spaces.
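
The format rules above can be sketched as a small script that writes such a file using only the standard library. The category names and words are made up for illustration, and `build_dictionary_text` is a hypothetical helper, not part of monogatari. Using a tab between a category key and its value in the header is an assumption; the rules above only require tabs on the word lines.

```python
from pathlib import Path

# Hypothetical categories and words, chosen only for illustration.
categories = {1: "positive", 2: "negative", 3: "hedge"}
words = {
    "happy": [1],
    "awful": [2],
    "kind of": [3],         # a word key that contains a space
    "bittersweet": [1, 2],  # a word key that belongs to two categories
}


def build_dictionary_text(categories, words):
    """Return the dictionary text: a %-delimited header, then one line per word."""
    lines = ["%"]
    for key, name in categories.items():
        # Assumption: category key and value are tab-separated in the header.
        lines.append(f"{key}\t{name}")
    lines.append("%")
    lines.append("")  # blank line after the header, as in the sample layout
    for word, keys in words.items():
        # A tab separates the word key from each of its category keys.
        lines.append("\t".join([word, *map(str, keys)]))
    return "\n".join(lines) + "\n"


text = build_dictionary_text(categories, words)
Path("dictionary_file.dic").write_text(text, encoding="utf-8")
```

Because "kind of" contains a space, only the tab tells a parser where the word key ends and the category keys begin.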

With a dictionary file in this format, load it and count words:

from monogatari import DictCounter

counter = DictCounter("dictionary_file.dic")  # Load the dictionary

list_of_words = ["My", "list", "of", "words"]

counter.count(list_of_words)  # Count the words that fall into each category

counter.top(100)  # Return a list of the top 100 categories and their counts
# [("category_A", 12), ("category_B", 9), ...]

counter.top_normalized(100)  # Return a list of the top 100 categories and their normalized values
# [("category_A", 0.014), ("category_B", 0.008), ...]

counter.words_found()  # Return every category with the list of words found for it
# {"category_A": ["I", "am", "here"], "category_B": ["something"], ...}

counter.reset()  # Reset the counts so that another list of words can be counted
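
To make the difference between the raw and normalized counts concrete, here is a minimal standalone sketch of one plausible reading of the arithmetic. It does not use monogatari and is not the library's implementation: `word_categories` and `count_categories` are hypothetical, and dividing by the length of the input list is an assumption based on the description "normalized by the total number of words".

```python
from collections import Counter

# Hypothetical word -> categories mapping, standing in for a loaded dictionary.
word_categories = {
    "happy": ["positive"],
    "awful": ["negative"],
    "bittersweet": ["positive", "negative"],
}


def count_categories(list_of_words):
    """Count category hits, matching words without regard to case."""
    counts = Counter()
    for word in list_of_words:
        for category in word_categories.get(word.lower(), []):
            counts[category] += 1
    return counts


list_of_words = ["Happy", "and", "bittersweet", "but", "never", "awful", "or", "HAPPY"]
counts = count_categories(list_of_words)

# Top 2 categories by raw count.
top = counts.most_common(2)
# [("positive", 3), ("negative", 2)]

# Assumption: normalize by the number of input words, matched or not.
total = len(list_of_words)
top_normalized = [(category, n / total) for category, n in top]
# [("positive", 0.375), ("negative", 0.25)]
```

Normalized values make texts of different lengths comparable, since a long text produces more raw hits than a short one.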

Pre-loaded MFD dictionary

The following example uses a pre-loaded MFD dictionary (from https://www.moralfoundations.org/).

from monogatari import MFDCounter

counter = MFDCounter()

list_of_words = ['私', 'コーヒー', '好き', '友達']

counter.count(list_of_words)

counter.top(100)  # List top N categories, ordered by number of words

counter.top_normalized(100)  # List top N categories, ordered by number of words normalized by the total number of words

Pre-loaded JMFD dictionary

The following example uses a pre-loaded JMFD dictionary (from https://github.com/soramame0518/j-mfd).

from monogatari import JMFDCounter

counter = JMFDCounter()

list_of_words = ['私', 'コーヒー', '好き', '友達']

counter.count(list_of_words)

counter.top(100)  # List top N categories, ordered by number of words

counter.top_normalized(100)  # List top N categories, ordered by number of words normalized by the total number of words

LIWC example

The library is also compatible with the LIWC dictionary.

from monogatari import DictCounter

counter = DictCounter("liwc_dictionary_file.dic")  # Load the dictionary

list_of_words = ["My", "list", "of", "words"]

counter.count(list_of_words)  # Count the words that fall into each category

counter.top(100)  # Return a list of the top 100 categories and their counts
# [("category_A", 12), ("category_B", 9), ...]

counter.top_normalized(100)  # Return a list of the top 100 categories and their normalized values
# [("category_A", 0.014), ("category_B", 0.008), ...]
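
All of the examples above pass a ready-made list of words to `count`, so raw text has to be split into words first. The following is a minimal sketch for space-delimited languages such as English, using only the standard library. `tokenize` is a hypothetical helper, not part of monogatari.

```python
import re


def tokenize(text):
    """Split text into words, keeping internal apostrophes (as in "I'm")."""
    return re.findall(r"\w+(?:'\w+)*", text)


list_of_words = tokenize("I'm kind of happy, really!")
# ["I'm", "kind", "of", "happy", "really"]
```

Two caveats apply. A simple split like this yields "kind" and "of" as separate words, so whether a multi-word key such as "kind of" can still match depends on how the library handles it. Japanese text is not space-delimited, so a word list like the one in the JMFD example would normally come from a morphological analyzer instead.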
