Generates word statistics

Monogatari

Installation

To install:

pip install monogatari

Quick start

General usage

DictCounter accepts any dictionary file that follows this format:

%
Category_key_1   Category_value_1
Category_key_2   Category_value_2
Category_key_3   Category_value_3
Category_key_4   Category_value_4
%

Word_key_1     Category_key_1    Category_key_2
Word_key_2     Category_key_1
Word_key_3     Category_key_2
Word_key_4     Category_key_1    Category_key_3
Word_key_5     Category_key_1    Category_key_2    Category_key_4
Word_key_6     Category_key_4

Where:

  • Each category key must be a number.
  • Each word_key must be separated from its category keys by a tab.
  • Matching is not case-sensitive: uppercase and lowercase letters in word_keys are treated the same.
  • Word_keys can contain whitespace (e.g. "kind of"), which is why the word_key and its category keys must be separated by a tab rather than by spaces.

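To make the format rules above concrete, here is a minimal sketch of a reader for such a file. It is not monogatari's own parser: the function name `parse_dictionary` and the sample category names are assumptions made for illustration only.

```python
def parse_dictionary(text):
    """Parse the dictionary format described above.

    Hypothetical helper for illustration; not monogatari's implementation.
    Returns (categories, words).
    """
    categories = {}  # category key -> category value
    words = {}       # lowercased word key -> list of category keys
    in_header = False
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped == "%":
            # A "%" line opens, then closes, the category block.
            in_header = not in_header
            continue
        if in_header:
            key, value = stripped.split(None, 1)
            categories[key] = value.strip()
        else:
            # Splitting on tabs keeps multi-word keys such as "kind of" intact.
            word, *keys = stripped.split("\t")
            # Lowercasing makes the lookup case-insensitive.
            words[word.strip().lower()] = [k.strip() for k in keys if k.strip()]
    return categories, words


sample = (
    "%\n1\tpositive\n2\tnegative\n%\n\n"
    "kind of\t1\nGood\t1\nbad\t2\nbittersweet\t1\t2\n"
)
categories, words = parse_dictionary(sample)
```

With this sample, `words["kind of"]` maps to category key `"1"` even though the word key contains a space, which is the case the tab separator exists to handle.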
from monogatari import DictCounter

counter = DictCounter("dictionary_file.dic")  # Load the dictionary

list_of_words = ["My", "list", "of", "words"]

counter.count(list_of_words)  # This will count the words in each category

counter.top(100)  # This will return a list of the top 100 categories and their counts
# [("category_A", 12), ("category_B", 9), ...]

counter.top_normalized(100)  # This will return a list of the top 100 categories and their normalized values
# [("category_A", 0.014), ("category_B", 0.008), ...]

counter.words_found()  # This will return all categories and for each, a list of words belonging to that category
# {"category_A": ["I", "am", "here"], "category_B": ["something"], ...}

counter.reset()  # This will reset the count, so that you can count again using another list of words
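The counting behaviour described in the comments above can be pictured with plain Python. This is a sketch under stated assumptions, not the library's code: the `word_to_categories` mapping and the `count_categories` function are hypothetical stand-ins for a loaded dictionary and for `count` / `words_found`.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory dictionary: lowercased word key -> category names.
word_to_categories = {
    "good": ["positive"],
    "bad": ["negative"],
    "bittersweet": ["positive", "negative"],
}


def count_categories(list_of_words):
    """Count category hits and record which input words produced them."""
    counts = Counter()
    found = defaultdict(list)
    for word in list_of_words:
        # Case-insensitive lookup; words missing from the dictionary are skipped.
        for category in word_to_categories.get(word.lower(), []):
            counts[category] += 1
            found[category].append(word)
    return counts, dict(found)


counts, found = count_categories(["Good", "coffee", "bittersweet", "BAD", "good"])
top = counts.most_common(1)  # highest-count category first
```

A word that belongs to several categories, such as "bittersweet" here, contributes one hit to each of them.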

Pre-loaded MFD dictionary

The following example uses a pre-loaded MFD dictionary (from https://www.moralfoundations.org/).

from monogatari import MFDCounter

counter = MFDCounter()

list_of_words = ['私', 'コーヒー', '好き', '友達']

counter.count(list_of_words)

counter.top(100)  # List top N categories, ordered by number of words

counter.top_normalized(100)  # List top N categories, ordered by number of words normalized by the total number of words
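As a rough illustration of "normalized by the total number of words", the sketch below divides each category count by a word total. Whether the library uses the length of the input list or only the matched words as the denominator is not stated here, so the `total_words` argument and the `normalize_counts` name are assumptions.

```python
from collections import Counter


def normalize_counts(category_counts, total_words, n):
    """Return the top n (category, count / total_words) pairs.

    Hypothetical helper; the denominator used by monogatari may differ.
    """
    if total_words == 0:
        return []  # avoid dividing by zero when nothing was counted
    ranked = Counter(category_counts).most_common(n)
    return [(category, count / total_words) for category, count in ranked]


result = normalize_counts({"positive": 3, "negative": 2}, total_words=5, n=1)
```

Normalizing this way makes category scores comparable across texts of different lengths.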

Pre-loaded JMFD dictionary

The following example uses a pre-loaded JMFD dictionary (from https://github.com/soramame0518/j-mfd).

from monogatari import JMFDCounter

counter = JMFDCounter()

list_of_words = ['私', 'コーヒー', '好き', '友達']

counter.count(list_of_words)

counter.top(100)  # List top N categories, ordered by number of words

counter.top_normalized(100)  # List top N categories, ordered by number of words normalized by the total number of words

LIWC example

The library is also compatible with the LIWC dictionary.

from monogatari import DictCounter

counter = DictCounter("liwc_dictionary_file.dic")  # Load the dictionary

list_of_words = ["My", "list", "of", "words"]

counter.count(list_of_words)  # This will count the words in each category

counter.top(100)  # This will return a list of the top 100 categories and their counts
# [("category_A", 12), ("category_B", 9), ...]

counter.top_normalized(100)  # This will return a list of the top 100 categories and their normalized values
# [("category_A", 0.014), ("category_B", 0.008), ...]

Download files

Source Distribution

monogatari-2.1.1.tar.gz (11.5 kB)

Built Distribution

monogatari-2.1.1-py3-none-any.whl (12.1 kB, Python 3)
