Measure text comprehensibility/readability using the Mistrik formula

📚 Mistrík's measure of readability 📚

What is it?

Mistrík is a pure Python library that scores the readability of Slovak text using Mistrík's measure of readability and comprehension. The formula is built on the word repetition index: all else being equal, the more a text repeats its words, the easier it is considered to read. The measure yields a readability index (R) for Slovak texts such as textbooks, research papers, and more. The original research by Jozef Mistrík can be found here (pp. 171-178). 📑

Why did we make this? 🤔

Readability measures are used in Slovakia, but less widely than abroad. Our goal was to support their use, especially Mistrík's, by creating an open-source Python library, since there is still no freely available public library or tool that focuses on Slovak texts. 🙃 We also wanted to make this metric more accessible, because better reading comprehension supports lifelong learning by helping people absorb information effectively in many areas. 📈

Description of measure 🖊️

S = average length of words in number of syllables,
V = average length of sentences in number of words,
N = number of words,
L = number of unique words,
I = word repetition index (I = N/L),
R = readability score (R = 50 - ((S * V) / I)).
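
To make the formula concrete, here is a minimal sketch that computes R from four raw counts. The function name `mistrik_r` and its parameters are hypothetical and are not part of the mistrik package's API. Counting syllables and sentences in Slovak text is deliberately left out.

```python
def mistrik_r(syllables, words, unique_words, sentences):
    """Return R = 50 - ((S * V) / I) from raw counts (illustrative helper)."""
    s = syllables / words        # S: average syllables per word
    v = words / sentences        # V: average words per sentence
    i = words / unique_words     # I: word repetition index, N / L
    return 50 - (s * v) / i


# Same word and sentence lengths, different amounts of repetition.
no_repetition = mistrik_r(syllables=200, words=100, unique_words=100, sentences=10)
some_repetition = mistrik_r(syllables=200, words=100, unique_words=50, sentences=10)
print(no_repetition, some_repetition)  # 30.0 40.0
```

With S and V held fixed, doubling the repetition index halves the penalty term, which is why a more repetitive text scores as easier to read.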

Score    Difficulty
50-40    Very Easy
40-30    Standard
30-20    Fairly Difficult
20-10    Difficult
10-0     Very Confusing
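
The table can be turned into a small lookup. This is only a sketch: the function `difficulty` is hypothetical and not part of the mistrik package, and because the table lists each boundary in two bands, the code assumes that every band includes its lower bound.

```python
def difficulty(score):
    """Map a readability score R to the label from the table (illustrative)."""
    # Assumption: each band includes its lower bound, so 40 counts as "Very Easy".
    bands = [
        (40, "Very Easy"),
        (30, "Standard"),
        (20, "Fairly Difficult"),
        (10, "Difficult"),
    ]
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    # Assumption: anything below 10, including negative scores, is "Very Confusing".
    return "Very Confusing"


print(difficulty(38.523))  # Standard
```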

In practice, a score between 40 and 50 is typical of fairy tales. In contrast, a text that scores 20 or below is suited to experts in the relevant field or to university students.

💿 Getting started - installation: 💿

pip install mistrik

📦 Import module: 📦

from mistrik import Mistrik

👩🏻‍💻 Examples of use: 🧑🏻‍💻

text = """
Danka a Janka sú sestričky dvojčence a sú navlas
rovnaké. Danka má oči celkom ako Janka, hnedé a veselé
ani gaštančeky. A Janka má vlasy celkom ako Danka,
plavé a ostrihané na ofinu. Ešte aj nosy majú rovnaké:
trošku vyhrnuté a veľmi všetečné.
Danka a Janka sa rovnako aj obliekajú. Danka má
vždy taký istý kabát ako Janka a Janka také isté šaty ako
Danka. Aj čiapky a topánky majú vždy celkom rovnaké.
"""

# Analyze the text and print the full readability report
M = Mistrik(text)
R = M.readability()
print(R)

Output:

MISTRIK MEASURE OF READABILITY:
SENTENCES: 7
SYLLABLES: 143
V: 10 (10.429)
S: 2.0 (1.959)
N: 73
L: 41
I: 1.78
R: 39 (38.523)
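
The figures in this report can be checked by hand from the four counts. The sketch below uses plain arithmetic only and does not call the mistrik package. One observation from the numbers, not confirmed against the library's source: the printed 38.523 is reproduced when I is rounded to two decimals before dividing, while the unrounded index gives a slightly different third decimal.

```python
# Counts copied from the report above.
sentences, syllables = 7, 143
n_words, n_unique = 73, 41

V = n_words / sentences    # average sentence length in words
S = syllables / n_words    # average word length in syllables
I = n_words / n_unique     # word repetition index

r_unrounded = 50 - (S * V) / I
r_rounded_index = 50 - (S * V) / round(I, 2)  # assumption: I rounded to 1.78 first

print(round(V, 3), round(S, 3), round(I, 2))              # 10.429 1.959 1.78
print(round(r_unrounded, 3), round(r_rounded_index, 3))   # 38.526 38.523
```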

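You can also access the individual values as attributes of the result:

M = Mistrik(text)
R = M.readability()
# SEN = number of sentences, SYL = number of syllables, R = readability score
print("Sentences:", R.SEN, " Syllables:", R.SYL)
print("The readability of the text is:", R.R)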

Output:

Sentences: 7  Syllables: 143
The readability of the text is: 39

Support us 🌟

Buy Me A Coffee

License

📜 MIT
