Measure text comprehensibility/readability using the Mistrik formula

Project description

📚 Mistrík's measure of readability 📚

What is it?

Mistrík is a pure Python library that scores the readability of Slovak text using Mistrík's measure of readability and comprehension. The formula is built on the word repetition index: the more often a text repeats its words, the easier it is rated to read. The measure can be used to compute the readability index (R) of Slovak texts such as textbooks and research papers. The original research by Jozef Mistrík can be found here (pp. 171-178). 📑

Why did we make this? 🤔

Readability measures are used in Slovakia, but less widely than abroad. Our goal was to support their use, especially Mistrík's, by creating an open-source Python library, since there is still no freely usable public library or tool that focuses on Slovak texts. 🙃 We also wanted to make this metric more accessible, because stronger reading comprehension supports lifelong learning by helping people absorb information effectively in many areas. 📈

Description of measure 🖊️

S = average word length in syllables,
V = average sentence length in words,
N = number of words,
L = number of unique words,
I = word repetition index (I = N / L),
R = readability score (R = 50 - ((S * V) / I))
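The definitions above can be sketched as a few lines of arithmetic. This is only an illustration of the formula, not the library's own code: the function name `mistrik_score` and its parameters are hypothetical, and it assumes the four raw counts (syllables, words, unique words, sentences) are already known.

```python
def mistrik_score(syllables, words, unique_words, sentences):
    """Illustrative helper (not part of the mistrik package)."""
    S = syllables / words        # average word length in syllables
    V = words / sentences        # average sentence length in words
    I = words / unique_words     # word repetition index, I = N / L
    R = 50 - (S * V) / I         # readability score
    return S, V, I, R


# 200 syllables, 100 words (50 of them unique), 10 sentences
S, V, I, R = mistrik_score(200, 100, 50, 10)
print(S, V, I, R)  # 2.0 10.0 2.0 40.0
```

Note that a higher repetition index I shrinks the subtracted term, which is why more repetition yields a higher (easier) score.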

Score    Difficulty
50-40    Very Easy
40-30    Standard
30-20    Fairly Difficult
20-10    Difficult
10-0     Very Confusing

In practice, a score between 40 and 50 is typical of fairy tales. By contrast, a text that scores 20 or less is suited to university students or to experts in the field the text covers.
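The score bands can be turned into a small lookup. The sketch below is hypothetical and not part of the mistrik package. Because the bands in the table share their boundary values, it assumes a boundary score (for example 40) falls into the easier band, and that scores above 50 or below 0 map to the outermost bands.

```python
def difficulty(score):
    """Map a readability score R to a difficulty label (illustrative only)."""
    # Lower bound of each band, checked from easiest to hardest.
    bands = [
        (40, "Very Easy"),
        (30, "Standard"),
        (20, "Fairly Difficult"),
        (10, "Difficult"),
    ]
    for lower, label in bands:
        if score >= lower:
            return label
    return "Very Confusing"


print(difficulty(39))  # Standard
```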

💿 Getting started - installation: 💿

pip install mistrik

📦 Import module: 📦

from mistrik import Mistrik

👩🏻‍💻 Examples of use: 🧑🏻‍💻

text = """
Danka a Janka sú sestričky dvojčence a sú navlas
rovnaké. Danka má oči celkom ako Janka, hnedé a veselé
ani gaštančeky. A Janka má vlasy celkom ako Danka,
plavé a ostrihané na ofinu. Ešte aj nosy majú rovnaké:
trošku vyhrnuté a veľmi všetečné.
Danka a Janka sa rovnako aj obliekajú. Danka má
vždy taký istý kabát ako Janka a Janka také isté šaty ako
Danka. Aj čiapky a topánky majú vždy celkom rovnaké.
"""

M = Mistrik(text)     # build the scorer from the raw text
R = M.readability()   # compute the readability result
print(R)              # print the full report

Output:

MISTRIK MEASURE OF READABILITY:
SENTENCES: 7
SYLLABLES: 143
V: 10 (10.429)
S: 2.0 (1.959)
N: 73
L: 41
I: 1.78
R: 39 (38.523)
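The figures in this report can be cross-checked by hand from the four counts. The snippet below is a sketch for verification only. The printed R of 38.523 appears to match when I is rounded to two decimals before the division; that is an inference from the numbers shown, not something the project states.

```python
# Counts taken from the report above.
sentences, syllables, n_words, n_unique = 7, 143, 73, 41

V = n_words / sentences            # average sentence length in words
S = syllables / n_words            # average word length in syllables
I = round(n_words / n_unique, 2)   # assumption: I rounded to 2 decimals
R = 50 - (S * V) / I

print(round(V, 3), round(S, 3), I, round(R, 3))  # 10.429 1.959 1.78 38.523
```

With I left unrounded, the same arithmetic gives roughly 38.526, so the difference only shows up in the third decimal place.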

You can also access all variables like this:

M = Mistrik(text)
R = M.readability()
print("Sentences:", R.SEN, " Syllables:", R.SYL)
print("The readability of the text is:", R.R)

Output:

Sentences: 7  Syllables: 143
The readability of the text is: 39

License

📜 MIT
