📗 py-readability-metrics

Score the readability of text using popular readability formulas and metrics including: Flesch Kincaid Grade Level, Flesch Reading Ease, Gunning Fog Index, Dale Chall Readability, Automated Readability Index (ARI), Coleman Liau Index, Linsear Write, SMOG, and SPACHE. 📗

Install

pip install py-readability-metrics
python -m nltk.downloader punkt

Usage

from readability import Readability

r = Readability(text)

r.flesch_kincaid()
r.flesch()
r.gunning_fog()
r.coleman_liau()
r.dale_chall()
r.ari()
r.linsear_write()
r.smog()
r.spache()

*Note: text must contain >= 100 words*
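Because of the 100-word minimum, it can help to check the input length before constructing a Readability object. The following is a minimal sketch, assuming a plain whitespace split is a close enough approximation. `has_enough_words` is a hypothetical helper, not part of the library, and the library's own tokenizer may count words differently near the boundary.

```python
def has_enough_words(text: str, minimum: int = 100) -> bool:
    """Rough pre-check: count whitespace-separated tokens.

    Assumption: whitespace splitting approximates the word count;
    a real tokenizer may disagree slightly on punctuation and hyphens.
    """
    return len(text.split()) >= minimum


if __name__ == "__main__":
    sample = "word " * 120
    print(has_enough_words(sample))       # long enough to score
    print(has_enough_words("too short"))  # would be rejected
```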

Supported Metrics

Readability Metric Details and Properties

All metrics provide a score attribute. Individual metrics provide additional properties to increase interpretability. See the sections below for the properties each metric exposes.

Note: In all examples below r is:

r = Readability(text)

Flesch-Kincaid Grade Level

The U.S. Army uses Flesch-Kincaid Grade Level to assess the difficulty of technical manuals. The Commonwealth of Pennsylvania uses it to score automobile insurance policies, ensuring that their texts are no higher than a ninth-grade level of reading difficulty. Many other U.S. states also use Flesch-Kincaid Grade Level to score legal documents such as business policies and financial forms.

call:

r.flesch_kincaid()

example:

fk = r.flesch_kincaid()
print(fk.score)
print(fk.grade_level)
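For reference, the published Flesch-Kincaid Grade Level formula combines average sentence length with average syllables per word. The sketch below computes it from raw counts. It is a standalone illustration, not the library's implementation, and the function and parameter names are hypothetical.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Published Flesch-Kincaid Grade Level formula from raw counts."""
    avg_sentence_length = words / sentences
    avg_syllables_per_word = syllables / words
    return 0.39 * avg_sentence_length + 11.8 * avg_syllables_per_word - 15.59


if __name__ == "__main__":
    # 100 words in 5 sentences with 150 syllables in total
    print(round(flesch_kincaid_grade(100, 5, 150), 2))
```

Longer sentences and longer words both push the grade up, which is why the score reads directly as a U.S. school grade.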

Flesch Reading Ease

The U.S. Department of Defense uses the Reading Ease test as the standard test of readability for its documents and forms. Florida requires that life insurance policies have a Flesch Reading Ease score of 45 or greater.

call:

r.flesch()

example:

f = r.flesch()
print(f.score)
print(f.ease)
print(f.grade_levels)
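The Reading Ease score uses the same two inputs as Flesch-Kincaid but runs in the opposite direction: higher scores mean easier text. A minimal sketch of the published formula, with hypothetical function and parameter names and no claim about how the library computes it internally:

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Published Flesch Reading Ease formula from raw counts."""
    avg_sentence_length = words / sentences
    avg_syllables_per_word = syllables / words
    return 206.835 - 1.015 * avg_sentence_length - 84.6 * avg_syllables_per_word


if __name__ == "__main__":
    # 100 words in 5 sentences with 150 syllables in total
    print(round(flesch_reading_ease(100, 5, 150), 3))
```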

Dale Chall Readability

The Dale-Chall Formula is an accurate readability formula for the simple reason that it is based on the use of familiar words, rather than syllable or letter counts. Reading tests show that readers usually find it easier to read, process and recall a passage if they find the words familiar.

call:

r.dale_chall()

example:

dc = r.dale_chall()
print(dc.score)
print(dc.grade_levels)
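To make the "familiar words" idea concrete, here is a sketch of the commonly cited Dale-Chall formula. It assumes the count of difficult words (words not on the Dale-Chall familiar-word list) is already known; the word list itself is not reproduced here. The function and parameter names are hypothetical, and the library's implementation may differ in detail.

```python
def dale_chall_score(words: int, sentences: int, difficult_words: int) -> float:
    """Commonly cited Dale-Chall formula from raw counts.

    difficult_words: words not found on the familiar-word list
    (assumed to be counted elsewhere).
    """
    pct_difficult = 100.0 * difficult_words / words
    score = 0.1579 * pct_difficult + 0.0496 * (words / sentences)
    # The formula adds a constant once more than 5% of words are difficult.
    if pct_difficult > 5.0:
        score += 3.6365
    return score


if __name__ == "__main__":
    print(round(dale_chall_score(100, 5, 10), 4))
    print(round(dale_chall_score(100, 5, 4), 4))
```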

Automated Readability Index (ARI)

Unlike most of the other indices, ARI (like Coleman-Liau) relies on characters per word instead of the usual syllables per word. ARI is widely used on all types of texts.

call:

r.ari()

example:

ari = r.ari()
print(ari.score)
print(ari.grade_levels)
print(ari.ages)
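Since ARI works from character counts, it needs no syllable counting at all. A minimal sketch of the published formula follows; the function and parameter names are hypothetical, and how characters are counted (for example, whether punctuation is included) is an assumption left to the caller.

```python
def automated_readability_index(characters: int, words: int, sentences: int) -> float:
    """Published ARI formula from raw counts."""
    chars_per_word = characters / words
    words_per_sentence = words / sentences
    return 4.71 * chars_per_word + 0.5 * words_per_sentence - 21.43


if __name__ == "__main__":
    # 450 characters across 100 words in 5 sentences
    print(round(automated_readability_index(450, 100, 5), 3))
```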

Coleman Liau Index

The Coleman-Liau Formula usually gives a lower grade value than any of the Kincaid, ARI and Flesch values when applied to technical documents.

call:

r.coleman_liau()

example:

cl = r.coleman_liau()
print(cl.score)
print(cl.grade_level)
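Coleman-Liau is also character-based. The published formula works on letters per 100 words and sentences per 100 words, as in this sketch (hypothetical names, not the library's code):

```python
def coleman_liau_index(letters: int, words: int, sentences: int) -> float:
    """Published Coleman-Liau formula from raw counts."""
    letters_per_100_words = 100.0 * letters / words
    sentences_per_100_words = 100.0 * sentences / words
    return 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8


if __name__ == "__main__":
    # 450 letters across 100 words in 5 sentences
    print(round(coleman_liau_index(450, 100, 5), 2))
```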

Gunning Fog

The Gunning fog index measures the readability of English writing. The index estimates the years of formal education needed to understand the text on a first reading. A fog index of 12 requires the reading level of a U.S. high school senior (around 18 years old).

call:

r.gunning_fog()

example:

gf = r.gunning_fog()
print(gf.score)
print(gf.grade_level)
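The fog index can be sketched from three counts: words, sentences, and complex words (conventionally, words of three or more syllables). The names below are hypothetical, and the sketch leaves out the finer rules some implementations apply when deciding which words count as complex.

```python
def gunning_fog_index(words: int, sentences: int, complex_words: int) -> float:
    """Published Gunning fog formula from raw counts."""
    avg_sentence_length = words / sentences
    pct_complex = 100.0 * complex_words / words
    return 0.4 * (avg_sentence_length + pct_complex)


if __name__ == "__main__":
    # 100 words, 5 sentences, 10 complex words
    print(gunning_fog_index(100, 5, 10))
```

With 20 words per sentence and 10% complex words, the sketch lands on the fog index of 12 described above.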

SMOG

The SMOG Readability Formula (Simple Measure of Gobbledygook) is a popular method to use on health literacy materials.

call:

r.smog()

example:

s = r.smog()
print(s.score)
print(s.grade_level)

The original SMOG formula uses a sample of 30 sentences from the original text. However, the formula can be generalized to any number of sentences. You can use the generalized formula by passing the all_sentences=True argument to smog().

call:

r.smog(all_sentences=True)

example:

s = r.smog(all_sentences=True)
print(s.score)
print(s.grade_level)
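The generalization works by scaling the polysyllable count to a 30-sentence equivalent. A sketch of the published generalized formula, using hypothetical names and assuming polysyllables are words of three or more syllables counted elsewhere:

```python
import math


def smog_grade(polysyllables: int, sentences: int) -> float:
    """Published generalized SMOG formula.

    The factor 30 / sentences rescales the polysyllable count to the
    30-sentence sample the original formula assumes.
    """
    scaled = polysyllables * (30 / sentences)
    return 1.0430 * math.sqrt(scaled) + 3.1291


if __name__ == "__main__":
    print(round(smog_grade(30, 30), 2))  # exactly 30 sentences
    print(round(smog_grade(10, 10), 2))  # same density, fewer sentences
```

Two texts with the same density of polysyllabic words per sentence get the same grade, regardless of how many sentences were sampled.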

SPACHE

The Spache Readability Formula, published in 1953 in The Elementary School Journal, is intended for primary-grade reading materials. It is best used to calculate the difficulty of text that falls at the 3rd grade level or below.

call:

r.spache()

example:

s = r.spache()
print(s.score)
print(s.grade_level)
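Like Dale-Chall, Spache scores text by how many words fall outside a list of familiar words. The sketch below uses the coefficients usually quoted for the revised Spache formula; the README does not say which variant the library implements, so treat the defaults as an assumption. The function and parameter names are hypothetical, and the familiar-word list is not reproduced.

```python
def spache_grade(
    words: int,
    sentences: int,
    unfamiliar_words: int,
    asl_coef: float = 0.121,   # revised-formula coefficient (assumed default)
    pdw_coef: float = 0.082,   # revised-formula coefficient (assumed default)
    constant: float = 0.659,   # revised-formula constant (assumed default)
) -> float:
    """Spache-style grade from raw counts.

    unfamiliar_words: words not on the Spache familiar-word list
    (assumed to be counted elsewhere).
    """
    avg_sentence_length = words / sentences
    pct_unfamiliar = 100.0 * unfamiliar_words / words
    return asl_coef * avg_sentence_length + pdw_coef * pct_unfamiliar + constant


if __name__ == "__main__":
    # 100 words, 10 sentences, 5 unfamiliar words
    print(round(spache_grade(100, 10, 5), 3))
```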

Linsear Write

Linsear Write is a readability metric for English text, purportedly developed for the United States Air Force to help them calculate the readability of their technical manuals.

call:

r.linsear_write()

example:

lw = r.linsear_write()
print(lw.score)
print(lw.grade_level)
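Linsear Write is usually described as a point-counting procedure over a 100-word sample: words of two syllables or fewer score 1 point, words of three or more score 3, and the total is divided by the number of sentences and then adjusted. The sketch below follows that common description; the names are hypothetical and the library may handle sampling differently.

```python
def linsear_write_grade(easy_words: int, hard_words: int, sentences: int) -> float:
    """Linsear Write grade from counts taken over a 100-word sample.

    easy_words: words with two syllables or fewer (1 point each)
    hard_words: words with three or more syllables (3 points each)
    """
    provisional = (easy_words + 3 * hard_words) / sentences
    if provisional > 20:
        return provisional / 2
    return (provisional - 2) / 2


if __name__ == "__main__":
    print(linsear_write_grade(80, 20, 5))   # provisional result above 20
    print(linsear_write_grade(90, 10, 10))  # provisional result of 20 or less
```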

Contributing

Contributions are welcome!

License

MIT

Contributors ✨

Thanks go to these wonderful people:


rbamos

💻 ⚠️

This project follows the all-contributors specification. Contributions of any kind welcome!
