
Calculates multiple readability metrics for large documents.


Readability Metrics

This project is based heavily on mmautner's readability package, with modifications to support the analysis of large documents.


When using mmautner's package, the entire document needed to be passed in at once:

large_str = "...."
rd = Readability(large_str)
print('ARI: ', rd.ARI())

However, while analyzing Supreme Court transcripts since 1956 across various metrics, my personal computer could not load all of the needed documents at once. To account for this, I created this package, which allows pieces of a document to be passed in incrementally. Furthermore, the text itself is not stored; only the resulting calculations are kept. Lastly, all metrics are calculated and returned together, so individual calculations don't need to be performed one at a time.
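The reason this works is that common readability formulas depend only on running totals (characters, words, sentences), so each chunk can update counters and then be discarded. Below is a minimal sketch of that idea for a single metric, ARI; this is an illustration of the approach, not the package's actual internals, and the simple regex tokenization is an assumption:

```python
import re

class IncrementalARI:
    """Accumulate counts across text chunks; compute ARI without keeping the text."""

    def __init__(self):
        self.chars = 0      # letters in words
        self.words = 0
        self.sentences = 0

    def add_chunk(self, text):
        # naive tokenization; real code would also handle sentences that
        # span chunk boundaries
        words = re.findall(r"[A-Za-z']+", text)
        self.words += len(words)
        self.chars += sum(len(w) for w in words)
        self.sentences += len(re.findall(r"[.!?]+", text))

    def ari(self):
        # Automated Readability Index:
        # 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43
        return (4.71 * (self.chars / self.words)
                + 0.5 * (self.words / self.sentences)
                - 21.43)

scorer = IncrementalARI()
scorer.add_chunk("This is a sentence. ")
scorer.add_chunk("This is part of the same document.")
```

Only three integers persist between calls, which is what makes arbitrarily large documents feasible.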


Readability metrics can be installed from PyPI:

$ pip3 install readability-metrics


Readability metrics can be used as follows:

from metrics import Readability  # import the package

rdm = Readability()
rdm.analyze_text("This is a sentence.")
rdm.analyze_text("This is part of the same document.")
rdm.analyze_text("This is also part of the same document.")

# further pieces of the same document can be added at any time
rdm.analyze_text("This is also part of the same document.")

# all metrics are calculated and returned together
results = rdm.get_results()
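Because text is accepted piece by piece, a large document never has to sit in memory all at once. One way to feed a large file is to stream it in fixed-size chunks; the sketch below is illustrative (the file path, chunk size, and temporary-file demo are assumptions, not part of the package), and note that chunks cut mid-sentence may slightly skew sentence counts:

```python
import os
import tempfile

def iter_chunks(path, chunk_size=65536):
    """Yield a file's text piece by piece so the whole document is never loaded."""
    with open(path, encoding="utf-8") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

# small demonstration with a throwaway temporary file
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as f:
    f.write("This is a sentence. " * 10)
    sample_path = f.name

pieces = list(iter_chunks(sample_path, chunk_size=50))
reassembled = "".join(pieces)
os.remove(sample_path)

# feeding the chunks into the analyzer would then look like:
# rdm = Readability()
# for chunk in iter_chunks("scotus_transcript.txt"):  # hypothetical path
#     rdm.analyze_text(chunk)
# results = rdm.get_results()
```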

You can also calculate readability metrics across multiple categories. For instance, if you had a transcript, you could calculate metrics for all speakers at once:

from collections import defaultdict
from metrics import Readability  # import the package

transcript = [
    ('John George', 'Words said by John George'),
    ('Apple Dunkin', 'Words said by Apple Dunkin'),
    # ...
]

readability_per_speaker = defaultdict(lambda: Readability())

# Calculate readability metrics
for speaker, words in transcript:
    readability_per_speaker[speaker].analyze_text(words)

# Collect results
for speaker in readability_per_speaker:
    readability_per_speaker[speaker] = readability_per_speaker[speaker].get_results()

# readability_per_speaker now has the form:
# {
#     'John George': {
#         'ARI': 12.163787878787879,
#         'FleschReadingEase': 58.2319,
#         'FleschKincaidGradeLevel': 11.2857,
#         'GunningFogIndex': 14.5465,
#         'SMOGIndex': 12.287087810503355,
#         'ColemanLiauIndex': 9.5226,
#         'LIX': 46.467171717171716,
#         'RIX': 5.375
#     },
#     # more speakers ...
# }
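Once per-speaker results are collected, the plain dictionaries are easy to post-process. For instance, speakers can be ranked by any metric; the numbers below are made up for illustration, in the same shape as the output above:

```python
# hypothetical per-speaker results, same shape as shown above
results = {
    'John George': {'ARI': 12.16, 'GunningFogIndex': 14.55},
    'Apple Dunkin': {'ARI': 9.84, 'GunningFogIndex': 11.02},
}

# rank speakers from most to least complex speech by ARI
ranked = sorted(results, key=lambda s: results[s]['ARI'], reverse=True)
print(ranked)  # ['John George', 'Apple Dunkin']
```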


Contributions are welcome. Please create a pull request or email me. Also, feel free to create an issue if you need help with something.


Testing can be run with pytest. Simply navigate to the project directory and run pytest.


Files for readability-metrics, version 1.0.0:

readability_metrics-1.0.0-py2.py3-none-any.whl (7.8 kB, Wheel, py2.py3)
readability-metrics-1.0.0.tar.gz (6.1 kB, Source)
