spaCy pipeline component for adding text readability metadata to Doc objects.

spacy_readability

A spaCy v2.0 pipeline component for calculating readability scores of text. Provides scores for Flesch-Kincaid grade level, Flesch-Kincaid reading ease, Dale-Chall, SMOG, Coleman-Liau index, Automated Readability Index, and FORCAST.

Installation

pip install spacy-readability

Usage

import spacy
from spacy_readability import Readability

# Load the English model and append the readability component to the pipeline
nlp = spacy.load('en')
read = Readability(nlp)
nlp.add_pipe(read, last=True)

doc = nlp("I am some really difficult text to read because I use obnoxiously large words.")

# Each score is exposed as a custom attribute on doc._
print(doc._.flesch_kincaid_grade_level)
print(doc._.flesch_kincaid_reading_ease)
print(doc._.dale_chall)
print(doc._.smog)
print(doc._.coleman_liau_index)
print(doc._.automated_readability_index)
print(doc._.forcast)
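
To gather every score at once, a small helper along these lines may be convenient. This is only a sketch: `collect_scores` and `SCORE_ATTRIBUTES` are hypothetical names that are not part of spacy_readability, and the attribute names are simply the ones printed in the usage example above.

```python
from typing import Any, Dict

# Attribute names taken from the usage example; the tuple itself is illustrative.
SCORE_ATTRIBUTES = (
    "flesch_kincaid_grade_level",
    "flesch_kincaid_reading_ease",
    "dale_chall",
    "smog",
    "coleman_liau_index",
    "automated_readability_index",
    "forcast",
)


def collect_scores(doc: Any) -> Dict[str, Any]:
    """Return a dict of readability scores read from doc._ (hypothetical helper).

    Any attribute that is not available on doc._ is reported as None.
    """
    return {name: getattr(doc._, name, None) for name in SCORE_ATTRIBUTES}
```

With the pipeline from the usage example in place, `collect_scores(doc)` would return a dictionary keyed by metric name, which is easier to log or serialize than seven separate print calls.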

Readability Scores

Readability is the ease with which a reader can understand a written text. In natural language, the readability of text depends on its content (the complexity of its vocabulary and syntax) and its presentation (such as typographic aspects like font size, line height, and line length).

Popular Metrics

  • The Flesch formulas:

    • Flesch-Kincaid Readability Score
    • Flesch-Kincaid Reading Ease
  • Dale-Chall formula

  • SMOG

  • Coleman-Liau Index

  • Automated Readability Index

  • FORCAST

For more in-depth reading.
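
As a rough illustration of how several of these metrics are computed, the sketch below implements the commonly published formulas from pre-computed counts. It is not taken from spacy_readability, whose tokenization, syllable counting, and rounding may differ. The function names are assumptions made for this example, and Dale-Chall and FORCAST are left out because they need word lists or sampling rules not covered here.

```python
import math


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    # Higher scores mean easier text.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    # Approximate US school grade level.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def automated_readability_index(characters: int, words: int, sentences: int) -> float:
    # Uses characters per word instead of syllables.
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


def coleman_liau_index(letters: int, words: int, sentences: int) -> float:
    letters_per_100 = letters / words * 100      # L in the published formula
    sentences_per_100 = sentences / words * 100  # S in the published formula
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


def smog(polysyllables: int, sentences: int) -> float:
    # polysyllables: number of words with three or more syllables.
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291
```

For example, a passage with 100 words, 5 sentences, and 150 syllables gives a Flesch reading ease of about 59.6 and a Flesch-Kincaid grade level of about 9.9 under these formulas.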

Contributing

Setup

  1. Install Poetry
  2. Run make setup to prepare workspace

Testing

  1. Run make test to run all tests

Linting

  1. Run make format to run black code formatter
  2. Run make lint to run pylint
  3. Run make mypy to run mypy

Download files

Source Distribution

spacy_readability-1.4.1.tar.gz (14.4 kB)

Uploaded Source

Built Distribution

spacy_readability-1.4.1-py3-none-any.whl (49.9 kB)

Uploaded Python 3
