SmoothText




SmoothText is still in alpha and there may be breaking changes.


Introduction

SmoothText is a Python library for calculating readability scores of texts and statistical information for texts in multiple languages.

The design principle of this library is to ensure high accuracy.

Requirements

Python 3.10 or higher.

External Dependencies

Library    Version    License                       Notes
NLTK       >=3.9.1    Apache 2.0                    Conditionally optional.
Stanza     >=1.10.1   Apache 2.0                    Conditionally optional.
Unidecode  >=1.3.8    GNU GPLv2                     Required.
Pyphen     >=0.17.0   GPL 2.0+/LGPL 2.1+/MPL 1.1    Required.

Either NLTK or Stanza must be installed and used with the SmoothText library.
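
Because at least one of the two backends has to be present, it can help to check which one is importable before initializing the library. The following is a minimal sketch using only the standard library. The helper name pick_backend is hypothetical and is not part of SmoothText; it only reports which package is installed and does not configure anything.

```python
from importlib.util import find_spec


def pick_backend(candidates=("nltk", "stanza")):
    """Return the first importable package name from candidates, or None.

    Hypothetical helper for illustration; not part of the SmoothText API.
    """
    for name in candidates:
        # find_spec returns None when a top-level package is not installed.
        if find_spec(name) is not None:
            return name
    return None


backend = pick_backend()
if backend is None:
    print("Install NLTK or Stanza before using SmoothText.")
else:
    print(f"Available backend package: {backend}")
```

The candidates are tried in order, so NLTK is preferred when both packages are installed.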

Features

Readability Analysis

SmoothText can calculate readability scores of text in the following languages, using the following formulas.

English formula                    Turkish adaptation
Flesch Reading Ease                Ateşman
Flesch-Kincaid Grade               Bezirci-Yılmaz
Flesch-Kincaid Grade Simplified    (none listed)

Notes:

  • Ateşman is the Turkish adaptation of Flesch Reading Ease. The two can be used interchangeably in the module.
  • Bezirci-Yılmaz is the Turkish adaptation of Flesch-Kincaid Grade. The two can be used interchangeably in the module.
  • Flesch-Kincaid Grade Simplified is essentially the same formula as Flesch-Kincaid Grade, except that its constants are different.

Sentencizing, Tokenizing, and Syllabifying

SmoothText can extract sentences, words, or syllables from texts.
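
To make the idea concrete, here is a deliberately naive sketch of sentence and word extraction using regular expressions. It is not how SmoothText works internally: the library delegates to NLTK or Stanza, which handle abbreviations, quotations, and language-specific rules that these patterns ignore. The function names split_sentences and split_words are hypothetical. Syllable extraction is left out because it needs hyphenation dictionaries such as those Pyphen provides.

```python
import re

# Naive patterns for illustration only; they are assumptions, not the
# rules used by SmoothText or its backends.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
_WORD = re.compile(r"\w+(?:[-']\w+)*")


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?'."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def split_words(text):
    """Return word-like tokens, keeping hyphenated compounds together."""
    return _WORD.findall(text)


sample = "Hello there. How are you?"
print(split_sentences(sample))
print(split_words(sample))
```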

Reading Time

SmoothText can calculate how long a text would take to read.
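
A reading-time estimate usually comes down to dividing a word count by a reading speed. The sketch below shows that arithmetic only. The function name estimate_reading_seconds and the default of 200 words per minute are assumptions made for illustration; this README does not state which speed or method SmoothText uses.

```python
def estimate_reading_seconds(word_count, words_per_minute=200.0):
    """Estimate reading time in seconds from a word count.

    Hypothetical helper; the 200 words-per-minute default is an assumption.
    """
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return word_count * 60.0 / words_per_minute


# A 400-word text at the assumed default speed.
print(estimate_reading_seconds(400))
```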

Installation

You can install SmoothText via pip.

pip install smoothtext

Usage

Importing and Initializing the Library

SmoothText comes with three submodules: Language, ReadabilityFormula and SmoothText.

from smoothtext import Language, ReadabilityFormula, SmoothText

Before use, the library must be initialized by calling the static setup function. The call below sets NLTK as the backend and automatically downloads the resources needed for the supported languages. Alternatively, you can use Stanza.

SmoothText.setup(backend='nltk')

Instancing

SmoothText is used through instances of the SmoothText class.

st = SmoothText('en')

Now, an instance is accessible via st, and it is ready to work with English texts.

Calculating Readability Scores

Consider the following text, which we will analyze.

text = "Forrest Gump is a 1994 American comedy-drama film directed by Robert Zemeckis."

For English, we have two available formulas: Flesch Reading Ease and Flesch-Kincaid Grade. We can either call the compute_readability function, or use the instance as a callable. Either way, we are expected to pass the formula.

score_1 = st.compute_readability(text, ReadabilityFormula.Flesch_Reading_Ease)
score_2 = st(text, ReadabilityFormula.Flesch_Kincaid_Grade)

print(score_1, score_2)
# Output is: 25.455000000000013 12.690000000000001
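
The printed scores can be sanity-checked against the published Flesch formulas. The sketch below implements the standard Flesch Reading Ease and Flesch-Kincaid Grade equations as plain functions of word, sentence, and syllable counts. The counts used here (12 words, 1 sentence, 24 syllables) are an assumption, inferred by working backward from the output above; they are not reported by the library, and the library's own implementation may differ in detail.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid Grade: approximate US school grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Assumed counts for the Forrest Gump sentence (inferred, not from the library).
words, sentences, syllables = 12, 1, 24

print(round(flesch_reading_ease(words, sentences, syllables), 3))   # 25.455
print(round(flesch_kincaid_grade(words, sentences, syllables), 2))  # 12.69
```

Under these counts, both values agree with the scores printed by the library up to floating-point rounding.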

Documentation

See here for API documentation.

Roadmap

SmoothText is still in its early stages. The immediate tasks include adding more languages and backends.

License

SmoothText has an MIT license. See LICENSE.
