SmoothText
SmoothText is still in alpha and there may be breaking changes.
Introduction
SmoothText is a Python library for calculating readability scores of texts and statistical information for texts in multiple languages.
The design principle of this library is to ensure high accuracy.
Requirements
Python 3.10 or higher.
External Dependencies
| Library | Version | License | Notes |
|---|---|---|---|
| NLTK | >=3.9.1 | Apache 2.0 | Conditionally optional. |
| Stanza | >=1.10.1 | Apache 2.0 | Conditionally optional. |
| Unidecode | >=1.3.8 | GNU GPLv2 | Required. |
| Pyphen | >=0.17.0 | GPL 2.0+/LGPL 2.1+/MPL 1.1 | Required. |
Either NLTK or Stanza must be installed and used with the SmoothText library.
Features
Readability Analysis
SmoothText can calculate readability scores of text in the following languages, using the following formulas.
| Formula/Language | English | Turkish |
|---|---|---|
| Flesch Reading Ease | ✔ | ✔ Ateşman |
| Flesch-Kincaid Grade | ✔ | ✔ Bezirci-Yılmaz |
| Flesch-Kincaid Grade Simplified | ✔ | ❌ |
Notes:
- Ateşman is the Turkish adaptation of Flesch Reading Ease. The two can be used interchangeably in the module.
- Bezirci-Yılmaz is the Turkish adaptation of Flesch-Kincaid Grade. The two can be used interchangeably in the module.
- Flesch-Kincaid Grade Simplified is essentially the same formula as Flesch-Kincaid Grade, except that its constants are different.
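To make the relationship between these formulas concrete, here is a minimal sketch of the two English formulas using their commonly published constants. It is not SmoothText's implementation; the function names and the `linear_readability` helper are hypothetical and exist only for illustration. Both scores are a linear combination of the same two averages (words per sentence and syllables per word), so a variant that changes only the constants fits the same shape.

```python
def linear_readability(words, sentences, syllables, *, base, sentence_weight, syllable_weight):
    """Generic linear score over words-per-sentence and syllables-per-word."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return base + sentence_weight * words_per_sentence + syllable_weight * syllables_per_word


def flesch_reading_ease(words, sentences, syllables):
    # Commonly published constants; higher scores mean easier text.
    return linear_readability(
        words, sentences, syllables,
        base=206.835, sentence_weight=-1.015, syllable_weight=-84.6,
    )


def flesch_kincaid_grade(words, sentences, syllables):
    # Commonly published constants; the score approximates a US school grade.
    return linear_readability(
        words, sentences, syllables,
        base=-15.59, sentence_weight=0.39, syllable_weight=11.8,
    )


# Hypothetical counts: 100 words, 5 sentences, 150 syllables.
print(flesch_reading_ease(100, 5, 150))
print(flesch_kincaid_grade(100, 5, 150))
```

The counts passed in here are made up; in practice they would come from a tokenizer and a syllabifier.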
Sentencizing, Tokenizing, and Syllabifying
SmoothText can extract sentences, words, or syllables from texts.
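As a rough illustration of what these three steps involve, the sketch below uses only the standard library. It is a naive stand-in, not how SmoothText works: the library relies on NLTK or Stanza and on Pyphen, and every function name here is hypothetical. The syllable counter assumes the usual rule that a Turkish syllable contains exactly one vowel, which can miscount some loanwords.

```python
import re

# Turkish vowels, lowercase and uppercase (assumption: no circumflexed vowels).
TURKISH_VOWELS = set("aeıioöuüAEIİOÖUÜ")


def split_sentences(text):
    """Naively split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_words(text):
    """Return runs of word characters (Unicode-aware for str patterns)."""
    return re.findall(r"\w+", text)


def count_turkish_syllables(word):
    """Count vowels, assuming one vowel per syllable."""
    return sum(ch in TURKISH_VOWELS for ch in word)


text = "Merhaba dünya. Nasılsın?"
for sentence in split_sentences(text):
    for word in split_words(sentence):
        print(word, count_turkish_syllables(word))
```

A regex splitter like this breaks on abbreviations and decimal numbers, which is one reason to prefer a trained backend for real texts.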
Reading Time
SmoothText can calculate how long a text would take to read.
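The idea behind a reading-time estimate can be sketched as a word count divided by a reading speed. The function below is a hypothetical illustration, not SmoothText's API, and the default of 200 words per minute is an assumed value rather than one taken from the library.

```python
def estimate_reading_seconds(text, words_per_minute=200.0):
    """Estimate reading time in seconds from a whitespace word count."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    word_count = len(text.split())
    return word_count / words_per_minute * 60.0


sample = "word " * 400  # 400 whitespace-separated words
print(estimate_reading_seconds(sample))
```

A more careful estimate would vary the speed by language and reader, which is why the rate is a parameter here.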
Installation
You can install SmoothText via pip.
pip install smoothtext
Usage
Importing and Initializing the Library
SmoothText comes with three submodules: Language, ReadabilityFormula and SmoothText.
from smoothtext import Language, ReadabilityFormula, SmoothText
Before use, the library must be initialized by calling the static setup function. The following call sets NLTK as the backend and automatically downloads all the resources for the supported languages. Alternatively, you can use Stanza.
SmoothText.setup(backend='nltk')
Instancing
SmoothText is used through instances of the SmoothText class.
st = SmoothText('en')
Now, an instance is accessible via st, and it is ready to work with English texts.
Calculating Readability Scores
We will analyze the following text.
text = "Forrest Gump is a 1994 American comedy-drama film directed by Robert Zemeckis."
This example uses two of the formulas available for English: Flesch Reading Ease and Flesch-Kincaid Grade. We can either call the compute_readability function or use the instance as a callable. Either way, we must pass the formula.
score_1 = st.compute_readability(text, ReadabilityFormula.Flesch_Reading_Ease)
score_2 = st(text, ReadabilityFormula.Flesch_Kincaid_Grade)
print(score_1, score_2)
# Output is: 25.455000000000013 12.690000000000001
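As a sanity check on these numbers, the sketch below inverts the two formulas to recover the averages behind the scores. It assumes the commonly published Flesch constants, which the output above is consistent with but which this README does not state; `averages_from_scores` is a hypothetical helper, not part of SmoothText. Since both scores are linear in words per sentence and syllables per word, two scores give a 2x2 linear system that can be solved directly.

```python
def averages_from_scores(reading_ease, grade):
    """Solve for (words per sentence, syllables per word) from the two scores.

    Assumed formulas:
        reading_ease = 206.835 - 1.015 * a - 84.6 * b
        grade        = 0.39 * a + 11.8 * b - 15.59
    """
    rhs_ease = 206.835 - reading_ease  # equals 1.015 * a + 84.6 * b
    rhs_grade = grade + 15.59          # equals 0.39 * a + 11.8 * b
    det = 1.015 * 11.8 - 84.6 * 0.39   # determinant of the 2x2 system
    words_per_sentence = (rhs_ease * 11.8 - 84.6 * rhs_grade) / det
    syllables_per_word = (1.015 * rhs_grade - 0.39 * rhs_ease) / det
    return words_per_sentence, syllables_per_word


a, b = averages_from_scores(25.455000000000013, 12.690000000000001)
print(round(a, 3), round(b, 3))
```

Under these assumptions the scores correspond to roughly 12 words per sentence and 2 syllables per word, which is plausible for the single example sentence.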
Documentation
See here for API documentation.
Roadmap
SmoothText is still in its early stages. The immediate tasks include adding more languages and backends.
License
SmoothText has an MIT license. See LICENSE.