Pure-Python text analysis: readability, vocabulary richness, sentiment, n-grams

textstat-py

Text analysis for Python. Zero dependencies.

NLTK is 40 MB and requires a corpus download just to tokenize a sentence. textblob pulls in NLTK. spaCy needs a 50 MB model file before it'll tell you anything. For most text analysis tasks (readability scores, vocabulary stats, sentiment, writing quality signals), none of that weight is necessary.

pip install textstat-py

What it does

$ textstat essay.txt
=== Text Statistics: essay.txt ===
  Words            : 1243
  Sentences        : 67
  Reading time     : 6.2 min
  Flesch ease      : 58.4  (0=hard, 100=easy)
  FK grade level   : 11.2
  Grade consensus  : 10.8  (avg of 4 formulas)
  Lexical diversity: 0.71  (unique/total words)
  Sentiment        : neutral  (polarity=0.02)
  Passive voice    : 0.18  (fraction of sentences)
  Adverb density   : 0.031  (>0.05 may signal weak verbs)
  Top words        : data(18), model(14), training(11), loss(9), layer(7)

Compare two versions of the same document:

$ textstat --compare draft.txt final.txt
Metric                      A: draft.txt          B: final.txt          Delta
------------------------------------------------------------------------
Words                       1891                  1243                  -648
Reading time (min)          9.46                  6.21                  -3.25
Flesch ease                 44.1                  58.4                  +14.3
Grade level                 13.2                  10.8                  -2.4
Passive voice ratio         0.31                  0.18                  -0.13
Adverb density              0.071                 0.031                 -0.04
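
The Delta column is simply B minus A for each metric. A minimal sketch of that arithmetic, using the numbers from the table above; the dict keys and the `compare_metrics` helper are hypothetical names chosen for illustration, not the package's internals:

```python
def compare_metrics(a, b, ndigits=3):
    """Return b - a for every metric present in both dicts (hypothetical helper)."""
    return {key: round(b[key] - a[key], ndigits) for key in a if key in b}

# Values copied from the comparison table; key names are assumptions.
draft = {"words": 1891, "flesch_ease": 44.1, "grade_level": 13.2,
         "passive_voice_ratio": 0.31, "adverb_density": 0.071}
final = {"words": 1243, "flesch_ease": 58.4, "grade_level": 10.8,
         "passive_voice_ratio": 0.18, "adverb_density": 0.031}

deltas = compare_metrics(draft, final)
for name, delta in deltas.items():
    print(f"{name:22s} {delta:+}")
```

Rounding before display avoids floating-point noise such as 14.299999999999997 in the Flesch ease delta.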

Install

pip install textstat-py

Python 3.8+. No dependencies. Single file.

CLI

textstat document.txt          # full report
textstat --json document.txt   # JSON output
textstat --wpm 250 document.txt  # custom reading speed
textstat --compare before.txt after.txt  # side-by-side diff
cat text.txt | textstat        # stdin

Python API

from textstat import analyze, flesch_reading_ease, grade_level_consensus

with open("essay.txt", encoding="utf-8") as f:
    text = f.read()

# Quick scores
print(flesch_reading_ease(text))    # 58.4
print(grade_level_consensus(text))  # 10.8

# Full analysis dict
stats = analyze(text)
print(stats["passive_voice_ratio"])  # 0.18
print(stats["adverb_density"])       # 0.031
print(stats["top_words"])            # [("data", 18), ("model", 14), ...]

Functions

Readability

flesch_reading_ease(text) — 0–100, higher = easier
flesch_kincaid_grade(text) — US grade level
gunning_fog(text) — years of education needed
coleman_liau_index(text)
automated_readability_index(text)
smog_index(text)
grade_level_consensus(text) — mean across all grade formulas
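
For readers unfamiliar with the formulas, here is a rough standalone sketch of the two Flesch scores using the standard published coefficients. The syllable counter is a crude vowel-group heuristic and the function names are made up for this example; the package's own tokenizer and syllable rules may well differ, so scores will not match exactly.

```python
import re

def count_syllables(word):
    """Crude heuristic: count vowel runs, discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_sketch(text):
    """Return (reading_ease, kincaid_grade) from the standard formulas."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0, 0.0
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return ease, grade

ease, grade = flesch_sketch("The cat sat on the mat.")
print(round(ease, 3), round(grade, 2))
```

Very simple text can score above 100 on reading ease or below 0 on grade level; the 0 to 100 range describes typical prose, not a hard clamp in the formula.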

Writing quality

passive_voice_ratio(text) — fraction of sentences with passive constructions
adverb_density(text) — fraction of words that are -ly adverbs (>0.05 is a signal)
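
Both signals can be approximated with regular expressions alone. The sketch below is one plausible heuristic, not the package's actual rules: it flags a sentence as passive when a form of "to be" is followed by a word ending in -ed or -en, and counts words ending in -ly as adverbs. Both patterns are assumptions and will produce false positives (for example "family") and miss irregular participles (for example "was built").

```python
import re

# Assumed pattern: be-verb, optional -ly adverb, then a word ending in -ed/-en.
BE_FORMS = r"(?:am|is|are|was|were|be|been|being)"
PASSIVE_RE = re.compile(rf"\b{BE_FORMS}\s+(?:\w+ly\s+)?\w+(?:ed|en)\b", re.IGNORECASE)

def passive_ratio_sketch(text):
    """Fraction of sentences containing at least one passive-looking match."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if PASSIVE_RE.search(s))
    return hits / len(sentences)

def adverb_density_sketch(text):
    """Fraction of words that end in -ly (length filter skips 'fly', 'ply')."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    ly_words = [w for w in words if w.endswith("ly") and len(w) > 3]
    return len(ly_words) / len(words)

print(passive_ratio_sketch("The ball was kicked by the boy. The boy kicked the ball."))
print(adverb_density_sketch("She quickly ran home."))
```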

Vocabulary

lexical_diversity(text) — type-token ratio
mattr(text, window=100) — moving-average TTR, stable for long texts
herdan_c(text), yule_k(text) — length-robust vocabulary richness
hapax_legomena_ratio(text) — fraction of words appearing exactly once
vocabulary_richness(text) — all of the above as a dict
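
To make the measures concrete, here is a standalone sketch of each one using the textbook definitions: type-token ratio V/N, a sliding-window mean of TTR, Herdan's C as log V / log N, and Yule's K as 10^4 (Σ i²·V_i − N) / N². The function names and the tokenizer are assumptions for this example, and the hapax ratio is taken over total tokens, which may not be the denominator the package uses.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def ttr(toks):
    return len(set(toks)) / len(toks) if toks else 0.0

def mattr_sketch(toks, window=100):
    """Mean TTR over every window of `window` consecutive tokens."""
    if len(toks) <= window:
        return ttr(toks)
    ratios = [len(set(toks[i:i + window])) / window
              for i in range(len(toks) - window + 1)]
    return sum(ratios) / len(ratios)

def herdan_c_sketch(toks):
    n, v = len(toks), len(set(toks))
    return math.log(v) / math.log(n) if n > 1 else 0.0

def yule_k_sketch(toks):
    n = len(toks)
    if n == 0:
        return 0.0
    # freq_of_freq[i] = number of distinct words that occur exactly i times
    freq_of_freq = Counter(Counter(toks).values())
    s2 = sum(i * i * v_i for i, v_i in freq_of_freq.items())
    return 1e4 * (s2 - n) / (n * n)

def hapax_ratio_sketch(toks):
    if not toks:
        return 0.0
    return sum(1 for c in Counter(toks).values() if c == 1) / len(toks)

toks = tokenize("the cat and the dog and the bird")
print(ttr(toks), mattr_sketch(toks, window=4), yule_k_sketch(toks))
```

Plain TTR falls as a text gets longer because common words repeat, which is why the windowed and logarithmic variants are preferred when comparing documents of different lengths.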

Counts & structure

count_words(text), count_sentences(text), count_paragraphs(text)
reading_time(text, wpm=200)
sentence_stats(text) — min/max/mean/median sentence length
paragraph_stats(text) — word counts per paragraph
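
Reading time is words divided by words per minute, so 1243 words at the default 200 wpm gives about 6.2 minutes, matching the sample report. A small sketch of that calculation and of per-sentence length statistics, using whitespace tokenization and a simple punctuation-based sentence splitter (both assumptions, with hypothetical function names):

```python
import re
import statistics

def reading_time_sketch(text, wpm=200):
    """Minutes needed to read `text` at `wpm` words per minute."""
    return len(text.split()) / wpm

def sentence_stats_sketch(text):
    """Min/max/mean/median sentence length in words."""
    # Split after ., ! or ? when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    if not lengths:
        return {"min": 0, "max": 0, "mean": 0.0, "median": 0.0}
    return {
        "min": min(lengths),
        "max": max(lengths),
        "mean": statistics.mean(lengths),
        "median": statistics.median(lengths),
    }

stats = sentence_stats_sketch("One two three. Four five. Six seven eight nine.")
print(stats)
print(reading_time_sketch(" ".join(["word"] * 1243)))
```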

Sentiment

sentiment_polarity(text) — −1.0 to +1.0, lexicon-based, no model needed
sentiment_label(text) — "positive" / "neutral" / "negative"
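
Lexicon-based scoring needs nothing more than a word-to-score table. The sketch below illustrates the general technique with a tiny made-up lexicon; the word list, the averaging rule, and the 0.05 neutrality threshold are all assumptions for this example and say nothing about the lexicon or thresholds the package ships.

```python
import re

# Toy lexicon for illustration only.
LEXICON = {"good": 1.0, "great": 1.0, "love": 1.0,
           "bad": -1.0, "terrible": -1.0, "hate": -1.0}

def polarity_sketch(text):
    """Mean score of the words found in the lexicon, in [-1.0, 1.0]."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [LEXICON[w] for w in words if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

def label_sketch(text, threshold=0.05):
    """Map polarity to a label; values near zero count as neutral."""
    polarity = polarity_sketch(text)
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"

print(polarity_sketch("I love this great library"), label_sketch("I love this great library"))
print(label_sketch("The table is brown."))
```

A plain lexicon lookup ignores negation ("not good") and intensifiers, which is the usual trade-off for having no model to load.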

N-grams

top_ngrams(text, n=2, k=10) — most frequent n-grams
ngram_diversity(text, n=2) — unique n-grams / total positions
ngram_stats(text) — bigrams + trigrams bundled
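
An n-gram counter fits in a few lines of standard-library Python. This sketch shows one way to get the most frequent n-grams and the "unique n-grams / total positions" ratio described above; the tokenizer and the `_sketch` function names are assumptions, not the package's code.

```python
import re
from collections import Counter

def ngrams(tokens, n=2):
    """All runs of n consecutive tokens, as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))

def top_ngrams_sketch(text, n=2, k=10):
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(ngrams(tokens, n)).most_common(k)

def ngram_diversity_sketch(text, n=2):
    tokens = re.findall(r"[a-z']+", text.lower())
    grams = ngrams(tokens, n)
    return len(set(grams)) / len(grams) if grams else 0.0

print(top_ngrams_sketch("to be or not to be", n=2, k=1))
print(ngram_diversity_sketch("to be or not to be", n=2))
```

A diversity close to 1.0 means almost no phrase repeats; lower values point to repeated phrasing.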

Misc

top_words(text, n=10) — most frequent non-stopword words
word_frequency_distribution(text) — total tokens, unique types, Zipf fit
text_density(text) — content words / total words
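
As an illustration of these three helpers, the sketch below filters a small stopword set, computes the content-word fraction, and estimates a Zipf exponent as the least-squares slope of log frequency against log rank. The stopword list is a toy one and the slope-based fit is one common approach; what "Zipf fit" means inside the package is not documented here, so treat this as an assumption.

```python
import math
import re
from collections import Counter

# Toy stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "on"}

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def top_words_sketch(text, n=10):
    return Counter(w for w in tokenize(text) if w not in STOPWORDS).most_common(n)

def text_density_sketch(text):
    """Content (non-stopword) words divided by total words."""
    toks = tokenize(text)
    if not toks:
        return 0.0
    return sum(1 for w in toks if w not in STOPWORDS) / len(toks)

def zipf_slope_sketch(text):
    """Least-squares slope of log(frequency) vs log(rank); near -1 for Zipfian text."""
    freqs = sorted(Counter(tokenize(text)).values(), reverse=True)
    if len(freqs) < 2:
        return 0.0
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

sample = "the data and the model and the data"
print(top_words_sketch(sample), text_density_sketch(sample))
```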

License

MIT

Release history

0.2.0 (Apr 16, 2026, this version)
0.1.1 (Apr 6, 2026)
0.1.0 (Apr 6, 2026)

Download files

Source Distribution

textstat_py-0.2.0.tar.gz (19.1 kB)

Uploaded Apr 16, 2026 Source

Built Distribution

textstat_py-0.2.0-py3-none-any.whl (11.0 kB)

Uploaded Apr 16, 2026 Python 3

File details

Details for the file textstat_py-0.2.0.tar.gz.

File metadata

Download URL: textstat_py-0.2.0.tar.gz
Upload date: Apr 16, 2026
Size: 19.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for textstat_py-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`b6235b3d06b561f44bd3097ffd77d1b7bcd01fb0dcf201bd283b80dec894e5df`
MD5	`98e8da3b46b0f068ec554a96cf210c90`
BLAKE2b-256	`499a607cc81bdd2d1b3c74f939862172e476892be6f81ad2ef11525e3756adc1`

File details

Details for the file textstat_py-0.2.0-py3-none-any.whl.

File metadata

Download URL: textstat_py-0.2.0-py3-none-any.whl
Upload date: Apr 16, 2026
Size: 11.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for textstat_py-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`de27a2732a75550aaeac0c6b2292ce1bd0ba9d6db14eab1732f63d6b60da6af1`
MD5	`8e522b280bc051566af9d909e04e83e4`
BLAKE2b-256	`b32235dcd41a80fcb52b3f1caeb85b28320a730ce910262ffa01a472211c517a`
